22 Sep 2024 |
maralorn | Oh, God. You are saying that is triggered by my job iPad. 🫣 | 12:25:58 |
hexa | no, you are saying that | 12:26:24 |
23 Sep 2024 |
| @nis:tchncs.de left the room. | 11:35:59 |
| elikoga joined the room. | 15:29:52 |
| nikolaiser joined the room. | 20:51:09 |
25 Sep 2024 |
| Autumn changed their display name from luna-null to Autumn. | 06:39:20 |
| @beat_link:matrix.org left the room. | 12:39:30 |
hexa | llama3.2:3b-instruct-q8_0 works great for home-assistant | 21:55:27 |
26 Sep 2024 |
adamcstephens | ooo, 3.2 | 00:00:23 |
adamcstephens | looks like it's fairly well tuned for home assistant type workloads | 00:18:59 |
| Fabián Heredia set a profile picture. | 01:16:00 |
bbigras | Does it work with function calling? | 04:14:29 |
hexa | yes | 07:14:58 |
| Mackeveli joined the room. | 14:02:43 |
mattleon | May I ask what hardware you are running inferencing on and how many tokens per second it generates?
It seems the hardware I'm running hass on can do about 3 tokens/s, trying to figure out if that's enough | 14:17:50 |
hexa | In reply to @mattleon:matrix.org "May I ask what hardware you are running inferencing on and how many tokens per second it generates? It seems the hardware I'm running hass on can do about 3 tokens/s, trying to figure out if that's enough"
3060 12GB | 14:47:07 |
hexa | In reply to @hexa:lossy.network "3060 12GB" How do I find out how many tokens? | 14:47:24 |
mattleon | That's a great question, I got my information from this medium post on the OG llama 3 with 8B parameters. The gif of the terminal output of running an inference indicates the number of tokens per second at the end:
https://medium.com/@benoit.clouet/running-llama3-on-the-gpu-of-a-rk1-turing-pi-6dddb9e14521
I do wonder if hass exposes a performance graph 🤔
In either case, a 3060 is quite a bit more powerful than what I'm working with | 14:54:15 |
K900 | The RK3588 has an NPU | 14:54:37 |
K900 | Which is presumably currently not used | 14:54:42 |
K900 | Because nothing supports it | 14:54:45 |
K900 | But work is ongoing on that | 14:54:53 |
mattleon | This reddit thread has some performance notes for various models and Nvidia cards, but I don't see llama with 3b parameters: https://www.reddit.com/r/LocalLLaMA/comments/13j5cxf/how_many_tokens_per_second_do_you_guys_get_with/ | 14:56:09 |
hexa | In reply to @mattleon:matrix.org "This reddit thread has some performance notes for various models and Nvidia cards, but I don't see llama with 3b parameters: https://www.reddit.com/r/LocalLLaMA/comments/13j5cxf/how_many_tokens_per_second_do_you_guys_get_with/" I think the smallest model was 4 or 8b thus far | 15:07:51 |
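[Editor's note: if the model is served through ollama, as the llama3.2:3b-instruct-q8_0 tag above suggests, the rate can be read directly from the server rather than from benchmarks: the JSON response of its /api/generate endpoint includes eval_count (generated tokens) and eval_duration (nanoseconds). A minimal sketch of the arithmetic, assuming those two fields:]

```python
# Sketch: compute tokens/s from the timing fields in ollama's
# /api/generate JSON response (eval_count = tokens generated,
# eval_duration = generation time in nanoseconds).
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    return eval_count / (eval_duration_ns / 1e9)

# Example: 120 tokens generated in 2 seconds is 60 tokens/s.
print(tokens_per_second(120, 2_000_000_000))
```

The ollama CLI also prints these stats (including an "eval rate" line) when invoked with `--verbose`.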
Mackeveli | Is there some way of overriding cacert and adding private CAs to it in the music-assistant package like in the home-assistant package?
https://wiki.nixos.org/wiki/Home_Assistant#Trust_a_private_certificate_authority | 16:18:19 |
hexa | replace home-assistant with music-assistant | 17:09:33 |
hexa | both of them consume certifi, the package handing out cacerts for python consumers | 17:09:48 |
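[Editor's note: for home-assistant, the approach from the wiki page linked above looks roughly like the sketch below. This is a sketch only; the exact override body may differ from the wiki, and ./my-private-ca.crt is a placeholder for your CA certificate.]

```nix
# Sketch, per the linked wiki page: override certifi so the private CA
# is appended to the bundle it ships. ./my-private-ca.crt is a placeholder.
services.home-assistant.package = pkgs.home-assistant.override {
  packageOverrides = self: super: {
    certifi = super.certifi.overridePythonAttrs (oldAttrs: {
      postInstall = (oldAttrs.postInstall or "") + ''
        cat ${./my-private-ca.crt} >> $out/${self.python.sitePackages}/certifi/cacert.pem
      '';
    });
  };
};
```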
Mackeveli | I tried that but the music-assistant package doesn't expose a packageOverrides variable like the home-assistant package does so I get this error upon doing a nixos-rebuild:
error: function 'anonymous lambda' called with unexpected argument 'packageOverrides'
at /nix/store/y6205wq8hxvpqvl8l9d1n9xah01kg0lq-source/pkgs/by-name/mu/music-assistant/package.nix:1:1:
1| { lib
| ^
2| , python3
| 17:27:06 |
hexa | ah, hm. | 17:27:46 |
hexa | you can pass an overridden cacert by passing the NIX_SSL_CERT_FILE environment variable to the systemd unit then | 17:28:54 |
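[Editor's note: in NixOS configuration that could look roughly like the sketch below, assuming the standard security.pki and systemd module options; ./my-private-ca.crt is a placeholder for your CA certificate.]

```nix
# Sketch: trust the private CA system-wide, then point the
# music-assistant unit at the resulting bundle via NIX_SSL_CERT_FILE.
security.pki.certificateFiles = [ ./my-private-ca.crt ];
systemd.services.music-assistant.environment.NIX_SSL_CERT_FILE =
  "/etc/ssl/certs/ca-bundle.crt";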