!QhvgabMQzwEQeWehhZ:lossy.network

NixOS Home Automation

501 Members
Declarative Home Automation and other Sidequests | https://wiki.nixos.org/wiki/Home_Assistant
126 Servers



Sender | Message | Time
26 Sep 2024
@mattleon:matrix.org mattleon
May I ask what hardware you are running inference on and how many tokens per second it generates? The hardware I'm running hass on seems to manage about 3 tokens/s, and I'm trying to figure out if that's enough.
14:17:50
@hexa:lossy.network
In reply to @mattleon:matrix.org
3060 12GB
14:47:07
@hexa:lossy.network
In reply to @hexa:lossy.network
How do I find out how many tokens?
14:47:24
@mattleon:matrix.org mattleon
That's a great question. I got my information from this Medium post on the original Llama 3 with 8B parameters. The GIF of the terminal output from an inference run shows the number of tokens per second at the end: https://medium.com/@benoit.clouet/running-llama3-on-the-gpu-of-a-rk1-turing-pi-6dddb9e14521
I do wonder if hass exposes a performance graph 🤔
In either case, a 3060 is quite a bit more powerful than what I'm working with.
14:54:15
@k900:0upti.me K900
The RK3588 has an NPU
14:54:37
@k900:0upti.me K900
Which is presumably currently not used
14:54:42
@k900:0upti.me K900
Because nothing supports it
14:54:45
@k900:0upti.me K900
But work is ongoing on that
14:54:53
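
On the question of how to find out the tokens-per-second figure raised earlier in the thread: if the inference tool does not print it, one rough way is to time a generation call and divide the number of generated tokens by the elapsed time. The sketch below assumes a hypothetical `generate` callable that returns the generated tokens as a sequence; it is not the API of any particular backend or of Home Assistant.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


def measure(generate: Callable[[str], Sequence[str]], prompt: str) -> float:
    """Time one call to `generate` and return its throughput.

    `generate` is a hypothetical stand-in for whatever runs the model;
    it is assumed to return the generated tokens as a sequence.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return tokens_per_second(len(tokens), elapsed)
```

For example, `tokens_per_second(30, 10.0)` gives `3.0`, the rate mattleon mentions. A single timed call also includes prompt processing and any model load time, so this tends to understate the steady-state generation rate.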



Room Version: 6