
NixOS CUDA

289 Members | 57 Servers

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

16 Nov 2024
[08:06:15] Alexandros Liarokapis: Got any resources I can look into?
[08:06:31] Alexandros Liarokapis: Actually, I think the wiki page has enough info to get me started.
[08:15:50] Alexandros Liarokapis: ... or not, it is mainly NixOS-based.
[08:16:07] Alexandros Liarokapis: I guess I may as well try it.
[20:45:57] hexa:
   error: tensorflow-gpu-2.13.0 not supported for interpreter python3.12
[20:46:03] hexa: the sound of nixos 24.05 hits hard
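One way to sidestep that eval error (a sketch, not something proposed in the conversation: tensorflow 2.13 only declares support for interpreters up to 3.11, so the idea is simply to take it from the python311 package set instead of the default python3, which is 3.12 here):

    # shell.nix sketch: TensorFlow with CUDA on Python 3.11 instead of the
    # default python3 (3.12), avoiding the "not supported for interpreter
    # python3.12" error above. Standard nixpkgs attributes; adjust as needed.
    { pkgs ? import <nixpkgs> {
        config.allowUnfree = true;   # CUDA is unfree
        config.cudaSupport = true;   # request the GPU-enabled variant
      }
    }:
    pkgs.mkShell {
      packages = [
        (pkgs.python311.withPackages (ps: [ ps.tensorflow ]))
      ];
    }

Whether pinning the older interpreter is acceptable of course depends on everything else that has to live in the same environment.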
17 Nov 2024
[10:38:39] Gaétan Lepage: Yes... Let's hope zeuner finds the time to finish the TF bump...
18 Nov 2024
[02:09:21] hexa:
 wyoming-faster-whisper[4505]:   File "/nix/store/dfp38l0dy3n97wvrgz5i62mwvsmshd3n-python3.12-faster-whisper-unstable-2024-07-26/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 145, in __init__
 wyoming-faster-whisper[4505]:     self.model = ctranslate2.models.Whisper(
 wyoming-faster-whisper[4505]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 wyoming-faster-whisper[4505]: RuntimeError: CUDA failed with error unknown error
 systemd[1]: wyoming-faster-whisper-medium-en.service: Main process exited, code=exited, status=1/FAILURE
[02:09:26] hexa: also loving "unknown error" errors
[02:10:44] hexa:
 wyoming-faster-whisper[4745]:   File "/nix/store/dfp38l0dy3n97wvrgz5i62mwvsmshd3n-python3.12-faster-whisper-unstable-2024-07-26/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 145, in __init__
 wyoming-faster-whisper[4745]:     self.model = ctranslate2.models.Whisper(
 wyoming-faster-whisper[4745]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 wyoming-faster-whisper[4745]: RuntimeError: CUDA failed with error no CUDA-capable device is detected
[02:10:46] hexa: baby steps
[02:10:58] hexa: I can confirm the card is still seated correctly 😄
[02:18:46] hexa: hardening at work
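If "hardening" here means the systemd sandboxing on the wyoming-faster-whisper unit, a sketch of how one might poke a hole in it (assumptions: that options like PrivateDevices are what hide the NVIDIA device nodes, and that overriding them on the unit named in the journal above is the right place; the option names themselves are standard systemd ones):

    # NixOS module sketch: relax device sandboxing for the unit from the
    # journal output above so /dev/nvidia* is visible to ctranslate2.
    { lib, ... }:
    {
      systemd.services."wyoming-faster-whisper-medium-en".serviceConfig = {
        PrivateDevices = lib.mkForce false;
        DeviceAllow = [
          "/dev/nvidiactl rw"
          "/dev/nvidia0 rw"
          "/dev/nvidia-uvm rw"
          "/dev/nvidia-uvm-tools rw"
        ];
      };
    }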
[05:10:46] connor (burnt/out) (UTC-8): Ugh, I don't like computers.
[05:18:41] connor (burnt/out) (UTC-8):
Anyway, in the interest of splitting my attention ever more thinly, I decided to start working on an approach to evaluating derivations and building them. The idea being to have:
  1. a service which is given a flake ref and an attribute path, and efficiently produces a list of attribute paths to derivations existing under the given attribute path, storing the eval time somewhere
  2. a service which is given a flake ref and an attribute path to a derivation, and produces the JSON representation of the closure of derivations required to realize the derivation, again storing eval time somewhere
  3. a service which functions as a job scheduler, using historical data about costs (space, time, memory, CPU usage, etc.) and information about locality (existing store paths on different builders) to realize a derivation, with that data updated upon each realization
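A minimal sketch of roughly what service 1 has to do, written as a plain Nix expression rather than the planned Rust service (the function name and the cudaPackages example are illustrative only): walk an attribute set, descend only where recurseForDerivations asks for it, wrap each value in tryEval so broken or unfree attributes don't abort the walk, and collect dotted attribute paths of derivations.

    # Illustrative only: list attribute paths of derivations under one
    # attribute set, the core of the "enumerate derivations" service.
    { pkgs ? import <nixpkgs> { } }:
    let
      lib = pkgs.lib;
      collectDrvPaths = prefix: attrs:
        lib.concatLists (lib.mapAttrsToList
          (name: value:
            let
              path = prefix ++ [ name ];
              res = builtins.tryEval value;        # skip attributes that throw
            in
            if !res.success then [ ]
            else if lib.isDerivation res.value then [ (lib.concatStringsSep "." path) ]
            else if lib.isAttrs res.value && (res.value.recurseForDerivations or false)
            then collectDrvPaths path res.value    # descend only where requested
            else [ ])
          attrs);
    in
    collectDrvPaths [ "cudaPackages" ] pkgs.cudaPackages

For service 2, something like nix derivation show -r on the resulting derivations already emits a JSON view of the closure of input derivations, so that part may be mostly orchestration plus bookkeeping of the eval times.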
[05:18:55] connor (burnt/out) (UTC-8): Because why have one project when you can have many?
[05:20:02] connor (burnt/out) (UTC-8):
https://github.com/ConnorBaker/nix-eval-graph
And I've decided to write it in Rust, which I am teaching myself.
And I'll probably use a graph database, because why not.
And I'll use NixOS tests for integration testing, because also why not.
[05:20:31] connor (burnt/out) (UTC-8): All this is to say I am deeply irritated when I see my builders constantly copying around gigantic CUDA libraries.
[05:23:26] connor (burnt/out) (UTC-8): Unrelated to closure woes, I tried to package https://github.com/NVIDIA/MatX and https://github.com/NVIDIA/nvbench and nearly pulled my hair out. If anyone has suggestions for doing so without creating a patched and vendored copy of https://github.com/rapidsai/rapids-cmake or writing my own CMake for everything, I'd love to hear them!
[05:26:35] connor (burnt/out) (UTC-8): Also, does anyone know how the ROCm maintainers are doing?
[07:09:42] SomeoneSerge (back on matrix):
In reply to connor (burnt/out) (UTC-8): "Anyway, in the interest of splitting my attention ever more thinly, I decided to start working on an approach to evaluating derivations and building them. [...]"
Awesome! I've been bracing myself to look into that too. What's your current idea regarding costs and locality?
[07:11:11] SomeoneSerge (back on matrix):
In reply to connor (burnt/out) (UTC-8): "Unrelated to closure woes, I tried to package https://github.com/NVIDIA/MatX and https://github.com/NVIDIA/nvbench and nearly pulled my hair out. [...]"
We'd need to do that if we were to package rapids itself too, wouldn't we?
