NixOS CUDA - Public Room Timeline

	NixOS CUDA	336 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	64 Servers

Sender	Message	Time
24 May 2026
hexa	https://isaiprofitable.com/ lmao	16:40:49
hexa	well played, nvidia	16:40:59
Gaétan Lepage	Yup, will do.	19:30:48
Gaétan Lepage	https://github.com/nixos-cuda/hydra-jobsets/pull/31	20:13:09
27 May 2026
Gaétan Lepage	If anyone has a decent modern GPU to run the flash-attention tests, please ping me. The CUDA team's infra is not sufficient: python3.13-flash-attention> FAILED tests/losses/test_cross_entropy.py::test_cross_entropy_loss[128256-0.9-0.7-True-0.01-True-False-dtype2] - torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1002.00 MiB. GPU 0 has a total capacity of 19.55 GiB of which 360.38 MiB is free. Including non-PyTorch memory, this process has 19.19 GiB memory in use. Of the allocated memory 18.10 GiB is allocated by PyTorch, and 925.39 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://docs.pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf) Thanks in advance for your generosity.	15:24:45
BerriJ	Would an RTX 6000 Pro with 96GB VRAM be okay? If so, I could run these tests, but I would need relatively detailed instructions. I'm running a flake-based system on nixos-unstable with the "latest" Nvidia drivers.	15:49:37
Gaétan Lepage	I'm pretty sure that would fit. Thanks a lot! You'd need to add the following to your config: `programs.nix-required-mounts = { enable = true; presets.nvidia-gpu.enable = true; };` Then `nix build github:GaetanLepage/nixpkgs/flash-attn#python3Packages.flash-attn.gpuCheck --cores 10`	16:00:23
Gaétan Lepage	Watch out for RAM consumption though. It's terribly hungry. I need to set `--cores` to `15` at most on a 128GB system.	16:01:16
Gaétan Lepage	Hmm. Wait, you need to set `cudaSupport`.	16:02:39
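For readability, the option quoted above can be laid out as a NixOS module. This is a minimal sketch, assuming a host that already has the NVIDIA driver configured; the file name `gpu-sandbox.nix` is hypothetical. `cudaSupport` is deliberately left out, since the chat sets it on the nixpkgs import rather than system-wide.

```nix
# gpu-sandbox.nix (hypothetical file name), to be added to the system's imports
{ ... }:
{
  # Option names as quoted in the chat: lets builds that declare a GPU
  # requirement access the NVIDIA device from inside the Nix sandbox.
  programs.nix-required-mounts = {
    enable = true;
    presets.nvidia-gpu.enable = true;
  };
}
```

The system needs to be rebuilt and switched before the `gpuCheck` build can see the GPU.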
BerriJ	I could also jump into a dev shell if you provide me a flake if that's easier. Anyway I can try when I'm back home in about an hour. And the machine in question has 760gb of ram so we should be fine I guess 😇	16:04:05
hexa	in this economy?!	16:04:38
Gaétan Lepage	`nix build --impure --cores 2 --expr ' (import (builtins.getFlake "github:GaetanLepage/nixpkgs/flash-attn") { system = builtins.currentSystem; config = { allowUnfree = true; cudaSupport = true; }; }).python3Packages.flash-attn.gpuCheck '` This should do it.	16:05:28
Gaétan Lepage	* `nix build --impure --expr ' (import (builtins.getFlake "github:GaetanLepage/nixpkgs/flash-attn") { system = builtins.currentSystem; config = { allowUnfree = true; cudaSupport = true; }; }).python3Packages.flash-attn.gpuCheck '` This should do it.	16:05:59
BerriJ	In reply to @hexa:lossy.network ("in this economy?!"): It's not my private one, unfortunately 😅 But I'm the admin and currently there is no workload on that thing.	16:07:57
Gaétan Lepage	I mean... If only I had nix installed...
root@p4-r01-ct18:~# nvidia-smi
Wed May 27 16:10:23 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.126.21             Driver Version: 580.126.21     CUDA Version: 13.2     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB200                   On  |   00000008:01:00.0 Off |                    0 |
| N/A   45C    P0            170W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GB200                   On  |   00000009:01:00.0 Off |                    0 |
| N/A   45C    P0            153W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GB200                   On  |   00000018:01:00.0 Off |                    0 |
| N/A   45C    P0            153W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GB200                   On  |   00000019:01:00.0 Off |                    0 |
| N/A   45C    P0            176W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+	16:10:33
hexa	makes you wonder who we are building cuda support for	16:12:00
Gaétan Lepage	Not for the owners of those GPUs unfortunately 🥲	16:20:08
BerriJ	In my case I'm working at a German University and the server is used by a team of around 9 researchers :)	16:33:56
hexa	pretty sure Gaetan works at some French university 😆	16:37:26
Gaétan Lepage	Not anymore. (French universities don't have such fancy GPUs) 🫠	16:38:38
BerriJ	The build is running now :)	17:16:42
SomeoneSerge (matrix works sometimes)	Can't you nix in container?	18:26:52
SomeoneSerge (matrix works sometimes)	Not TUM?	18:27:57
SomeoneSerge (matrix works sometimes)	Not the OS group? I'd be hyped to learn that somebody in the academia/HPC/RSE community actually uses nixpkgs CUDA, because so far I've been getting the vibes that only the enterprise cares, and all these EuroHPC/CSC/yada yada are completely unapproachable and dead set on their EasyBuild/Lmod workflows...	18:33:29
BerriJ	University of Duisburg-Essen, not TUM. But it's really not that big of a deal. The economics faculty has its own little IT department. They bought some servers for machine learning, of which our Chair was able to get one, and we asked them to install NixOS on it for us because we have been using NixOS on all of our machines for 2 years. That's essentially the full story. There is not that much support for NixOS besides me pushing it and my boss seeing the advantages and sometimes proudly talking about our infra 😅	18:55:53
SomeoneSerge (matrix works sometimes)	Shooting in the dark but anything that could be done or reprioritized on our side to potentially help the lab's story?	19:48:31
BerriJ	Well, the biggest point is the cache. Currently we obtain PyTorch and other ML packages from PyPI because it has the CUDA binaries packaged directly. We can't really risk getting cache misses and triggering a 5-hour recompilation on my colleagues' machines. And setting up our own binary cache is also not trivial: we work from home a lot and the machines are only connected to the university VPN on demand. I've read that there is this Flox cache now, but I also read that it does not strictly follow nixos-unstable.	20:35:31
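For context on the cache point above, an additional binary cache is declared on a NixOS machine roughly as follows. This is only a sketch: the cache URL and public key are placeholders, not a real cache, and would have to be replaced with those of a cache the lab actually trusts.

```nix
{ ... }:
{
  nix.settings = {
    # Placeholder URL (hypothetical): an extra cache consulted alongside
    # the default cache.nixos.org.
    substituters = [ "https://cache.example.org" ];

    # Placeholder key (hypothetical): store paths are only accepted from
    # the cache if they are signed by a trusted key.
    trusted-public-keys = [ "cache.example.org-1:REPLACE_WITH_REAL_PUBLIC_KEY" ];
  };
}
```

A cache only prevents local rebuilds when it holds the exact store paths being requested, so the machines would also need to pin nixpkgs to a revision the cache has built. That matches the concern raised above about a cache that does not strictly follow nixos-unstable.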
BerriJ	By the way the build is still running it's at the `pytestCheckPhase` of flash attention and causes a good 60gb of VRAM usage at the moment. I'll call it a day and report on the status tomorrow morning 🙂	20:37:16
	@busti:leitstelle511.net left the room.	21:16:57
28 May 2026
Gaétan Lepage	CUDA 13.3 is out: https://developer.nvidia.com/blog/nvidia-cuda-13-3-enhances-gpu-development-with-tile-programming-in-c-compiler-autotuning-and-python-updates/	07:16:08
