NixOS CUDA | 288 Members | |
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda | 58 Servers |
| Sender | Message | Time |
|---|---|---|
| 21 Dec 2025 | ||
| Well, because we unfortunately are quite busy with a lot of maintenance work. It is hard to find some time to work on those more long-term projects. | 09:32:42 | |
| In reply to @glepage:matrix.org: I'd be happy to help if there is anything I could do to speed this up. | 12:00:04 | |
| Hey guys, does anyone else have a setup with an A100 (or some such) that requires nvidia-fabricmanager? Could you maybe share the relevant .nix configuration bits with me? On a relatively modern NixOS (25.05 or 25.11), hardware.nvidia.datacenter.enable=true produces a broken nv-fabricmanager with undefined symbols for me. I managed to make it work by packaging nvidia-fabricmanager myself, but it is a bit ugly, and as a novice I am not sure everything is done well. If anyone has a configuration with nvidia-fabricmanager that they could share with me, that would be great...! | 16:05:31 | |
| A lot of the functionality gated behind datacenter-grade GPUs or multi-GPU setups is out of the reach of the maintainers at the moment, as we’ve only recently been able to get a Hydra set up to build packages and run a few GPU checks. Part of the reason I’ve been able to iterate quickly in the past is that I own a 4090 and so can benchmark and test quickly. But for bigger stuff, the only approach I’ve had any luck with is using Lambda Labs to rent multi-GPU instances fairly cheaply and try Nix-built binaries on them. But that doesn’t test using NixOS as the host system, or any number of other features unique to the hardware (or even specific code paths). If you have such hardware or have access to it, please don’t hesitate to open PRs. Access to hardware (among other things, like time and burnout) is a big blocker for us supporting more stuff. We can always coach or provide feedback on packaging! And we can certainly use such an opportunity to update (or make) contributing documents. | 19:01:11 | |
| 28 Dec 2025 | ||
| Are nixpkgs-built packages reliant on /run/opengl-driver to find libcuda.so.1 on non-NixOS systems as well? | 23:53:35 | |
| I guess the opengl-driver symlink farm is responsible for all driver-bound bridge libraries… | 23:57:15 | |
| 29 Dec 2025 | ||
| I was thinking there is an authoritative source of all the needed kernel-driver-bound libraries: the NVIDIA Container Toolkit. One could mimic its search procedure and implement an automatic /run/opengl-driver symlinker for standalone Nix systems and OCI containers | 00:07:52 | |
| yeah, they are. most folks will use nixgl or nixglhost to work around this | 00:09:12 | |
| i use nixglhost wrappers for my binaries in oci containers | 00:09:25 | |
| I was thinking of the issues specified here https://github.com/soupglasses/nix-system-graphics#but-why-another-nix-with-opengl-project | 00:10:57 | |
| Now nix-system-graphics is a nice approach, but it has the limitation of having to specify the NVIDIA driver version in a system-manager config; switching it is not automatic, which really hurts usability | 00:13:06 | |
| I was thinking of something more along the lines of actually generating the symlink farm and populating it with symlinks to the proper host-system libraries | 00:18:17 | |
| Would probably work for mesa on standalone Nix systems as well | 00:19:29 | |
| One could probably directly reuse the discovery functionality of the NVIDIA Container Toolkit Go library (or Flatpak or similar) | 00:29:52 | |
| Hi! I finally found some time to work on the llama-cpp-python-libcuda.so-stubs-missing bug and fixed outlines by skipping its pythonImportsCheckHook and tests when cudaSupport is enabled. This allowed me to finally unblock the vllm bump. It builds fine with and without cudaSupport. Feel free to test/review: https://github.com/NixOS/nixpkgs/pull/467418 | 17:36:41 | |
| Good post about the exact problem we run into when trying to create dynamic libraries for Magma with our default set of CUDA capabilities: https://fzakaria.com/2025/12/28/huge-binaries | 18:06:21 | |
| I won't be in the weekly call tomorrow btw | 18:06:33 | |
| 30 Dec 2025 | ||
| Hello, I'm working on getting a NixOS module written for dcgm. connor (burnt/out) (UTC-8), I know you have an old draft PR, so I'm basing a lot of it on that. As part of that I've had to go ahead and update dcgm and its Prometheus exporter. I would appreciate it if I could get a review from anyone here if they have the chance :) https://github.com/NixOS/nixpkgs/pull/474721 . dcgm is kind of a cursed package, so I did my best with the patching, but if anyone has any critical feedback it would be more than welcome. | 14:49:19 | |
| I don’t personally use it and will have limited time so I don’t think I can review it, but I’m fine with anyone else reviewing and merging it | 18:08:38 | |
| 31 Dec 2025 | ||
| Wow super cool, thank you. Haven't run benchmarks but it looks a lot faster | 13:58:23 | |
| Oh, no worries. As far as the dcgm nix module goes, I was mostly mentioning you above to make sure credit was given where credit is due :D | 16:21:47 | |
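
For readers following the 29 Dec thread, here is a rough sketch of what an "automatic /run/opengl-driver symlinker" could look like. It is not the NVIDIA Container Toolkit's discovery logic, and nothing like it was posted in the channel: the library name patterns, the search directories, and the function names `discover_driver_libs` and `build_symlink_farm` are all assumptions made for illustration. It only shows the general shape of the idea: scan host library directories for driver-bound libraries, then populate a `lib/` directory with symlinks to them.

```python
import os
import tempfile
from pathlib import Path

# Assumption: an illustrative subset of driver-bound library names,
# NOT the list the NVIDIA Container Toolkit actually uses.
DRIVER_LIB_PATTERNS = ("libcuda.so*", "libnvidia-ml.so*", "libnvidia-ptxjitcompiler.so*")

# Assumption: common host library directories; real distributions vary.
DEFAULT_SEARCH_DIRS = ("/usr/lib/x86_64-linux-gnu", "/usr/lib64", "/usr/lib")


def discover_driver_libs(search_dirs, patterns=DRIVER_LIB_PATTERNS):
    """Map library file name -> host path; the first directory that has a name wins."""
    found = {}
    for directory in search_dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # skip search directories that do not exist on this host
        for pattern in patterns:
            for path in sorted(directory.glob(pattern)):
                found.setdefault(path.name, path)
    return found


def build_symlink_farm(libs, farm_dir):
    """Create farm_dir/lib and fill it with symlinks to the discovered libraries."""
    lib_dir = Path(farm_dir) / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)
    for name, target in libs.items():
        link = lib_dir / name
        if link.is_symlink() or link.exists():
            link.unlink()  # replace stale links, e.g. after a host driver upgrade
        link.symlink_to(target)
    return lib_dir


if __name__ == "__main__":
    # Demo writes to a temporary directory; pointing this at /run/opengl-driver
    # would need root and is left as an exercise, not a recommendation.
    libs = discover_driver_libs(DEFAULT_SEARCH_DIRS)
    farm = build_symlink_farm(libs, tempfile.mkdtemp(prefix="opengl-driver-"))
    for link in sorted(farm.iterdir()):
        print(link, "->", os.readlink(link))
```

Because the farm is regenerated from whatever the host currently provides, a sketch like this would not need the driver version pinned in a config, which is the usability concern raised about nix-system-graphics above. A real implementation would also have to handle 32-bit libraries, non-NVIDIA drivers, and the full set of files the driver ships.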