NixOS CUDA | 290 Members | |
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda | 57 Servers |
| Sender | Message | Time |
|---|---|---|
| 30 Jan 2025 | ||
| * I occasionally have some free time and I would like to spend it on improving the state of CUDA in nixpkgs/NixOS. Do you have any suggestions for which issues I should start with? I am not a complete newbie, but I haven't contributed to nixpkgs all that much (the above-mentioned PR is my third). I guess I could just start going down the list of eval/build failures on the nix-community CUDA builder, but that might end up just being treadmill work. | 16:47:05 | |
In reply to @glepage:matrix.org: I meant to close that since I’ll make a new PR for the CUDA-packages work | 16:47:34 | |
| Ok Connor. Do we have a short-term alternative to get this library? | 16:48:28 | |
| Depends on what you mean by short term :( | 16:53:34 | |
| I should have everything landed by 25.05 but I suppose we’ll need it prior to that | 16:54:03 | |
| I guess I can start trying to land things, but it’ll cause some breakages and I don’t have docs written yet | 16:54:43 | |
| I'm asking for pytorch (https://github.com/NixOS/nixpkgs/pull/377785). There is no emergency and we can surely wait before updating it. | 16:56:39 | |
| Ugh didn’t they also remove support for CUDA 12.1 | 16:58:19 | |
| Also I think they support newer architectures now (maybe Blackwell?) | 16:58:36 | |
In reply to @connorbaker:matrix.org: From the CI at least: https://github.com/pytorch/pytorch/pull/141271, https://github.com/pytorch/pytorch/pull/142177 | 16:59:51 | |
| My bad I mixed up the CI removal and https://github.com/NVIDIA/TensorRT-Model-Optimizer/releases/tag/0.23.0 removing support for CUDA 11 | 17:02:04 | |
| 31 Jan 2025 | ||
In reply to @glepage:matrix.org: It's separate: https://developer.download.nvidia.com/compute/cusparselt/redist/ | 02:13:02 | |
| I am so tired. But I now have setup hooks which can catch common issues, like the order of different CUDA directories in a run path, or fail a build if NVCC’s host compiler leaks out (which can/will cause glibc/glibcxx symbol issues). Even beyond that, I implemented utility functions for arrays and associative arrays in bash because I got tired of repeating myself in different hooks. And then, when I got tired of repeating myself in tests for those functions and hooks, I made a utility derivation to make testing for expected arrays and associative arrays easier. | 06:55:57 | |
| It’s still a mess but it’s on this branch if anyone is curious https://github.com/ConnorBaker/cuda-packages/compare/main...fix/runpath-order-matters-and-cuda-compat-gets-clobbered | 06:56:57 | |
| Let's schedule a call to discuss how to go forward with stdenv support, setup-hooks, wrappers, CC connor (he/him) (UTC-7), sielicki, Samuel Ainsworth, and anyone interested | 10:49:09 | |
| 1 Feb 2025 | ||
| 2 Feb 2025 | ||
| 3 Feb 2025 | ||
| connor (he/him) (UTC-7): SomeoneSerge (Gand St. Pieters) sorry to keep annoying you guys, but could you respond to the above question? Alternatively, "we are too busy right now, you'll have to figure it out on your own" is also an acceptable answer))) | 14:37:45 | |
| Sorry, I forgot to reply. I'll write before tomorrow | 14:41:33 | |
| ❤️ | 14:42:10 | |
| Starting with the last question: great to hear! As one tool to help with discovery, we have a task board at https://github.com/orgs/NixOS/projects/27/views/1. We haven't been properly maintaining it for the last year, and I see many invalidated or outdated items there, but some of the roadmap is still relevant, and the "New" column is automatically populated with all issues and PRs tagged "cuda". If you're willing to do chores, fixing issues like "nvidia's bash wrapper for nsys-ui assumes things are installed into weird locations and is completely broken" or "a package has changed the way they hardcode /usr/lib or dlopen stuff and now fails to find libcuda.so again" would be very useful. Those are relatively straightforward, but they involve a fair amount of debugging and suffering and usually get ignored for a long time because it's just demotivating. If you're interested in architectural issues, then note the message about the upcoming meeting and the proposed subjects, and check out the "Roadmap" column and Connor's out-of-tree cuda-packages | 22:27:33 | |
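
For anyone curious what a "run path order" check like the one mentioned on 31 Jan might look like, here is a minimal sketch in Python. It is not the hook from the cuda-packages branch (those are written in bash), and the rule it checks is an assumption made for illustration: that an entry containing one marker (here `cuda_compat`) should come before any entry containing another marker (here `opengl-driver`). The function names, markers, and store paths are all hypothetical.

```python
def split_runpath(runpath: str) -> list[str]:
    """Split a colon-separated run path into its non-empty entries."""
    return [entry for entry in runpath.split(":") if entry]


def runpath_order_violations(
    runpath: str, must_precede: str, must_follow: str
) -> list[tuple[str, str]]:
    """Return (early, late) pairs where an entry matching `must_follow`
    appears before an entry matching `must_precede`.

    Both markers are plain substrings; this is an illustrative rule,
    not the one used by any real setup hook.
    """
    entries = split_runpath(runpath)
    violations = []
    for index, early in enumerate(entries):
        if must_follow not in early:
            continue
        # Any later entry that should have come first is a violation.
        for late in entries[index + 1:]:
            if must_precede in late:
                violations.append((early, late))
    return violations


# Hypothetical run paths, made up for this example.
good = "/nix/store/aaa-cuda_compat/compat:/run/opengl-driver/lib:/nix/store/bbb-cudart/lib"
bad = "/run/opengl-driver/lib:/nix/store/aaa-cuda_compat/compat"

for label, runpath in (("good", good), ("bad", bad)):
    found = runpath_order_violations(runpath, "cuda_compat", "opengl-driver")
    print(label, "ok" if not found else f"violations: {found}")
```

A build-time hook could read the run path of each ELF file and fail the build when the returned list is non-empty; how the run path is obtained is left out here on purpose.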