NixOS CUDA | 290 Members | |
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda | 58 Servers |
| Message | Time | |
|---|---|---|
| 30 Jan 2025 | ||
| Ok Connor. Do we have a short-term alternative to get this library ? | 16:48:28 | |
| Depends on what you mean by short term :( | 16:53:34 | |
| I should have everything landed by 25.05 but I suppose we’ll need it prior to that | 16:54:03 | |
| I guess I can start trying to land things, but it’ll cause some breakages and I don’t have docs written yet | 16:54:43 | |
| I'm asking for pytorch (https://github.com/NixOS/nixpkgs/pull/377785). There is no emergency and we can surely wait before updating it. | 16:56:39 | |
| Ugh, didn’t they also remove support for CUDA 12.1? | 16:58:19 | |
| Also I think they support newer architectures now (maybe Blackwell?) | 16:58:36 | |
| In reply to @connorbaker:matrix.org: From the CI at least: https://github.com/pytorch/pytorch/pull/141271, https://github.com/pytorch/pytorch/pull/142177 | 16:59:51 | |
| My bad, I mixed up the CI removal with https://github.com/NVIDIA/TensorRT-Model-Optimizer/releases/tag/0.23.0 removing support for CUDA 11 | 17:02:04 | |
| 31 Jan 2025 | ||
| In reply to @glepage:matrix.org: It's separate: https://developer.download.nvidia.com/compute/cusparselt/redist/ | 02:13:02 | |
| I am so tired. But I now have setup hooks which can catch common issues like the order of different CUDA directories in a run path, or fail a build if NVCC’s host compiler leaks out (which can/will cause glibc/glibcxx symbol issues). Even beyond that, I implemented utility functions for arrays and associative arrays in bash because I got tired of repeating myself in different hooks. And then, when I got tired of repeating myself in tests for those functions and hooks, I made a utility derivation to make testing for expected arrays and associative arrays easier. | 06:55:57 | |
| It’s still a mess but it’s on this branch if anyone is curious https://github.com/ConnorBaker/cuda-packages/compare/main...fix/runpath-order-matters-and-cuda-compat-gets-clobbered | 06:56:57 | |
| Let's schedule a call to discuss how to go forward with stdenv support, setup-hooks, and wrappers. CC connor (he/him) (UTC-7), sielicki, Samuel Ainsworth, and anyone interested | 10:49:09 | |
| 1 Feb 2025 | ||
| 2 Feb 2025 | ||
| 3 Feb 2025 | ||
| connor (he/him) (UTC-7): SomeoneSerge (Gand St. Pieters) sorry to keep annoying you guys, but could you respond to the above question? Alternatively, "we are too busy right now, you'll have to figure it out on your own" is also an acceptable answer))) | 14:37:45 | |
| Sorry, I forgot to reply. I'll write before tomorrow | 14:41:33 | |
| ❤️ | 14:42:10 | |
| Starting with the last question: great to hear! As one tool to help with discovery, we have a task board at https://github.com/orgs/NixOS/projects/27/views/1. We haven't been properly maintaining it for the last year, and I see many invalidated/outdated items there, but some of the roadmap is still relevant, and the "New" column is automatically populated with all issues and PRs tagged "cuda". If you're willing to do chores, fixing issues like "nvidia's bash wrapper for nsys-ui assumes things are installed into weird locations and is completely broken" and "a package has changed the way they hardcode /usr/lib or dlopen stuff and now fails to find libcuda.so again" would be very useful. These are relatively straightforward, but they involve an amount of debugging and suffering, and usually get ignored for a long time because it's just demotivating. If you're interested in architectural issues, then note the message about the upcoming meeting and the proposed subjects, and check out the "Roadmap" column and Connor's out-of-tree cuda-packages. | 22:27:33 | |
| OK, one more item for the agenda: I think it would be good for us to walk through the backlog together, discuss the issues' contexts, statuses, and present relevance, sort/close outdated issues, and maybe merge well-reviewed but forgotten PRs. I'd guess this is easily half an hour or more. Should we schedule this separately? | 22:30:38 | |
| You're certainly right, and the idea of promoting cuda fixes during ZHF has in fact been around. By the same token, an ofborg-like integration (an external service that would test a PR on push and post a report on failures in non-default instantiations or in out-of-tree tests) is maybe even necessary to ensure the stability of hw-accelerated packages. Even when a contributor doesn't care about cuda, it's important that they are informed about unintended consequences of their changes, and can maybe ping the interested parties as needed | 22:41:27 | |
| My javascript might be broken, but I only see build failures. Some errors under | 22:44:52 | |