NixOS CUDA | 284 Members | |
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda | 58 Servers |
| Sender | Message | Time |
|---|---|---|
| 3 Dec 2025 | ||
| (I was indeed thinking of `?priority`) | 22:22:39 | |
| it does now 😛 | 22:24:16 | |
| 4 Dec 2025 | ||
| Hi all! Very much appreciate the work that's been put into CUDA support in nixpkgs/NixOS. It seems nixpkgs updating to use cuDNN 9.13 means that other packages pulling in cudaPackages_12_{6,8,9} no longer support compute capabilities < 7.5, even though CUDA supports compute capabilities >= 5.0 up until the jump to 13.0. I noticed 9.13 is not the only version in nixpkgs, though. What is the strategy around how many legacy versions of CUDA packages to maintain in nixpkgs? Sorry for all the questions, appreciate any advice 😅 | 01:57:08 | |
| any idea what is the difference between torch-bin and torchWithCuda ? | 12:28:06 | |
| In reply to @aliarokapis:matrix.org: IIRC torch-bin is torch not built from source, and torchWithCuda is torch built from source with CUDA forced on regardless of global configuration? | 13:34:25 | |
| Yes, this is it. | 13:46:25 | |
| and it is apparently in the nixos cache by default? | 14:05:18 | |
| I'm not sure torchWithCuda will be. For `cudaSupport`-enabled packages, consider using the Flox binary cache or the NixOS-CUDA one. | 14:28:52 | |
| I’ll try to answer this later today. Depending on how comfortable you are with Nix, pull in the overlay for CUDA-legacy (https://github.com/nixos-cuda/cuda-legacy) to add a bunch of manifests and then customize the package set to your liking by using `override` on the CUDA package set and providing the manifest version you want. The docs are lacking an example for this. As you discovered, NVCC may support capabilities but that doesn’t mean the big libraries most people use (cuDNN, libcublas, TensorRT, etc.) do. We have the unenviable job of either adopting the latest release for each version or freezing them at a point in time and never updating. The decision is made more difficult by the fact that NVIDIA seems to fix bugs by doing major/minor releases much more often than patch releases. The trace-verbose thing is handy but undocumented and only exists because implementations of the Problems RFC keep getting bikeshedded to death. We should probably have a section in the CUDA docs which lists supported capabilities for each package set. It could be automatically generated, given that I added the available capabilities for each release to backendStdenv. | 16:28:20 | |
| god i hate computers | 16:29:35 | |
| Reminder to self: post about changes I’ve been working on / need (fix adding attributes to backendStdenv, nvcc multiple outputs again, ccache) | 16:33:13 | |
| 4 Aug 2022 | ||
| (hi, just came here to read + respond to this.) | 03:28:52 | |
| hey. i had previously sympathized with samuela and like i said before had some of the same frustrations. i just edited my github comment to add "[CUDA] packages are universally complicated, fragile to package, and critical to daily operations. Nix being able to manage them is unbelievably helpful to those of us who work with them regularly, even if support is downgraded to only having an expectation of function on stable branches." | 03:29:14 | |
| In reply to @tpw_rules:matrix.org: ugh, 45 minutes? that's... not great. not to air dirty laundry but did you do what samuela did in the wandb PR and at least say that that wasn't a great thing to do? (not sure how else to word that, you get what i mean) | 03:30:23 | |
| no, i haven't yet, but i probably will | 03:31:03 | |
| i admittedly did that with a PR once, i forget how long the maintainer was requested for but i merged it because multiple people reported it fixed the issue. the maintainer said "hey, don't do that" after and now i do think twice before merging. so it could help, is what i'm saying. | 03:31:50 | |
| i'm not sure what went wrong with the wandb PR anyway, i think it was just a boneheaded move on the merger's part | 03:32:10 | |
| (it was also simple enough that it was fine and the maintainer said it looked good after) | 03:32:15 | |
| but i thought most of the frustration was around packages which don't really involve CUDA breaking the fragile CUDA packages, and i'm not sure how the warning helps in this case. it's not like nixpkgs-review prints out the comments. maybe i'm wrong. but it is a legitimate problem | 03:34:19 | |
| the frustration that i see is that people are touching packages that he maintains, am i missing further context here? | 03:35:09 | |
| did you ever see this? https://discourse.nixos.org/t/nixpkgss-current-development-workflow-is-not-sustainable/18741 | 03:35:43 | |
| oh yes i did | 03:35:49 | |
| but that's not what the topic of this PR/the notice is, though? | 03:36:11 | |
| this wouldn't help that | 03:36:14 | |
| ~~is that what you're saying and i'm just lagging behind~~ | 03:36:27 | |
| no it wouldn't, but it reads to me like that's the underlying problem and this is a manifestation which can be controlled more easily. not to put thoughts in people's head | 03:37:07 | |
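
Going back to the 4 Dec 2025 exchange about torch-bin versus torchWithCuda: the three ways of getting torch mentioned there can be put side by side in one file. This is only a rough sketch, not something from the conversation. It assumes `<nixpkgs>` points at a recent nixpkgs channel. The file name `shells.nix`, the attribute names `binShell`, `forcedCudaShell` and `globalCudaShell`, and the capability value `"8.6"` are all made up for illustration. Whether the source-built variants are available from a binary cache is not settled in the chat, so expect a long local build.

```nix
# shells.nix: illustrative sketch only; attribute names are hypothetical.
let
  # Plain import. allowUnfree is set because CUDA libraries are unfree.
  pkgs = import <nixpkgs> {
    config.allowUnfree = true;
  };

  # Second import with CUDA enabled globally, so that every
  # cudaSupport-aware package is built against CUDA.
  # "8.6" is an example value (assumption); use your own GPU's capability.
  pkgsCuda = import <nixpkgs> {
    config = {
      allowUnfree = true;
      cudaSupport = true;
      cudaCapabilities = [ "8.6" ];
    };
  };
in
{
  # torch-bin: torch not built from source (upstream prebuilt binary).
  binShell = pkgs.mkShell {
    packages = [ (pkgs.python3.withPackages (ps: [ ps.torch-bin ])) ];
  };

  # torchWithCuda: built from source with CUDA forced on,
  # regardless of the global configuration of this import.
  forcedCudaShell = pkgs.mkShell {
    packages = [ (pkgs.python3.withPackages (ps: [ ps.torchWithCuda ])) ];
  };

  # torch from the CUDA-enabled import: built from source,
  # following the global cudaSupport setting.
  globalCudaShell = pkgsCuda.mkShell {
    packages = [ (pkgsCuda.python3.withPackages (ps: [ ps.torch ])) ];
  };
}
```

A shell can then be entered with, for example, `nix-shell shells.nix -A binShell`. Keeping the CUDA-enabled import separate means the plain `pkgs` set stays unaffected by `cudaSupport`.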