!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

274 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda



31 Oct 2025
[03:18:20] @connorbaker:matrix.org connor (burnt/out) (UTC-8): gonna slow down and take a step back for a bit
[10:26:33] @daniel-fahey:matrix.org Daniel Fahey:

You said upfront "The addition of CUDA 13 does not mean packages will suddenly work with CUDA 13. Expect breakages." I know I'm just a random bloke from GitHub and fairly new, but I've had really bad burnout in the past. I'd suggest still doing a little bit of triaging and technical support here and there for the CUDA Team in strict time blocks, so you can at least see the fruits of your labour (given to the world, for free) as the breakages all get sorted out in the coming weeks in our collective efforts.

From my perspective I'm just excited about the prospect of using CUDA 13 with Nixpkgs. I've basically used nixos-unstable, sometimes master, since starting to use Nix, and still have no idea how the release cycle is supposed to work 🙃.

I reckon for the next big CUDA update, do something like how the haskell-updates branch gets merged into staging first.
[14:35:31] @connorbaker:matrix.org connor (burnt/out) (UTC-8): I mean I’ll still be around, just not doing as much. I’ll still be in the team weeklies, etc.
[14:36:20] @connorbaker:matrix.org connor (burnt/out) (UTC-8): CUDA 13 isn’t the default because the stuff we have in tree is too old or doesn’t support it; the "expect breakages" was in reference to trying to use CUDA 13 as the default.
[14:37:36] @connorbaker:matrix.org connor (burnt/out) (UTC-8): Haskell stuff goes into staging (at least partly) because of the sheer number of packages, to allow Hydra to churn through them. None of our stuff is built upstream, so there’s not really a point.
[14:37:54] @sporeray:matrix.org Robbie Buxton: I think also a fair amount of stuff upstream doesn’t even build with CUDA 13 yet either.
[14:39:51] @connorbaker:matrix.org connor (burnt/out) (UTC-8): Yeah, NVIDIA doesn’t care outside of projects they dedicate engineering hours to supporting. Changing the default version of OpenCV or other large projects to a commit from master adding support would be dead on arrival, and trying to special-case it just for when CUDA is configured would be difficult.
[14:56:01] @daniel-fahey:matrix.org Daniel Fahey set a profile picture.
[15:50:50] @daniel-fahey:matrix.org Daniel Fahey:

This is quite a convincing argument to revert the 99 commits: https://github.com/NixOS/nixpkgs/pull/437723#issuecomment-3472997390

Maybe there could be a cuda-refactor branch that is continually built and tested by e.g. https://hydra.nixos-cuda.org/jobset/nixpkgs/cuda-refactor while it gets the attention it deserves?

[15:52:21] @daniel-fahey:matrix.org Daniel Fahey: (all a bit over my current pay grade with my limited Nixpkgs experience though, lol) just really want to express my gratitude to the CUDA Team
[16:03:44] @sporeray:matrix.org Robbie Buxton: My understanding (which may be incorrect) is that CUDA 13 is opt-in, so it will only break if you try and use it instead of the default?
[19:40:25] @connorbaker:matrix.org connor (burnt/out) (UTC-8):

Gaétan Lepage, SomeoneSerge (back on matrix): are you okay with merging:

  • https://github.com/NixOS/nixpkgs/pull/457338
  • https://github.com/NixOS/nixpkgs/pull/457220

I’d like there to be consensus as a team for those reverts to go through. Serge, I know you’re in favor of the config.cudaSupport one, but I’d like to issue the statement/decision as a team.
[19:46:10] @connorbaker:matrix.org connor (burnt/out) (UTC-8): Correct
[19:47:00] @connorbaker:matrix.org connor (burnt/out) (UTC-8): We don’t have anywhere near the capacity (hardware or labor) to do that on a regular cadence, but that would be nice
[19:59:36] @apyh:matrix.org apyh: what kind of hardware is needed for reasonably-fast-ish compile cycles?
[20:01:29] @connorbaker:matrix.org connor (burnt/out) (UTC-8): That depends entirely on what you’re building. My suggestion is to compile for exactly the CUDA capabilities you need; the CUDA compiler and linker are incredibly slow, so it helps a lot.
[20:02:07] @apyh:matrix.org apyh: yeah makes sense - was seeing if i could volunteer a personal machine to help make the dev cycle possible 😓
[20:02:37] @sporeray:matrix.org Robbie Buxton: From experience, adding compute 12 capability doubled my PyTorch build time, so def keep an eye on it
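[Editor's note: restricting capabilities is done through the nixpkgs config options `cudaSupport` and `cudaCapabilities`. A minimal sketch, assuming an sm_89 GPU such as an RTX 4090 (the capability string is the part you'd change for your own card):]

```nix
# Sketch: import nixpkgs with CUDA enabled but only one compute capability,
# so nvcc/ptxas compile device code for a single architecture instead of all
# default targets. cudaSupport and cudaCapabilities are real nixpkgs config
# options; "8.9" here is an assumed example capability.
import <nixpkgs> {
  config = {
    allowUnfree = true;
    cudaSupport = true;
    cudaCapabilities = [ "8.9" ];  # build only for sm_89
  };
}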
[20:02:54] @glepage:matrix.org Gaétan Lepage: We have very recently acquired new hardware. It is still far from the perfect infra, but it's definitely good progress.
[20:03:48] @glepage:matrix.org Gaétan Lepage: I broke the record yesterday building python3Packages.torch with cudaSupport enabled -> 41 min on 96 cores.
[20:04:03] @glepage:matrix.org Gaétan Lepage: Do not try to replicate on your laptop 🫠
[20:04:17] @glepage:matrix.org Gaétan Lepage: connor (burnt/out) (UTC-7): ACK for both.
[20:04:38] @sporeray:matrix.org Robbie Buxton: I’ve OOMed a machine with over 1 TB of RAM building Nix CUDA packages 😎
[20:04:40] @apyh:matrix.org apyh:
> I broke the record yesterday building python3Packages.torch with cudaSupport enabled -> 41 min on 96 cores.
omg. i wanna try.
[20:05:45] @glepage:matrix.org Gaétan Lepage: I have only 128 GB of RAM on my builder, so I grew swap to a (sometimes necessary) 500 GB.
[20:06:15] @glepage:matrix.org Gaétan Lepage: ptxas can be very expensive memory-wise...
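[Editor's note: on NixOS, a large swap file like the one described can be declared in the system configuration. A hypothetical module snippet (the path and size are assumptions, not Gaétan's actual setup); `swapDevices` and its `size` option, in MiB, are real NixOS options:]

```nix
# Sketch: a 500 GB swap file to absorb ptxas memory spikes during large
# CUDA builds. NixOS creates the file at the given path if it is missing.
{
  swapDevices = [
    {
      device = "/var/swapfile";  # assumed path
      size = 500 * 1024;         # size in MiB -> ~500 GB
    }
  ];
}
```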
[21:21:05] @glepage:matrix.org Gaétan Lepage:

connor (burnt/out) (UTC-7): I found the issue with firefox.
Both before and after the CUDA 13 PR, cudaPackages.backendStdenv.cc (gcc-wrapper) was leaking into the firefox output.

However, before the CUDA 13 PR, stdenv and cudaPackages.backendStdenv were not the same.
After the CUDA 13 PR, stdenv and cudaPackages.backendStdenv are the same.
Hence, in the firefox (wrapper) derivation, disallowedRequisites = [ stdenv.cc ]; catches the nvcc-leaked gcc-wrapper (cudaPackages.backendStdenv.cc).

So, whose fault is it?

A) It is wrong that stdenv == cudaPackages.backendStdenv. Then the issue is not cudaPackages.cuda_nvcc leaking gcc-wrapper.
B) It is normal that stdenv == cudaPackages.backendStdenv, but cudaPackages.cuda_nvcc should never have leaked gcc-wrapper in the first place.
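[Editor's note: the mechanism being discussed can be sketched as follows. `disallowedRequisites` is the real Nix derivation attribute; the derivation around it is a hypothetical minimal stand-in for the firefox wrapper, not its actual expression:]

```nix
# Sketch: disallowedRequisites makes the build fail if any listed store path
# appears in the output's runtime closure. Before the CUDA 13 PR,
# backendStdenv.cc was a *different* gcc-wrapper than stdenv.cc, so a leak
# went unnoticed; once backendStdenv == stdenv, the leaked compiler *is*
# stdenv.cc and this check starts firing.
{ stdenv }:
stdenv.mkDerivation {
  name = "example-wrapper";
  # ... build phases elided ...
  disallowedRequisites = [ stdenv.cc ];  # reject gcc-wrapper in the closure
}
```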
[21:38:03] @apyh:matrix.org apyh:
> omg. i wanna try.
ripped it on your branch in 23m, including the magma compile - only compute 8.9 tho


