NixOS CUDA - Public Room Timeline

	NixOS CUDA	282 Members
	CUDA packages maintenance and support in nixpkgs \| https://github.com/orgs/NixOS/projects/27/ \| https://nixos.org/manual/nixpkgs/unstable/#cuda	58 Servers

Load older messages

Sender	Message	Time
14 Nov 2025
Gaétan Lepage	Ok, I figured it out. `torch` and `torchWithoutRocm` have the same `outPaths`. So `torch` is getting filtered out in favor of `torchWithoutRocm`.	09:25:13
Ari Lotter	realized this isn't a 2.9 regression, it's a `-bin` vs source problem :/	18:37:14
Ari Lotter	bin works fine T_T	18:37:19
Ari Lotter	updated the ticket :)	18:37:29
Gaétan Lepage	I updated torch-bin to 2.9.1 yesterday. The PR for the source-based build is https://github.com/NixOS/nixpkgs/pull/461241	21:55:22
apyh	i see your commit message says torch 2.8->2.9, but it's actually 2.9->2.9.1 :)	21:56:32
Gaétan Lepage	Good catch, now fixed.	22:06:15
15 Nov 2025
	cafkafk joined the room.	12:47:57
Gaétan Lepage	SomeoneSerge (back on matrix) would you have a minute to take a look at the triton/torch bump? https://github.com/NixOS/nixpkgs/pull/461241	14:23:53
Gaétan Lepage	Built with and without CUDA. No obvious regressions.	14:24:19
17 Nov 2025
Bryan Honof	How would you go about conditionally setting `cudaCapabilities` when instantiating nixpkgs? I.e. Image I have this. { inputs = { nixpkgs = "github:nixos/nixpkgs?ref=nixos-25.05"; }; outputs = { self, nixpkgs }: { packages.x86_64-linux.default = let pkgs = import nixpkgs { overlays = [ ]; config = { allowUnfree = true; cudaSupport = true; cudaCapabilities = [ "..." "..." ]; }; }; in pkgs.hello; packages.aarch64-linux.default = let pkgs = import nixpkgs { overlays = [ ]; config = { allowUnfree = true; cudaSupport = true; cudaCapabilities = if isJetson then [ "..." "..." ] else [ "..." "..." ]; }; }; in pkgs.hello; }; } It's the aarch64-linux part specifically that I'm a bit stuck on. I have some cloud servers that have an NVIDIA GPUs in them that run aarch64-linux, but I also have some Jetson devices that are also considered aarch64-linux. And if I understand the whole thing correctly, I can't just set the `cudaCapabilities` list to include both the non-jetson and jetson capabilities, right? Or at least, than `isJetsonBuild` would just always eval to `true` even if the build was meant for the cloud server. Probably something stupid I'm just overlooking, sorry for bothering. 😅	17:35:32
SomeoneSerge (back on matrix)	It's the aarch64-linux part specifically that I'm a bit stuck There's aarch64-linux and there's aarch64-linux. It's an artifact of us not including cuda/rocm stuff in `hostPlatform` (yet). The `isJetsonBuild` should only evaluate to true if your cudaCapabilities are jetson capabilities	19:43:44
SomeoneSerge (back on matrix)	So it's not really about "setting `cudaCapabilities` conditionally", it's about instantiating nixpkgs for different platforms. For flakes you'd have to suffix the attributes of one of the aarch64-linux platforms, or move stuff to `legacyPackages`, but, of course, you could also simply not maintain the list of already-evaluated and not-really-overridable "recipes", i.e. drop the flake:)	19:47:42
SomeoneSerge (back on matrix)	Think I caught a touch of a cold, sorry	19:48:43
Robbie Buxton	In reply to @bjth:matrix.org How would you go about conditionally setting `cudaCapabilities` when instantiating nixpkgs? I.e. Image I have this. { inputs = { nixpkgs = "github:nixos/nixpkgs?ref=nixos-25.05"; }; outputs = { self, nixpkgs }: { packages.x86_64-linux.default = let pkgs = import nixpkgs { overlays = [ ]; config = { allowUnfree = true; cudaSupport = true; cudaCapabilities = [ "..." "..." ]; }; }; in pkgs.hello; packages.aarch64-linux.default = let pkgs = import nixpkgs { overlays = [ ]; config = { allowUnfree = true; cudaSupport = true; cudaCapabilities = if isJetson then [ "..." "..." ] else [ "..." "..." ]; }; }; in pkgs.hello; }; } It's the aarch64-linux part specifically that I'm a bit stuck on. I have some cloud servers that have an NVIDIA GPUs in them that run aarch64-linux, but I also have some Jetson devices that are also considered aarch64-linux. And if I understand the whole thing correctly, I can't just set the `cudaCapabilities` list to include both the non-jetson and jetson capabilities, right? Or at least, than `isJetsonBuild` would just always eval to `true` even if the build was meant for the cloud server. Probably something stupid I'm just overlooking, sorry for bothering. 😅 Aarch based nvidia data center gpus 👀, yeah if you get the correct map of the cuda capabilities it should work fine	19:58:02
Robbie Buxton	* Aarch based nvidia data center gpus 👀, yeah if you get the correct map of the cuda capabilities it should work fine Edit: misread, isJetsonBuild sounds funky so not sure	20:01:07
18 Nov 2025
connor (he/him)	isJetsonBuild and the like are set by cudaCapabilities. Jetson capabilities aren’t included by default because they’re niche architectures and prior to Thor needed separate binaries. If you just need to support Thor you can specify that capability with other ones. If you need to support Orin or Xavier there’s no clean way to do it. Like Serge said, they’re effectively different platforms but Nixpkgs doesn’t have a notion of accelerators and so has no way to differentiate. The only way we can tell in Nixpkgs is whether the Jetson capabilities are explicitly provided.	06:26:45
connor (he/him)	Would appreciate if someone could review https://github.com/NixOS/nixpkgs/pull/462761	06:36:03
SomeoneSerge (back on matrix)	Gaétan Lepage: not quite a morning slot, but wdyt about 21:15 Paris for the weekly?	14:13:14
connor (he/him)	I should be able to attend too	16:00:11
Gaétan Lepage	Way better for me.	16:14:49
19 Nov 2025
	Eymeric joined the room.	12:59:28
	Jeremy Fleischman (jfly) joined the room.	18:13:28
Jeremy Fleischman (jfly)	i'm confused about the compatibility story between whatever libcuda.so file i have in `/run/opengl-driver` and my nvidia kernel module. i've read through <nixos/modules/hardware/video/nvidia.nix> and i see that `hardware.graphics.extraPackages` basically gets set to `pkgs.linuxKernel.packages.linux_6_12.nvidiaPackages.stable.out` (or whatever kernel i have selected) how much drift (if any) is allowed here?	18:18:44
Jeremy Fleischman (jfly)	to avoid an XY problem: what i'm actually doing is experimenting with defining systemd nixos containers that run cuda software internally, and i'm not sure how to get the right libcuda.so's in those containers so they play nicely with the host's kernel	18:21:46
Jeremy Fleischman (jfly)	if the answer is "just keep them perfectly in sync with the host kernel's version", that's OK. just trying to flesh out my mental model	18:22:27
connor (he/him)	`libcuda.so` is provided by the NVIDIA CUDA driver, which for our purposes is generally part of the NVIDIA driver for your GPU. Do the systemd NixOS containers provide their own copy of NVIDIA's driver? If not, they wouldn't have `libcuda.so` available. The CDI stuff providing GPU access in containers provides /run/opengl-driver/lib (among other things): https://github.com/NixOS/nixpkgs/blob/6c634f7efae329841baeed19cdb6a8c2fc801ba1/nixos/modules/services/hardware/nvidia-container-toolkit/default.nix#L234-L237 General information about forward-backward compat is in NVIDIA's docs here: https://docs.nvidia.com/deploy/cuda-compatibility/#	18:31:45
Robbie Buxton	In reply to @jfly:matrix.org to avoid an XY problem: what i'm actually doing is experimenting with defining systemd nixos containers that run cuda software internally, and i'm not sure how to get the right libcuda.so's in those containers so they play nicely with the host's kernel If you run the host systems cuda kernel drivers ahead of the user mode drivers it’s normally fine provided it’s not a major version change (I.e 13 vs 12)	18:35:26
Jeremy Fleischman (jfly)	Do the systemd NixOS containers provide their own copy of NVIDIA's driver? If not, they wouldn't have libcuda.so available. afaik, they do not automatically do anything (please correct me if i'm wrong). i making them get their own libcuda.so by explicitly configuring them with `hardware.graphics.enable = true;` and `hardware.graphics.extraPackages`. mounting the cuda runtime from the host makes sense, though! thanks for the link to this nvidia-container-toolkit	18:39:03
Lun	What's the current best practice / future plans for impure GPU tests? Is the discussion in https://github.com/NixOS/nixpkgs/issues/225912 up to date? cc SomeoneSerge (back on matrix)	18:43:23

Show newer messages

Back to Room ListRoom Version: 9