NixOS CUDA - Public Room Timeline

	NixOS CUDA	336 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	64 Servers

Sender	Message	Time
11 May 2026
smudge (she/her)	should the nixos wiki cuda page's cache section be corrected? it recommends the `cache.nixos-cuda.org` cache and doesn't mention others, which led me to assume it was the one for public use. is `nix-community.cachix.org` (per this announcement) the one that should be primarily recommended on the wiki, along with information about using flox's nixpkgs repo and its cache? Or, if `nix-community.cachix.org` isn't technically allowed to distribute, should flox be the primary suggested way to access a cuda cache? also, should the information for `cache.nixos-cuda.org` be retained, just with a disclaimer that it is for internal use only?	20:14:14
13 May 2026
	smudge (she/her) changed their display name from smudge to smudge (she/her).	08:48:06
14 May 2026
	weriomat joined the room.	21:11:04
15 May 2026
	Grayson Tinker changed their display name from graysontinker to Grayson Tinker.	03:47:21
ccicnce113424	I'm currently working on improving the packaging of the NVIDIA driver. My plan is to first split the namespace and driver extraction into two separate files without changing semantics, with the assistance of an LLM, and then switch from the current `passthru` pattern to using `scope`. The first step is now complete. Semantics should be unchanged. I'm not sure how to verify semantic equivalence (suggestions welcome!), but this PR works fine with my current NixOS configuration. As a side note, I also moved `nvidia-modprobe` into the `nvidiaPackages` namespace. Since these changes are mostly structural and don't alter semantics, could they be merged separately, with the scope migration and further work happening in a new PR?	09:07:14
ccicnce113424	https://github.com/NixOS/nixpkgs/pull/519313	09:07:25
16 May 2026
	asp345 joined the room.	13:24:37
18 May 2026
yorik.sar	Gaétan Lepage: Here's a different fix for the `libcusolvermp` issue that we had: let's add all of the `cudaPackages` that a PR changes to the PR check. https://github.com/nixos-cuda/hydra-jobsets/pull/29 - wdyt?	14:26:40
19 May 2026
	hilorioze changed their display name from Yan Hilorioze to hilorioze.	16:41:32
	Prayag Bhakar joined the room.	20:15:25
Prayag Bhakar	hey folks, I'm looking for feedback on this PR :: https://github.com/NixOS/nixpkgs/pull/515928 My goal with these changes is to speed up my NVIDIA Jetson Orin Nano builds by adding dGPU architecture targets for Jetson Xavier (sm_72), Jetson Orin (sm_87), and Jetson Thor (sm_110). I'm happy to iterate more on the changes if there is feedback :)	20:22:59
Prayag Bhakar	if/when this gets merged, I was planning on raising a follow-up fix to the nixos-cuda inference package to support these new params. https://github.com/nixos-cuda/infra	20:25:01
Gaétan Lepage	Hi! Just so you know, `aarch64-linux` is not supported in our current CI.	21:24:45
Prayag Bhakar	hi! I did see that there wasn't a callout for an `aarch64-linux` system in the infra package. I was going to include more thoughts in the follow-up CR, but the few options I see are (a) enable cross compilation via qemu on current hardware (ie :: https://github.com/81reap/jetpack-nixos/blob/d20caea8befd5fdad99efc9fad5527e70124ca98/README.md?plain=1#L339-L351); (b) onboard my own Jetson Nano as a target to build things for the infra pipeline; or (c) I donate $250 for the NVIDIA Jetson Orin Nano Super Developer Kit to be purchased by one of the current maintainers	22:51:06
Prayag Bhakar	that said, I think the lowest bar to test this on the current CI is to enable cross compilation. But that's just my current opinion. Happy to chat more and bar raise these thoughts :)	22:52:04
20 May 2026
connor (he/him)	Our infra is fairly limited in terms of capacity and doing builds through QEMU would be prohibitively expensive. Not to say this can't happen, but it's certainly not a platform we build for or test.	02:05:15
connor (he/him)	Gaétan Lepage: for gpu-burn: https://github.com/NixOS/nixpkgs/pull/522144 I still need to look into CUDA compat/fix the assumption in the logic that cudaVariant only ever has a major version suffix	05:12:41
connor (he/him)	And then also investigate the NVCC fatal error encountered when using family feature sets with baseline ones	05:15:40
connor (he/him)	For the CUDA variant stuff, look for changes here: https://github.com/NixOS/nixpkgs/blob/e72912dc31edae89f04a12982c684f9652c95348/pkgs/development/cuda-modules/_cuda/lib/cuda.nix#L75-L103 https://github.com/NixOS/nixpkgs/blob/e72912dc31edae89f04a12982c684f9652c95348/pkgs/development/cuda-modules/buildRedist/default.nix#L70-L97 Ugh so `getSupportedReleases` needs to support something like `desiredCudaVariants` being an ordered list	05:20:42
connor (he/him)	If you're able to poke at the gpu-burn PR I'd appreciate it. I've been running benchmarks with https://github.com/ConnorBaker/nix/tree/vibe-coding/optimise-and-gc-throughput-baseline-bench-rig-616df9797 and https://github.com/ConnorBaker/nix/tree/vibe-coding/optimise-and-gc-throughput before submitting PRs upstream to parallelize/make faster store optimise and gc. They've been running for two days and I don't want to fully load the system while it's doing that.	05:23:23
Gaétan Lepage	Thanks Prayag Bhakar for your suggestions. At the moment, our CI capacity only allows us to build ~10% of our "target jobsets" for x86_64 alone. This means that cross-building for aarch64 is definitely not feasible with the current hardware. We are actively working in the background to secure "sponsorships" and get a legitimate compute capacity, but this is not done yet. Supporting Jetson architectures is definitely on the list of things we would like to do, but we are far too hardware-bound for it.	08:34:33
Prayag Bhakar	I see, thanks for the input connor (burnt/out) (UTC-8) and Gaétan Lepage. Just to clarify a few things for my understanding: (1) This is only a blocker for updating the cuda infra pipeline, right? Or is this also a blocker for updating Nixpkgs to properly support aarch64? (2) How does the current infra work? Are its machines hosted by volunteers/maintainers? I'm trying to understand what "legitimate compute capacity" means and whether solution B or C would be viable	13:18:50
SomeoneSerge (matrix works sometimes)	Hey, sorry for disappearing on the github ticket, v limited bandwidth currently. The first and most important point, elaborating on Gaétan's reply: our infra currently lacks aarch64 builders, and frankly we as of now haven't even a fraction of the x86_64 capacity that we need. We are currently working on securing proper funding for the hardware and for the general effort, and first and foremost for reducing the harm and the extra load that CUDA imposes on the rest of Nixpkgs maintainers. It's been in the works for almost two years now. Recently our entire team, including our new "manager" member @dhofer:matrix.org, has been fully dedicated to making this happen, but, while there's been some very modest progress, it's going to take a while before we start testing and caching Jetsons or in fact anything besides the vanilla x86_64-linux nixpkgs, simply due to cashflow considerations. We are actively looking for companies willing to properly pay for the service of testing and keeping the Jetson ecosystem "green" & functional here upstream in Nixpkgs, but so far there are no ETAs for this specific effort. In our personal projects we normally use very different kinds of hardware, so it's not a priority for any of the maintainers. Regarding your other messages, just some clarifications: Binfmt/qemu is not "cross-compilation". Nixpkgs does have cross capabilities, albeit less stable than native builds, and we are also very interested in making cross compilation possible for CUDA projects (it currently isn't, mostly because nvidia is making it artificially hard), but it's not a priority for any of us, and we haven't found any interested customers or sponsors yet either. Binfmt builds in CI are indeed not impossible, and Connor and I burned quite a bit of energy trying them out in the old Hercules-based CI... It's been an incredible pain, besides also harming reproducibility and all. 
Jetsons without paying consumers and without our own research projects needing them is not something we'd want to spend our currently extremely limited and scarce compute on, I'm afraid. Re: the PR, as I mentioned we moved from nix-community infra to our own, and we had to temporarily pause consuming the release-cuda.nix file (which, though, remains the authoritative source, and which is also what's being built by Flox). I'd honestly rather first do the hard work of removing backendStdenv and config.cuda*, because then we wouldn't need to modify release-lib in the first place. That said, if this were a priority, we could hack Jetsons into the release-cuda file, e.g. to get them built by Flox (I forget if they do aarch64 though). Now to what is a priority to us, for example: I'm happy to brainstorm with anyone about how we get rid of the backendStdenv thing and how to cleanly model coprocessors/accelerators in the elaborated-system structure, whether to make coprocessors part of a system "quadruple" (in contrast to a triple), and whether the specific cudaCapabilities and/or rocm gpuTargets should become a part of... well, I suppose, the system "polycule". I know @lt1379:matrix.org and @tomberek:matrix.org, besides the CUDA Team, have been pondering the same questions.	18:22:16
SomeoneSerge (matrix works sometimes)	This too would be very welcome, but before we can make any use of that we need to secure the general aarch64 cpu build capacity!	18:24:17
Prayag Bhakar	no worries SomeoneSerge (matrix works sometimes), I'm grateful for any time folks are taking out of their day to help me out. I figured the conversation may be more fruitful in the matrix channel. Re: "Binfmt/qemu is not 'cross-compilation'": ah, sorry, I was under the impression it was close enough. Since option A is off the table, do options B and C have any legs? I understand that there is a process to secure more compute, but can that compute come from volunteers like me? From my understanding, even just a Jetson nano can start compiling other aarch64 packages until more capable compute is secured. Re: "Now to what is a priority to us": oh I see, so have the GPU definitions be part of the core `hostPlatform` instead of being treated like add-on accelerators. Does this mean there are plans to also add core `hostPlatform` support for other accelerators like TPUs and FPGAs? Is there someplace I can read up on this body of work and try to figure out how I can help? I'm happy to scrap my PR if there is a more established north star being pursued by the team	22:56:23
Gaétan Lepage	While a Jetson can indeed build packages for `aarch64-linux`, having a single one would not enable anything. The amount of (decently fast) ARM cores needed to start enabling `aarch64-linux` jobsets on `hydra.nixos-cuda.org` is ~64 at the very extreme minimum. We appreciate the proposition, but until we get access to something resembling a serious build capacity for this platform, we won't spend time supporting it.	23:00:50
21 May 2026
Prayag Bhakar	* I see, so at the minimum targeting the $5k budget hardware range with Ampere Altra or a used/refurbished Apple M Series Mac (probably also need Asahi Linux). Is there a donation target/pool for this goal?	00:11:20
