NixOS CUDA - Public Room Timeline

	NixOS CUDA	286 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	57 Servers


Sender	Message	Time
5 Oct 2025
Daniel Fahey	draft PR, my machine's building it, will take a few hours, will check in the morning https://github.com/NixOS/nixpkgs/pull/448965	22:37:12
lon	Interesting, when I was building vllm 0.11 yesterday I mistakenly took 4.2.1 from CMakeLists in `master`, and I've been running inference with it since. I have `cudaCapabilities` 8.9, an "old" 4090 w/ 24 GB; compiling takes ~45 min on my i9 13900. IMO, instead of disabling building for sm100, I'd rather bump CUTLASS	23:26:51
lon	Indeed. My current config to try to steer it away from downloading models, while at the same time reasonably "caging it" in systemd has so many variables. I'm curious to know how you run vllm, here's a "Claude Code-extracted" version of what I use in my machines https://gist.github.com/longregen/e8146a3e34fb7f114b2da43ffa0d8023#file-configuration-nix-L25	23:36:40
6 Oct 2025
Daniel Fahey	Wow, this is great to see, personal AI for the people! Thanks for sharing I'll definitely be referring to it	06:20:02
SomeoneSerge (back on matrix)	RE: Diffing for `release-cuda.nix`: just chatted with Gaétan Lepage about "checked-in lists vs IFD vs pure eval diffing". I previously expressed my feelings in the context of ROCm here: https://github.com/NixOS/nixpkgs/pull/446976#issuecomment-3353986656. TL;DR: no diffing > pure eval > IFD > checked-in codegen lists (although vcunat suggests no diffing may be infeasible) connor (he/him) (UTC-7)	09:56:44
Lun	Is there an example of acceptably fast diffing around somewhere? I landed on checked in diff because I couldn't work out how to make it fast and hexa had already tried a no diff jobset.	10:18:21
SomeoneSerge (back on matrix)	"acceptably fast": Not that I'm aware of. Gaetan pointed to `ci/eval/compare` (would need adjustment) for a derivation-level solution. I was thinking of building on top of `release-lib.nix`. This one eval-level flat list routine I'd describe as "painfully slow": https://github.com/SomeoneSerge/nixpkgs-cuda-ci/blob/abee609531807217495cd15e6ced14ad0dee5d18/nix/utils.nix#L73-L85. Probably could be made less sequential	10:25:22
Daniel Fahey	Build fails with a simple CUTLASS bump https://github.com/NixOS/nixpkgs/pull/448965#issuecomment-3370979611 I suspect yours succeeded because you're using a `cudaCapabilities` with 8.9 only?	10:49:58
lon	Yes, probably that!	11:20:56
connor (burnt/out) (UTC-8)	I made a faster diffing thing (but it requires a fair amount of memory): https://github.com/ConnorBaker/nix-nixpkgs-review	14:33:23
connor (burnt/out) (UTC-8)	As an example: `nix build -L .#diffs.x86_64-linux.pkgs-pre-pkgs-cuda-pre --build-dir /run/temp-ramdisk --builders '' --override-input nixpkgs-pre github:NixOS/nixpkgs` will evaluate a copy of nixpkgs using the `nixpkgs-pre` input, once without CUDA enabled and once with CUDA enabled, and then diff the results (each step happens in a separate derivation, so there's caching). It's IO and memory hungry though: IO hungry because it's instantiating ~1.5 GB worth of derivations, and memory hungry because it's evaluating all of Nixpkgs in a single pass. I've written it so it uses DetSys' parallel eval as well	14:39:10
connor (burnt/out) (UTC-8)	Here's the result of that command: https://gist.github.com/ConnorBaker/b1bbb3547d6c15921843ba0e048f94fd	14:41:08
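The "evaluate twice, then diff" step described above can be sketched in a few lines. This is only an illustration under assumptions, not the code in nix-nixpkgs-review: it assumes each evaluation has already been reduced to a mapping from attribute path to `drvPath`, and the function name `diff_evals` and the sample paths are hypothetical.

```python
def diff_evals(pre: dict[str, str], post: dict[str, str]) -> dict[str, list[str]]:
    """Classify attribute paths by comparing two attr-path -> drvPath mappings."""
    pre_attrs, post_attrs = pre.keys(), post.keys()
    return {
        # Present only in the second evaluation (e.g. with CUDA enabled).
        "added": sorted(post_attrs - pre_attrs),
        # Present only in the first evaluation.
        "removed": sorted(pre_attrs - post_attrs),
        # Present in both, but the derivation path differs.
        "changed": sorted(a for a in pre_attrs & post_attrs if pre[a] != post[a]),
    }


# Hypothetical sample data; real drvPaths are full /nix/store/...-name.drv paths.
without_cuda = {"opencv": "aaa-opencv.drv", "hello": "bbb-hello.drv"}
with_cuda = {"opencv": "ccc-opencv.drv", "hello": "bbb-hello.drv", "nccl": "ddd-nccl.drv"}

report = diff_evals(without_cuda, with_cuda)
```

Only the "added" and "changed" entries would need building; unchanged attributes such as `hello` in the sample drop out of the job list entirely.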
connor (burnt/out) (UTC-8)	When the evaluations of Nixpkgs instantiations are done in the derivations, the `--eval-store` argument is set to the `evalStore` output so we can keep the derivations around. The entries in the `packages` output of the flake are small wrapper scripts which run a nix build using the added and changed derivations -- the `evalStore` outputs are used as extra substituters so derivations are copied as needed into the store and we avoid doing evaluation again	14:44:52
connor (burnt/out) (UTC-8)	Anyway, I built that because I didn't have a way to run `nixpkgs-review` with content-addressed derivations and got irritated that it kept evaluating the base commit of PRs that hadn't changed (all it needed to do was re-evaluate the head of the PR).	14:48:28
connor (burnt/out) (UTC-8)	(Using the scripts in `packages` does require the `read-only-local-store` feature be enabled, since the `evalStore` outputs from the reports are just small instances of Nix stores which are inside the Nix store, so they do need to be mounted as read-only.)	14:51:17
connor (burnt/out) (UTC-8)	But as an example, just with that diff output, if we change `python312Packages` and versioned package sets to their default, sort and deduplicate, that gives us massively better coverage of the packages which require CUDA that we can build (since allowBroken and allowInsecure are both still false)	14:54:32
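A minimal sketch of the "rewrite versioned package sets to their default, sort and deduplicate" idea, assuming the diff output is a flat list of dotted attribute paths. The alias patterns below are illustrative assumptions, not a list taken from nixpkgs, and `normalize` and `job_list` are hypothetical names.

```python
import re

# Assumed alias table: pattern for a versioned package set -> default set name.
# Which sets exist depends on the nixpkgs revision; these two are examples only.
VERSIONED_SETS = [
    (re.compile(r"^python3\d+Packages$"), "python3Packages"),
    (re.compile(r"^cudaPackages_\d+(_\d+)*$"), "cudaPackages"),
]


def normalize(attr_path: str) -> str:
    """Rewrite the leading package-set component to its default name."""
    head, sep, rest = attr_path.partition(".")
    for pattern, default in VERSIONED_SETS:
        if pattern.match(head):
            head = default
            break
    return head + sep + rest


def job_list(attr_paths):
    """Normalize every path, then deduplicate and sort."""
    return sorted({normalize(p) for p in attr_paths})


jobs = job_list([
    "python312Packages.torch",
    "python313Packages.torch",
    "cudaPackages_12_8.cudnn",
    "opencv",
])
```

Collapsing the versioned sets first means the same package reached through several Python or CUDA versions is listed once, under the default set.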
SomeoneSerge (back on matrix)	"~1.5 GB worth of derivations": LoL, can we have evals run on a separate machine from the rest of hydra, and just do tmpfs store there xD	16:49:17
SomeoneSerge (back on matrix)	"When the evaluations of Nixpkgs instantiations are done in the derivations, the --eval-store argument is set to the evalStore output so we can keep the derivations around": Cursed	16:51:37
connor (burnt/out) (UTC-8)	When I had done something like this a year ago I had to use recursive nix so this is an “improvement” in that it only requires the read-only store experimental feature, which is much more limited	18:43:17
connor (burnt/out) (UTC-8)	I guess if you don’t mind re-evaluating everything there’s no need for evalStore since we could use an in-memory dummy store (probably wouldn’t need tmpfs build directory then either)	18:44:47
SomeoneSerge (back on matrix)	Well the real issue here is that lol why physical realizations of aterm drvs in the first place	19:44:03
7 Oct 2025
connor (burnt/out) (UTC-8)	Using toJSON calls addToStore on paths before serializing so they’re valid; I guess I could try using unsafeDiscardWhatever before serializing to JSON to see if that’s enough to prevent realization	01:32:05
connor (burnt/out) (UTC-8)	There was a `--read-only` flag I had forgotten about	03:47:47
connor (burnt/out) (UTC-8)	Okay, it's way faster now and doesn't need a ramdisk. Whereas previously it would take about 1m40s on my i9-13900k, now it's taking about 20s to do `time nix build -L ".#reports.x86_64-linux.pkgs-cuda-post^*" --builders '' --rebuild`	03:53:40
connor (burnt/out) (UTC-8)	For generating a single report htop shows it was taking about 18% of my RAM (less than 20GB)	03:57:11
connor (burnt/out) (UTC-8)	To clarify, it wasn’t enough to use unsafeDiscardStringContext because the derivation was instantiated as soon as drvPath was evaluated, even before toJSON. Using the read-only argument (which is different than the store URI query parameter of the same name lmao) avoids the instantiating.	06:54:57


Room Version: 9