NixOS CUDA | 282 Members | |
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda | 59 Servers |
| Sender | Message | Time |
|---|---|---|
| 5 Oct 2025 | ||
| one could also automate updating e.g. https://github.com/NixOS/nixpkgs/blob/107f8b572eb41058b610f99aba21b9a1b5925cf8/pkgs/development/python-modules/vllm/default.nix#L183-216, but I thought what I'd done was try-hard over-engineering enough already. I really wanted to try and make a reference implementation that could easily be adapted to other complicated Python packages that have multiple git deps | 12:54:29 | |
| https://www.explainxkcd.com/wiki/index.php/1319:_Automation | 12:55:42 | |
| vLLM is becoming a huge project and pillar in the ecosystem, sometimes their cadence for releases is daily, and going through and checking each is fiddly and tedious. May well save us all some time, but it's good to just not have to worry https://www.explainxkcd.com/wiki/index.php/1205:_Is_It_Worth_the_Time%3F hope it keeps working, lol | 13:01:36 | |
| * vLLM is becoming a huge project and pillar in the ecosystem, sometimes their cadence for releases is daily, and going through and checking each dep is fiddly and tedious. May well save us all some time, but it's good to just not have to worry https://www.explainxkcd.com/wiki/index.php/1205:_Is_It_Worth_the_Time%3F hope it keeps working, lol | 13:02:02 | |
| CUDA build! Great to see it on your Hydra server though. I might have time in the evenings this week to get to the bottom of it, but don't have access to some beefy compute with 128+ GB RAM and 16+ CPU to do the CUDA build in a reasonable timeframe https://hydra.nixos-cuda.org/build/824/nixlog/71 | 13:29:19 | |
| * CUDA build! Great to see it on your Hydra server though. I might have time in the evenings this week to get to the bottom of it, but don't have access to some beefy compute with 128+ GB RAM and 16+ CPU to do the CUDA build in a reasonable timeframe (going to be investing in an Epyc rackmount rig eventually, as my Nixbuild.net bill can get silly fast) https://hydra.nixos-cuda.org/build/824/nixlog/71 | 13:31:47 | |
| Had a quick look while on the Tube, failure cascade stems from FlashMLA build | 15:47:38 | |
| I think vLLM v0.11.0 CUDA build fails because FlashMLA's SM100 (Blackwell) code requires CUTLASS v4.2.1+ APIs like make_counting_tensor, but the derivation uses CUTLASS v4.0.0. So I just disable the problematic SM100 kernels, builds chugging along, seems to have got further than the one on Hydra, going to get a PR up now. | 21:45:40 | |
| * I think vLLM v0.11.0 CUDA build fails because FlashMLA's SM100 (Blackwell) code requires CUTLASS v4.2.1+ APIs like make_counting_tensor, but the derivation uses CUTLASS v4.0.0. So I just disabled the problematic SM100 kernels, build's chugging along, seems to have got further than the one on Hydra, going to get a PR up now. | 21:46:01 | |
| I didn't bump cutlass because vllm 0.11.0 still uses v4.0.0: https://github.com/vllm-project/vllm/blob/v0.11.0/CMakeLists.txt#L273 | 21:53:29 | |
| Now, maybe, bumping it nonetheless could fix the issue. | 21:53:40 | |
| v0.11.0 uses CUTLASS v4.0.0, their next version will bump it | 21:56:35 | |
| see https://github.com/vllm-project/vllm/commit/5234dc74514a6b3d0740b39f56a4a4208ec86ecc (part of https://github.com/vllm-project/vllm/pull/24673) | 21:58:06 | |
| we could sort of "backport" the version though, yeah | 21:58:23 | |
| as a fix | 21:58:27 | |
| * yeah see https://github.com/vllm-project/vllm/commit/5234dc74514a6b3d0740b39f56a4a4208ec86ecc (part of https://github.com/vllm-project/vllm/pull/24673) | 21:58:46 | |
| * yep v0.11.0 uses CUTLASS v4.0.0, their next version will bump it | 21:58:58 | |
| | 22:28:21 | |
| draft PR, I'm building, will take a few hours, will check in the morning | 22:35:48 | |
| * draft PR, I'm building, will take a few hours, will check in the morning https://github.com/NixOS/nixpkgs/pull/448965 | 22:35:55 | |
| * draft PR, my machine's building it, will take a few hours, will check in the morning https://github.com/NixOS/nixpkgs/pull/448965 | 22:37:12 | |
| Interesting, when I was building vllm 0.11 yesterday I mistakenly took 4.2.1 from CMakeLists.txt in master, and I've been inferencing with it ever since. I have cudaCapabilities 8.9, an "old" 4090 w/ 24 GB; compiling takes ~45 min on my i9 13900. IMO, instead of disabling building for sm100 I'd rather bump cutlass | 23:26:51 | |
| Indeed. My current config to try to steer it away from downloading models, while at the same time reasonably "caging it" in systemd, has so many variables. I'm curious to know how you run vllm; here's a "Claude Code-extracted" version of what I use on my machines https://gist.github.com/longregen/e8146a3e34fb7f114b2da43ffa0d8023#file-configuration-nix-L25 | 23:36:40 | |
| 6 Oct 2025 | ||
| Wow, this is great to see, personal AI for the people! Thanks for sharing, I'll definitely be referring to it | 06:20:02 | |
| RE: Diffing for Just chatted with Gaétan Lepage about "checked-in lists vs IFD vs pure eval diffing". Previously expressed my feelings in the context of ROCm here: https://github.com/NixOS/nixpkgs/pull/446976#issuecomment-3353986656. TL;DR: no diffing > pure eval > IFD > checked-in codegen lists (although vcunat suggests no-diffing may be infeasible) | 09:56:44 | |
| Is there an example of acceptably fast diffing around somewhere? I landed on a checked-in diff because I couldn't work out how to make it fast, and hexa had already tried a no-diff jobset. | 10:18:21 | |
| Not that I'm aware of. Gaetan pointed to This one eval-level flat list routine I'd describe as "painfully slow": https://github.com/SomeoneSerge/nixpkgs-cuda-ci/blob/abee609531807217495cd15e6ced14ad0dee5d18/nix/utils.nix#L73-L85 | 10:24:41 | |
| * Not that I'm aware of. Gaetan pointed to This one eval-level flat list routine I'd describe as "painfully slow": https://github.com/SomeoneSerge/nixpkgs-cuda-ci/blob/abee609531807217495cd15e6ced14ad0dee5d18/nix/utils.nix#L73-L85. Probably could be made less sequential | 10:25:22 | |
| Build fails with a simple CUTLASS bump https://github.com/NixOS/nixpkgs/pull/448965#issuecomment-3370979611 I suspect yours succeeded because you're using a | 10:49:58 | |
| Yes, probably that! | 11:20:56 | |
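
A note on the "diffing" discussed on 6 Oct (09:56 to 10:25): the following is a minimal sketch, assuming that diffing here means comparing derivation paths between a baseline evaluation and a CUDA-enabled evaluation, so that only the packages that actually differ get built. The function name, the input format (attribute name mapped to `.drv` path), and the store paths are all hypothetical; nothing here is taken from nixpkgs-cuda-ci or from any Hydra jobset.

```python
from typing import Dict, List


def changed_attrs(baseline: Dict[str, str], variant: Dict[str, str]) -> List[str]:
    """Return attribute names whose derivation path differs from the baseline.

    Both arguments map an attribute name to its .drv store path, as might be
    collected from two evaluations of the same package set (assumed format).
    """
    return sorted(
        attr
        for attr, drv in variant.items()
        if baseline.get(attr) != drv  # new attribute, or a different derivation
    )


# Made-up store paths, for illustration only.
baseline = {
    "hello": "/nix/store/aaaa-hello-2.12.drv",
    "vllm": "/nix/store/bbbb-vllm-0.11.0.drv",
}
with_cuda = {
    "hello": "/nix/store/aaaa-hello-2.12.drv",
    "vllm": "/nix/store/cccc-vllm-0.11.0.drv",
    "flashmla": "/nix/store/dddd-flashmla.drv",
}

print(changed_attrs(baseline, with_cuda))  # ['flashmla', 'vllm']
```

The comparison itself is cheap; the cost raised in the chat presumably lies in producing the two evaluations, which this sketch does not address.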