!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

289 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers

1 Jan 2025
@connorbaker:matrix.org connor (burnt/out) (UTC-8) * pushed the changes I had locally for nix-cuda-test (https://github.com/ConnorBaker/nix-cuda-test), if anyone wants to play with transformer engine or flash attention (both for PyTorch). I'll probably work on upstreaming those at some indeterminate point in time, but I don't know if they'll work with what's in-tree right now 03:59:24
@connorbaker:matrix.org connor (burnt/out) (UTC-8) SomeoneSerge (utc+3): are you aware of a clean, cross-platform way to handle patching the path to libcuda.so (as needed in https://github.com/NixOS/nixpkgs/pull/369495#issuecomment-2566002172)? Is it fair to assume that on non-NixOS systems, whatever wrapper people use (like nixGL or nixglhost) will add libcuda.so to LD_LIBRARY_PATH? 04:00:43
@connorbaker:matrix.org connor (burnt/out) (UTC-8) A new hope? https://www.phoronix.com/news/ZLUDA-v4-Released 11:14:50
@connorbaker:matrix.org connor (burnt/out) (UTC-8) https://github.com/NixOS/nixpkgs/pull/369956 13:22:40
@mjolnir:nixos.org NixOS Moderation Bot changed room power levels. 14:26:32
@connorbaker:matrix.org connor (burnt/out) (UTC-8) Thanks for the feedback Serge :) 20:37:37
5 Jan 2025
@techyporcupine:matrix.org @techyporcupine:matrix.org set a profile picture. 20:36:11
7 Jan 2025
@ruroruro:matrix.org ruro joined the room. 03:39:47
@ss:someonex.net SomeoneSerge (back on matrix) Hi, sorry if I missed anything, my homeserver was offline for a week 21:58:59
Jitsi widget removed by @ss:someonex.net SomeoneSerge (back on matrix) 21:59:20
@tomberek:matrix.org tomberek joined the room. 22:40:13
9 Jan 2025
@hexa:lossy.network hexa connor (he/him) (UTC-7): can we get your result of https://github.com/microsoft/onnxruntime/issues/22855 ported into nixpkgs as well? 02:08:59
@hexa:lossy.network hexa https://github.com/NixOS/nixpkgs/pull/364362 is still stuck on that issue 02:09:12
@hexa:lossy.network hexa oh, I see you're in the LA area 02:12:06
@hexa:lossy.network hexa take care! 02:12:08
@connorbaker:matrix.org connor (burnt/out) (UTC-8) Check out how I packaged https://github.com/ConnorBaker/cuda-packages/tree/main/cuda-packages/common/cudnn-frontend and https://github.com/ConnorBaker/cuda-packages/blob/main/cuda-packages/common/onnxruntime/package.nix 05:56:56
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
In reply to @hexa:lossy.network
oh, I see you're in the LA area
I’m about an hour south, so I’m fine apart from poor air quality
15:01:45
@hexa:lossy.network hexa torch and tensorboard on python3.13 https://github.com/NixOS/nixpkgs/pull/372406 15:33:12
@glepage:matrix.org Gaétan Lepage Thanks for handling that! 15:37:46
11 Jan 2025
@oak:universumi.fi oak 🏳️‍🌈♥️ removed their profile picture. 16:46:07
@hexa:lossy.network hexa can I get some guidance on cudaPackages? 02:24:33
@oak:universumi.fi oak 🏳️‍🌈♥️ set a profile picture. 16:47:04
@hexa:lossy.network hexa https://github.com/NixOS/nixpkgs/pull/364362 02:24:37
@hexa:lossy.network hexa

error: evaluation aborted with the following error message: 'lib.customisation.callPackageWith: Function called without required argument "cuda_cccl" at /nix/store/sj06sl54sc0rxlj0g52pd3pq3glyvpak-source/pkgs/development/cuda-modules/cudnn-frontend/default.nix:5'

02:24:46
@hexa:lossy.network hexa ok, guess I need to pick them out of cudaPackages 02:33:01
@ss:someonex.net SomeoneSerge (back on matrix) Odd, I thought we nuked old cuda releases that didn't have cuda_cccl? 11:47:40
@hexa:lossy.network hexa oh yeah, that would explain why I could build it just fine, but eval would fail 🙂 16:09:02
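
The evaluation error above is what callPackage reports when a package function asks for an argument that the package set it is called from does not provide. A minimal sketch of the two usual ways around it, assuming a recent nixpkgs on NIX_PATH whose default cudaPackages set contains cuda_cccl; the `demo` function is hypothetical and only stands in for a file such as cudnn-frontend/default.nix:

```nix
let
  # Assumption: <nixpkgs> resolves to a recent checkout.
  # CUDA packages are unfree, so allowUnfree is set alongside cudaSupport.
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;
      cudaSupport = true;
    };
  };

  # Hypothetical package function that, like the failing one, requires cuda_cccl.
  demo = { cuda_cccl }: cuda_cccl.name;
in
{
  # Option 1: call it from the CUDA package scope, which supplies cuda_cccl itself.
  fromScope = pkgs.cudaPackages.callPackage demo { };

  # Option 2: keep the top-level callPackage and pass the argument explicitly.
  explicit = pkgs.callPackage demo { inherit (pkgs.cudaPackages) cuda_cccl; };
}
```

Saved to a file, this can be checked with nix-instantiate --eval --strict, which should print the cuda_cccl derivation name for both attributes. A package set that lacks cuda_cccl, as SomeoneSerge suggests some older CUDA releases did, would still fail at evaluation in the same way as the error above.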
13 Jan 2025
@ruroruro:matrix.org ruro

Hi, everyone. In my experience, CUDA packages and CUDA-enabled packages when cudaSupport = true; are quite often broken in nixpkgs (more often than other packages).

For example, https://hydra.nix-community.org/jobset/nixpkgs/cuda/evals has a bunch of Eval Errors and build errors and I don't remember the last time that it was green (although some of those eval errors might not be indicative of actually broken packages).

I was thinking that we might be able to improve the situation by making general nixpkgs contributors more aware of it. For example, it would be pretty cool if we could track the nix-community hydra builds on status.nixos.org and on zh.fail (and try to include CUDA packages in future ZHF events).

Also, I understand why hydra.nixos.org doesn't build CUDA packages, but do you think that we could enable evaluation-only checks for CUDA packages on nixpkgs GitHub PRs and then build those PRs using the nix-community builders and report the results on the PR?

Finally, I was wondering if there is some canonical place to track/discuss CUDA-specific build failures in nixpkgs?

14:27:12
@ruroruro:matrix.org ruro It feels like fixing CUDA packages is currently "treadmill work", where some package gets fixed only for something else to get broken by unrelated changes in nixpkgs (because the current automation on GitHub PRs doesn't check CUDA-enabled versions of packages). 14:35:04
