
NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

26 Apr 2025
@hexa:lossy.networkhexa
❯ objdump -x result/lib/libonnxruntime.so | grep -A1 "STACK off"
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rwx
19:52:55
@hexa:lossy.networkhexa
❯ objdump -x result/lib/libonnxruntime.so | grep -A1 "STACK off"
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
19:53:01
@hexa:lossy.networkhexa implies that systemd units that depend on onnxruntime and set MemoryDenyWriteExecute need to be updated to allow it 19:53:33
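For context, relaxing that hardening option for an affected unit would look roughly like this NixOS sketch (the service name my-onnx-service is hypothetical):

  { lib, ... }:
  {
    # The rwx GNU_STACK segment in libonnxruntime.so conflicts with
    # MemoryDenyWriteExecute, so relax the setting for the unit that
    # loads the library (mkForce overrides the unit's existing value).
    systemd.services.my-onnx-service.serviceConfig.MemoryDenyWriteExecute = lib.mkForce false;
  }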
@connorbaker:matrix.orgconnor (he/him)

I don’t know if anyone else uses torchmetrics, but if you’re wondering why using DISTS is so freaking slow, it’s because they create a new instance of the model every time you call it: https://github.com/Lightning-AI/torchmetrics/blob/60e7686c97c14a4286825ec23187b8629f825d15/src/torchmetrics/functional/image/dists.py#L176

I tried just creating the model once and using it directly, and it is much faster, but something about doing that causes a memory leak which makes training OOM eventually :(

At any rate, it’s not the packaging’s fault, woohoo

19:58:30
29 Apr 2025
@connorbaker:matrix.orgconnor (he/him)finally started writing more docs (https://github.com/ConnorBaker/cuda-packages/blob/main/doc/language-frameworks/cuda.section.md) and moving some new package expressions (cuda-python, cutlass, flash-attn, modelopt, pyglove, schedulefree, transformer-engine) to my public repo (https://github.com/ConnorBaker/cuda-packages/tree/main/pkgs/development/python-modules)04:50:56
@ygt:matrix.org@ygt:matrix.org left the room.23:42:49
1 May 2025
@connorbaker:matrix.orgconnor (he/him)God I need to finish arrayUtilities so I can start landing CUDA setup hooks19:07:13
@oak:universumi.fioak 🏳️‍🌈♥️ changed their display name from oak - mikatammi.fi to oak 🫱⭕🫲.23:18:34
@connorbaker:matrix.orgconnor (he/him) Kevin Mittman: is it intentional that the CUDA 12.9 docs (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id7) say they require a driver version >=575.51.03 for 12.9, but the latest release is 575.51.02 (https://download.nvidia.com/XFree86/Linux-x86_64/575.51.02/)? 23:27:12
@justbrowsing:matrix.orgKevin Mittman
In reply to @connorbaker:matrix.org
Kevin Mittman: is it intentional that the CUDA 12.9 docs (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id7) say they require a driver version >=575.51.03 for 12.9, but the latest release is 575.51.02 (https://download.nvidia.com/XFree86/Linux-x86_64/575.51.02/)?
CUDA 12.9.0 ships with driver 575.51.03
what you are seeing is a separate release from GeForce BU
23:29:15
@connorbaker:matrix.orgconnor (he/him)It's also newer than the open kernel modules then?23:29:42
@justbrowsing:matrix.orgKevin MittmanExperiencing technical difficulties23:30:00
2 May 2025
@luke-skywalker:matrix.orgluke-skywalker is there a way to pick up on the nvidia-container-toolkit-tools directory containing the runtimes at build time? for example /nix/store/72bp8mb7zzpjifcwasj5wh45ixasmck7-nvidia-container-toolkit-1.17.6-tools 10:36:29
@ss:someonex.netSomeoneSerge (Ever OOMed by Element)getOutput at eval time13:25:02
4 May 2025
@luke-skywalker:matrix.orgluke-skywalkerInteresting. Damn, working through the NixOS manual is still on my bucket list.10:27:03
@luke-skywalker:matrix.orgluke-skywalker

took me a few looks but

"${getOutput "tools" pkgs.nvidia-container-toolkit}" 

is actually pretty straightforward.

20:05:52
@luke-skywalker:matrix.orgluke-skywalker🙏thx a lot20:06:01
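For reference, a sketch of how that expression might be wired into a module; the service and variable names below are illustrative, not existing options:

  { pkgs, lib, ... }:
  let
    inherit (lib) getOutput;
    # Evaluates to the store path of the "tools" output, e.g.
    # /nix/store/...-nvidia-container-toolkit-<version>-tools
    toolsDir = getOutput "tools" pkgs.nvidia-container-toolkit;
  in
  {
    # Hypothetical consumer: hand the runtimes directory to a service.
    systemd.services.my-gpu-service.environment.NVIDIA_CTK_TOOLS_DIR = "${toolsDir}";
  }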
5 May 2025
@glepage:matrix.orgGaétan Lepage Hi there,
I'm working on bumping pytorch to 2.7.0. They now require libcufile.so. Are you aware of this library? Is it already available in nixpkgs?
12:26:45
@glepage:matrix.orgGaétan Lepage -> cudaPackages.libcufile FTW. 12:35:28
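A hedged sketch of how that dependency could be added on the packaging side, e.g. via an overlay on an existing torch build; the actual 2.7.0 bump in nixpkgs may do this differently:

  # Illustrative overlay, not the real PR diff.
  final: prev: {
    python3Packages = prev.python3Packages.overrideScope (pyFinal: pyPrev: {
      torch = pyPrev.torch.overrideAttrs (old: {
        # libcufile provides libcufile.so (GPUDirect Storage), newly
        # required by pytorch 2.7.0.
        buildInputs = (old.buildInputs or [ ]) ++ [ final.cudaPackages.libcufile ];
      });
    });
  }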
6 May 2025
@connorbaker:matrix.orgconnor (he/him)https://github.com/NixOS/nixpkgs/pull/385960 is finally ready for review. I am so tired of writing tests00:48:54
@connorbaker:matrix.orgconnor (he/him):L looks like OpenCV 4.11 doesn't build with CUDA 12.9 due to changes in libnpp https://gist.github.com/ConnorBaker/f258bfa80c82f92c34850482117fa00f03:44:18
@connorbaker:matrix.orgconnor (he/him) SomeoneSerge (UTC+U[-12,12]): would you mind reviewing / merging https://github.com/NixOS/nixpkgs/pull/404686?
Wanted to get file moves out of the way since setup hook changes will cause rebuilds
17:00:31
7 May 2025
@ss:someonex.netSomeoneSerge (Ever OOMed by Element)

connor (he/him) (UTC-7): RE: manifest data observability

I'll be pushing stuff here as I extend the schema https://cuda-index.someonex.net/sidx/CudaComponent

11:26:34
@connorbaker:matrix.orgconnor (he/him)God that’s so cool13:05:35
@connorbaker:matrix.orgconnor (he/him) SomeoneSerge (UTC+U[-12,12]): I've got another one for you https://github.com/NixOS/nixpkgs/pull/404973 16:41:25
@connorbaker:matrix.orgconnor (he/him)As an added benefit, one could float out imports so that only happens once per nixpkgs instantiation16:42:23
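A small illustration of that point, with hypothetical file and function names: work done inside a per-package function is redone on every call, while a binding floated out of the function is evaluated once per nixpkgs instantiation and shared:

  let
    # Floated out: the manifest is read and parsed once per evaluation.
    manifest = builtins.fromJSON (builtins.readFile ./manifest.json);

    # If the binding above lived inside this function instead, every call
    # would re-read and re-parse the JSON.
    mkCudaPackage = name: manifest.${name};
  in
    mkCudaPackage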
@ss:someonex.netSomeoneSerge (Ever OOMed by Element)If you're not using Adderall idk what you're using xD16:57:02
@ss:someonex.netSomeoneSerge (Ever OOMed by Element)So fast16:57:06
@connorbaker:matrix.orgconnor (he/him) Well, that one's sorta broken at the moment because cuda-library-samples doesn't mark derivations as broken if they rely on something unavailable in the current version of the package set 17:12:37
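The missing guard would look something like this sketch, with libfoo standing in for whichever library a given sample needs:

  # Hypothetical sample derivation: mark it broken when the cudaPackages
  # set in use does not provide the library it exercises, instead of
  # letting the build fail.
  { lib, stdenv, cudaPackages }:

  stdenv.mkDerivation {
    pname = "libfoo-samples";
    version = "0-unstable";
    # src and build phases omitted in this sketch
    buildInputs = lib.optionals (cudaPackages ? libfoo) [ cudaPackages.libfoo ];
    meta.broken = !(cudaPackages ? libfoo);
  }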
@connorbaker:matrix.orgconnor (he/him)I mean yeah for attention so I can survive my commute; the CUDA stuff is mostly because I think about it every waking moment. (Although I also usually have nightmares about it, too...)17:51:11
