26 Apr 2025 |
hexa | ❯ objdump -x result/lib/libonnxruntime.so | grep -A1 "STACK off"
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rwx
| 19:52:55 |
hexa | ❯ objdump -x result/lib/libonnxruntime.so | grep -A1 "STACK off"
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
| 19:53:01 |
hexa | implies that systemd units that depend on onnxruntime and set MemoryDenyWriteExecute need to be updated to allow the executable stack | 19:53:33 |
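A minimal NixOS sketch of that update, with a hypothetical unit name, might look like this:

{ lib, ... }:
{
  # hypothetical unit name; relax the hardening only while the packaged
  # libonnxruntime.so requests an executable stack (GNU_STACK flags rwx)
  systemd.services.my-onnx-service.serviceConfig.MemoryDenyWriteExecute = lib.mkForce false;
}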
connor (he/him) | I don’t know if anyone else uses torchmetrics, but if you’re wondering why using DISTS is so freaking slow, it’s because they create a new instance of the model every time you call it: https://github.com/Lightning-AI/torchmetrics/blob/60e7686c97c14a4286825ec23187b8629f825d15/src/torchmetrics/functional/image/dists.py#L176
I tried just creating the model once and using it directly, and it is much faster, but something about doing that causes a memory leak which makes training OOM eventually :(
At any rate, it’s not the packaging’s fault, woohoo | 19:58:30 |
29 Apr 2025 |
connor (he/him) | finally started writing more docs (https://github.com/ConnorBaker/cuda-packages/blob/main/doc/language-frameworks/cuda.section.md) and moving some new package expressions (cuda-python, cutlass, flash-attn, modelopt, pyglove, schedulefree, transformer-engine) to my public repo (https://github.com/ConnorBaker/cuda-packages/tree/main/pkgs/development/python-modules) | 04:50:56 |
1 May 2025 |
connor (he/him) | God I need to finish arrayUtilities so I can start landing CUDA setup hooks | 19:07:13 |
connor (he/him) | Kevin Mittman: is it intentional that the CUDA 12.9 docs (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id7) say they require a driver version >=575.51.03 for 12.9, but the latest release is 575.51.02 (https://download.nvidia.com/XFree86/Linux-x86_64/575.51.02/)? | 23:27:12 |
Kevin Mittman | In reply to connor (he/him): CUDA 12.9.0 ships with driver 575.51.03
what you are seeing is a separate release from the GeForce BU | 23:29:15 |
connor (he/him) | It's also newer than the open kernel modules then? | 23:29:42 |
Kevin Mittman | Experiencing technical difficulties | 23:30:00 |
2 May 2025 |
luke-skywalker | is there a way to pick up on the nvidia-container-toolkit-tools directory containing the runtimes at build time? for example /nix/store/72bp8mb7zzpjifcwasj5wh45ixasmck7-nvidia-container-toolkit-1.17.6-tools | 10:36:29 |
SomeoneSerge (Ever OOMed by Element) | getOutput at eval time | 13:25:02 |
4 May 2025 |
luke-skywalker | Interesting. Damn, working through the NixOS manual is still on my bucket list. | 10:27:03 |
luke-skywalker | took me a few looks, but
"${getOutput "tools" pkgs.nvidia-container-toolkit}"
is actually pretty straightforward.
| 20:05:52 |
luke-skywalker | 🙏thx a lot | 20:06:01 |
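For context, a minimal sketch of using that interpolation in practice, assuming lib and pkgs are in scope (the script name is an assumption, not from the discussion):

# hypothetical: bake the tools output path, which contains the runtimes, into a script at build time
pkgs.writeShellScriptBin "show-nvidia-ctk-tools" ''
  echo "${lib.getOutput "tools" pkgs.nvidia-container-toolkit}"
''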
5 May 2025 |
Gaétan Lepage | Hi there,
I'm working on bumping PyTorch to 2.7.0. It now requires libcufile.so. Are you aware of this library? Is it already available in nixpkgs? | 12:26:45 |
Gaétan Lepage | -> cudaPackages.libcufile FTW. | 12:35:28 |
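A minimal sketch of consuming it, assuming torch is built from source with CUDA support and that python3Packages and cudaPackages are in scope (the attribute name is illustrative):

# illustrative override adding the new cuFile dependency for the 2.7.0 bump
torchWithCufile = python3Packages.torch.overrideAttrs (old: {
  buildInputs = (old.buildInputs or [ ]) ++ [ cudaPackages.libcufile ];
});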
6 May 2025 |
connor (he/him) | https://github.com/NixOS/nixpkgs/pull/385960 is finally ready for review
I am so tired of writing tests | 00:48:54 |
connor (he/him) | :L
looks like OpenCV 4.11 doesn't build with CUDA 12.9 due to changes in libnpp https://gist.github.com/ConnorBaker/f258bfa80c82f92c34850482117fa00f | 03:44:18 |
connor (he/him) | SomeoneSerge (UTC+U[-12,12]): would you mind reviewing / merging https://github.com/NixOS/nixpkgs/pull/404686?
Wanted to get file moves out of the way since setup hook changes will cause rebuilds | 17:00:31 |
7 May 2025 |
SomeoneSerge (Ever OOMed by Element) | connor (he/him) (UTC-7): RE: manifest data observability
I'll be pushing stuff here as I extend the schema https://cuda-index.someonex.net/sidx/CudaComponent
| 11:26:34 |
connor (he/him) | God that’s so cool | 13:05:35 |
connor (he/him) | SomeoneSerge (UTC+U[-12,12]): I've got another one for you https://github.com/NixOS/nixpkgs/pull/404973 | 16:41:25 |
connor (he/him) | As an added benefit, one could float out imports so that the import only happens once per nixpkgs instantiation | 16:42:23 |
SomeoneSerge (Ever OOMed by Element) | If you're not using Adderall, idk what you're using xD | 16:57:02 |
SomeoneSerge (Ever OOMed by Element) | So fast | 16:57:06 |
connor (he/him) | Well, that one's sorta broken at the moment because cuda-library-samples doesn't mark derivations as broken if they rely on something unavailable in the current version of the package set | 17:12:37 |
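A hypothetical sketch of the kind of guard that's missing, as a fragment of a sample's derivation (the attribute name is a placeholder):

# hypothetical: mark the sample broken when its dependency is absent from the
# cudaPackages set currently in scope
meta.broken = !(cudaPackages ? somelib);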
connor (he/him) | I mean, yeah, for attention so I can survive my commute; the CUDA stuff is mostly because I think about it every waking moment. (Although I usually have nightmares about it, too...) | 17:51:11 |