!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

289 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers


Sender | Message | Time
10 Apr 2025
@connorbaker:matrix.org connor (burnt/out) (UTC-8): (Where it should come from, meaning Nixpkgs and the runpath or the host OS, which is a non-starter on NixOS systems since we don’t package it; we could, but then people would need to rebuild to add it to their environment, ugh) 16:41:35
@connorbaker:matrix.org connor (burnt/out) (UTC-8): Oh? What’s that situation? 16:42:43
@ss:someonex.net SomeoneSerge (back on matrix): The situation is we'd like to develop and link a dynamic shim (libglvnd-like) that can select the right thing at runtime (per the logic you wrote down) 16:44:07
@ss:someonex.net SomeoneSerge (back on matrix): Nixpkgs breaks GL/Vulkan on NixOS when mixing revisions because we don't have this shim 16:45:01
@ss:someonex.net SomeoneSerge (back on matrix): Or, maybe, instead of the selection logic we need better isolation, i.e. libcapsule 16:49:41
@ss:someonex.net SomeoneSerge (back on matrix): In either case we need to learn to mimic another library's interface. This exists for GL, and afaict we (NixOS) still don't know whether it actually works. I've no idea if we can do this to libcuda or libcudart, or whether that would even be legal 16:51:01
11 Apr 2025
@saeedc:matrix.org saeedc joined the room. 20:55:59
12 Apr 2025
@oak:universumi.fi oak 🏳️‍🌈♥️ changed their display name from oak to oak - mikatammi.fi ÄÄNESTÄ. 12:11:39
@oak:universumi.fi oak 🏳️‍🌈♥️ changed their profile picture. 12:13:37
@oak:universumi.fi oak 🏳️‍🌈♥️ changed their display name from oak - mikatammi.fi ÄÄNESTÄ to oak - mikatammi.fi. 12:56:11
13 Apr 2025
@ereslibre:ereslibre.social ereslibre joined the room. 11:43:29
@ereslibre:ereslibre.social ereslibre

Hi everyone! I am looking at a bug we have with CDI (Container Device Interface, for forwarding GPUs to containers): https://github.com/NixOS/nixpkgs/issues/397065

I think the user has a correct configuration (unless there are settings that were not mentioned in the issue). My main question is why, when using the datacenter driver, nvidia-container-toolkit reports:

ERRO[0000] failed to generate CDI spec: failed to create device CDI specs: failed to initialize NVML: ERROR_LIBRARY_NOT_FOUND

Do you have any idea why NVML would not be present in this environment?

11:45:34
@ss:someonex.net SomeoneSerge (back on matrix)

Hi! I've a small announcement to make.

I've been failing badly to keep up with the backlog as a maintainer, even though I've recently been able to spend some more time on Nixpkgs &c. Working in occasional 1:1 meetings, otoh, has always felt comparatively productive. We've just had another call with Gaétan Lepage and I found it nice, so I now want to try the following: https://md.someonex.net/s/9S4E00sIb#

This is not exactly "official": I'm not posting this on Discourse, for example, until I'm more confident, but as such it's an open invitation.

14:52:27
@glepage:matrix.org Gaétan Lepage: Indeed, it was great! We were able to finally finish fixing mistral-rs's cuda support! 15:24:02
15 Apr 2025
@ereslibre:ereslibre.social ereslibre: BTW folks, if you have a moment, I'd love to get this one merged: https://github.com/NixOS/nixpkgs/pull/367769 06:26:28
@ss:someonex.net SomeoneSerge (back on matrix): connor (he/him) (UTC-7): did you use something like josh for cuda-legacy? I suspect this produced at least a few pings 😅 13:35:27
@connorbaker:matrix.org connor (burnt/out) (UTC-8): I used https://github.com/newren/git-filter-repo. What would have pinged people? 13:37:06
@ss:someonex.net SomeoneSerge (back on matrix): User handles in commit messages xD 13:37:28
@ereslibre:ereslibre.social ereslibre: Hi! Since https://github.com/NixOS/nixpkgs/pull/362197 recently had conflicts due to the treewide formatting, I closed it and reopened it at https://github.com/NixOS/nixpkgs/pull/398993. I think we can merge this one too. 21:24:34
@ereslibre:ereslibre.social ereslibre: We have been going back and forth with the author for a while, and I thought it would be good to go ahead on our side. 21:25:10
@ereslibre:ereslibre.social ereslibre: Thanks! 21:29:32
17 Apr 2025
@luke-skywalker:matrix.org luke-skywalker joined the room. 09:38:30
@luke-skywalker:matrix.org luke-skywalker

Is this the right place to ask questions / get pointers on how to properly set up the CUDA container toolkit?

For Docker it seems to work when enabling the deprecated enableNvidia = true; flag. However, with nvidia-container-toolkit in systemPackages, with or without hardware.nvidia-container-toolkit.enable = true;, I cannot seem to get it to run...

11:01:34
@luke-skywalker:matrix.org luke-skywalker: Was not lucky at all with containerd for k3s 11:02:09
@luke-skywalker:matrix.org luke-skywalker: For anybody stumbling over this: I'm pretty sure I'm on the right track using CDI, having it work with Docker (& Compose). Should have read the docs properly. The relevant section from the NixOS CUDA docs that got me here was all the way at the bottom: https://nixos.wiki/wiki/Nvidia#NVIDIA%20Docker%20not%20Working 14:38:50
@luke-skywalker:matrix.org luke-skywalker: From all I understand, this gives a lot more flexibility to pass accelerators of different vendors to containerized workloads 🥳 14:39:36
@ss:someonex.net SomeoneSerge (back on matrix): Yes, CDI is the supported way (and has received a lot of care from @ereslibre); enableNvidia relies on end-of-life runtime wrappers 16:18:38
@ss:someonex.net SomeoneSerge (back on matrix)

Should have read the docs properly. The relevant section from

Did you manage to get containerd to work?

16:20:27


Room Version: 9