| 10 Apr 2025 |
connor (burnt/out) (UTC-8) | x86 has a cuda_compat library too, from what I remember; it’s just not available as a redist
So maybe we shouldn’t package the one for Jetsons
And instead, nixglhost should use the one on the host system if it is available | 16:36:17 |
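The idea of reusing a host-provided cuda_compat could be sketched as a probe over well-known install locations. A minimal sketch, assuming these candidate paths and assuming `libcuda.so.1` is the file to look for — the function name and path list are hypothetical, not anything nixglhost actually ships:

```python
import os

# Candidate locations where a distro-provided cuda_compat package
# typically installs its forward-compat driver libraries (this list
# is an assumption, not exhaustive):
CANDIDATE_DIRS = [
    "/usr/local/cuda/compat",
    "/usr/local/cuda-12/compat",
]

def find_host_cuda_compat(dirs=CANDIDATE_DIRS):
    """Return the first directory containing a compat libcuda, else None."""
    for d in dirs:
        if os.path.isfile(os.path.join(d, "libcuda.so.1")):
            return d
    return None
```

If a directory is found, it could be prepended to the library search path; if not, nixglhost would fall back to whatever the host driver provides.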
connor (burnt/out) (UTC-8) | Although that won’t help us on NixOS systems — cuda_compat is usually provided as a Debian package with newer releases of CUDA, so it would just fail to run on NixOS systems if the driver isn’t new enough | 16:37:48 |
SomeoneSerge (back on matrix) |
So maybe we shouldn’t package the one for Jetsons
No, I think whenever it's available we'd rather do the pure linking, because that's what we do for other libraries. This is in general a tradeoff: it would have been great if we had tools for quickly relinking stuff, or for building against reproducible content-addressed stubs with a separate linking phase, but that's not where we are
| 16:39:50 |
connor (burnt/out) (UTC-8) | Ugh
So on all platforms, we should only use cuda_compat if the host driver is old and we need forward compat
I guess the question is where cuda_compat should come from, if the decision to use it or not requires knowing what version the host driver is | 16:40:10 |
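The selection logic described above (only fall back to cuda_compat when the host driver is too old for the CUDA runtime in use) could look roughly like this. The function names, the sysfs path, and the dotted-version format are all assumptions for illustration:

```python
def parse_version(v):
    """Parse a dotted driver version like '550.54.14' into a tuple."""
    return tuple(int(x) for x in v.split("."))

def needs_cuda_compat(host_driver, min_required):
    """True if the host driver is older than what the CUDA runtime
    needs, i.e. forward compatibility via cuda_compat is required."""
    return parse_version(host_driver) < parse_version(min_required)

def read_host_driver_version(path="/sys/module/nvidia/version"):
    """Best-effort read of the loaded NVIDIA driver version on Linux;
    returns None when the module is not loaded (the path is an
    assumption about where the kernel module exposes it)."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None
```

The awkward part, as noted above, is that this check has to run on the host at launch time, so the decision cannot be baked into the closure at build time.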
SomeoneSerge (back on matrix) | This is not different from the GL/vulkan situation | 16:41:29 |
connor (burnt/out) (UTC-8) | (Where it should come from meaning Nixpkgs and the runpath, or from the host OS; the latter is a non-starter on NixOS systems since we don’t package it. Although we could, but then for people to add it to their environment they’d need to rebuild, ugh) | 16:41:35 |
connor (burnt/out) (UTC-8) | Oh? What’s that situation? | 16:42:43 |
SomeoneSerge (back on matrix) | The situation is we'd like to develop and link a dynamic shim (libglvnd-like) that can select the right thing at runtime (per the logic you wrote down) | 16:44:07 |
SomeoneSerge (back on matrix) | Nixpkgs breaks GL/Vulkan on NixOS when mixing revisions because we don't have this shim | 16:45:23 |
SomeoneSerge (back on matrix) | Or, maybe, instead of the selection logic we need better isolation. I.e. libcapsule | 16:49:41 |
SomeoneSerge (back on matrix) | In either case we need to learn to mimic another library's interface. This exists for GL and afaict we (nixos) still don't know whether this actually works. I've no idea if we can do this to libcuda or libcudart, or whether that would be even legal | 16:51:01 |
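The runtime-selection behaviour such a shim would need can be illustrated with a toy dlopen loop: try each candidate library in preference order and use the first one that loads. This is only a sketch of the idea, not how libglvnd or libcapsule actually work, and the function name is hypothetical:

```python
import ctypes

def try_load(candidates):
    """Return the first library handle that dlopen() accepts, else None.
    A toy stand-in for the runtime selection a libglvnd-like shim would
    perform between, say, a host libcuda and a forward-compat one."""
    for name in candidates:
        if not name:
            continue
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None
```

A real shim would additionally have to re-export the chosen library's entire symbol interface, which is exactly the hard (and possibly legally murky) part discussed above.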
| 11 Apr 2025 |
| saeedc joined the room. | 20:55:59 |
| 13 Apr 2025 |
| ereslibre joined the room. | 11:43:29 |
ereslibre | Hi everyone! I am looking at a bug we have with CDI (Container Device Interface, for forwarding GPUs to containers): https://github.com/NixOS/nixpkgs/issues/397065
I think the user has a correct configuration (unless there are settings that were not mentioned in the issue). My main question is: when using the datacenter driver, why is nvidia-container-toolkit reporting:
ERRO[0000] failed to generate CDI spec: failed to create device CDI specs: failed to initialize NVML: ERROR_LIBRARY_NOT_FOUND
Do you have any idea on why NVML would not be present in this environment?
| 11:45:34 |
SomeoneSerge (back on matrix) | Hi! I've a small announcement to make.
I've been failing badly to keep up with the backlog as a maintainer, even though I've recently been able to spend some more time on Nixpkgs &c. Working in occasional 1:1 meetings, otoh, has always felt comparatively productive. We've just had another call with Gaétan Lepage and I found it nice, so I now want to try the following: https://md.someonex.net/s/9S4E00sIb#
This is not exactly "official", I'm not posting this e.g. on Discourse until I'm more confident, but as such it's an open invitation.
| 14:52:27 |
Gaétan Lepage | Indeed, it was great! We were able to finally finish fixing mistral-rs's cuda support! | 15:24:22 |
| 15 Apr 2025 |
ereslibre | BTW folks, if you have a moment, I'd love to get this one merged: https://github.com/NixOS/nixpkgs/pull/367769 | 06:26:28 |