| 6 Feb 2026 |
Gaétan Lepage | Indeed, I don't think so. | 22:35:43 |
| 8 Feb 2026 |
hexa (UTC+1) | where can I find libnvidia-ml.so.1? | 03:09:43 |
hexa (UTC+1) | * where can I find libnvidia-ml.so.1 used by py3nvml? | 03:09:50 |
hexa (UTC+1) | nvm … /nix/store/9g9zb0r0hk63fm1xq8582bgjd8d69k0k-nvidia-x11-580.119.02-6.12.68/lib/libnvidia-ml.so.1 | 03:10:49 |
Robbie Buxton | In reply to @hexa:lossy.network where can I find libnvidia-ml.so.1? This is an NVIDIA kernel-driver library, so if you aren't on NixOS you need to get it from wherever you installed the driver on the host | 03:37:00 |
Robbie Buxton | But looks like you found it! | 03:37:12 |
hexa (UTC+1) | it is below the driverLink path | 03:38:20 |
Robbie Buxton | Yeah on nixos iirc it’s symlinked into /run/opengl-driver/lib if I’m not mistaken | 03:39:39 |
hexa (UTC+1) | correct | 03:40:14 |
hexa (UTC+1) | addDriverRunpath.driverLink is the relevant attribute | 03:40:24 |
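[Editor's note: the exchange above can be sketched concretely. On NixOS, the userspace NVIDIA driver libraries are symlinked under the stable path /run/opengl-driver/lib, and nixpkgs derivations reach them through the `addDriverRunpath` setup hook, whose `driverLink` attribute points at /run/opengl-driver. A minimal, non-authoritative sketch; `addDriverRunpath` and `driverLink` are real nixpkgs attributes, while the package itself ("nvml-consumer") is hypothetical.]

```nix
# Sketch only: a derivation that dlopens libnvidia-ml.so.1 at run time.
{ stdenv, addDriverRunpath }:

stdenv.mkDerivation {
  pname = "nvml-consumer";
  version = "0.1";
  # The addDriverRunpath setup hook appends driverLink's lib directory
  # (/run/opengl-driver/lib on NixOS) to each ELF's runpath during
  # fixupPhase, so dlopen("libnvidia-ml.so.1") resolves against the
  # driver version actually installed on the host.
  nativeBuildInputs = [ addDriverRunpath ];
}
```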
| kaya 𖤐 changed their profile picture. | 22:50:15 |
Gaétan Lepage | After some testing, our current torch version (2.9.0) does build against cuda 13.0, but not cuda 13.1:
/nix/store/42f8i6v4gfkvdimy9aczwqik3scl6dpw-cuda13.1-cuda_cccl-13.1.115/include/cub/device/dispatch/dispatch_radix_sort.cuh(1425): error: no operator "+=" matches these operands
operand types are: at::native::<unnamed>::offset_t += const int64_t
end_offsets_current_it += num_current_segments;
Context: https://github.com/NixOS/nixpkgs/pull/486717 | 23:01:20 |
Gaétan Lepage | I'll try to ship torch 2.10.0 ASAP, hoping that it is compatible with cuda 13.1 (which should unfortunately * | 23:02:38 |
Gaétan Lepage | * I'll try to ship torch 2.10.0 ASAP, hoping that it is compatible with cuda 13.1 (which should unfortunately not be the case). | 23:02:57 |
| @niten:fudo.im left the room. | 23:07:13 |
| 9 Feb 2026 |
Benjamin Isbarn | I'm not using any overlay for that purpose right now. Good point regarding the global override, will do that ;). So cudaCapabilities would affect packages like cudart, cublas, etc., i.e. which features they consider available and use? Then in theory this should yield better performance for the aforementioned libraries? | 07:03:05 |
connor (burnt/out) (UTC-8) | Read https://nixos.org/manual/nixpkgs/stable/#cuda -- Jetson isn't built by default and pre-thor uses different binaries so you need to make sure cudaCapabilities is set correctly; you'll get faster builds, smaller closures, and (possibly) better performance if you specify the exact capability | 07:36:40 |
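[Editor's note: the advice above can be made concrete. Per the nixpkgs manual's CUDA section, `cudaSupport` and `cudaCapabilities` are set through the nixpkgs `config` attribute set. A hedged sketch; the capability "8.6" is only an example value for an Ampere consumer GPU, and Jetson devices need their own capability values as discussed above.]

```nix
# Sketch: importing nixpkgs with CUDA enabled and capabilities pinned.
# config.cudaSupport and config.cudaCapabilities are real nixpkgs options.
import <nixpkgs> {
  config = {
    allowUnfree = true;
    cudaSupport = true;
    # Building only for the exact target capability gives smaller
    # closures, faster builds, and avoids generic fallback code paths.
    cudaCapabilities = [ "8.6" ];
  };
}
```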
| SolitudeAlma joined the room. | 07:49:25 |
Gaétan Lepage | VLLM is now 0.15.1 (latest version) | 11:11:17 |
cameronraysmith | SomeoneSerge (back on matrix): let me know if the updates to https://github.com/NixOS/nixpkgs/pull/488199 captured what you suggested. No rush: thanks! | 21:03:44 |
Gaétan Lepage | connor (burnt/out) (UTC-8) SomeoneSerge (back on matrix)
This PR should fix the last failing gpuCheck instance, i.e. python3Packages.triton.gpuCheck: https://github.com/NixOS/nixpkgs/pull/488887
I discovered one of our beloved dlopen instance in triton. We didn't know about it since then... This PR fixes it too. | 23:39:42 |
Gaétan Lepage | * connor (burnt/out) (UTC-8) SomeoneSerge (back on matrix)
This PR should fix the last failing gpuCheck instance, i.e. python3Packages.triton.gpuCheck: https://github.com/NixOS/nixpkgs/pull/488887
I discovered one of our beloved dlopen instances in triton. We didn't know about it until now... This PR fixes it too. | 23:55:34 |
| 10 Feb 2026 |
connor (burnt/out) (UTC-8) | Don't think I linked it here, maybe interesting for people with heavy eval jobs: https://github.com/ConnorBaker/nix-optimization | 01:41:36 |
| 11 Feb 2026 |
connor (burnt/out) (UTC-8) | Gaétan Lepage: there's a merge conflict and I need to rebase, but IIRC something like https://github.com/NixOS/nixpkgs/pull/485208 is necessary to make CUDA 13 the default. I still need to do the same for the PyCuda PR I have: https://github.com/NixOS/nixpkgs/pull/465047. Apologies that it's taking me so long. | 18:56:11 |
| 12 Feb 2026 |
Gaétan Lepage | Ok, thanks! I should get a notification once you've rebased. | 07:52:34 |
| 13 Feb 2026 |
| hoplopf joined the room. | 10:21:48 |
| 4 Aug 2022 |
| Winter (she/her) joined the room. | 03:26:42 |
Winter (she/her) | (hi, just came here to read + respond to this.) | 03:28:52 |
tpw_rules | hey. i had previously sympathized with samuela and like i said before had some of the same frustrations. i just edited my github comment to add "[CUDA] packages are universally complicated, fragile to package, and critical to daily operations. Nix being able to manage them is unbelievably helpful to those of us who work with them regularly, even if support is downgraded to only having an expectation of function on stable branches." | 03:29:14 |
Winter (she/her) | In reply to @tpw_rules:matrix.org i'm mildly peeved about a recent merging of something i maintain where i'm pretty sure the merger does not own the expensive hardware required to properly test the package. i don't think it broke anything but i was given precisely 45 minutes to see the notification before somebody merged it ugh, 45 minutes? that's... not great. not to air dirty laundry but did you do what samuela did in the wandb PR and at least say that that wasn't a great thing to do? (not sure how else to word that, you get what i mean) | 03:30:23 |