| 12 Nov 2025 |
Daniel Fahey | Up to you, I'm in your debt already, I thought you were having fun messing about and learning | 17:53:34 |
Ari Lotter | oh yeah i meant like, to run review | 17:56:26 |
Ari Lotter | how big is this closure lmao | 17:56:30 |
Ari Lotter | im at 3gb downloaded so far | 17:56:34 |
Ari Lotter | oh there we go | 17:56:40 |
Ari Lotter | ok will try the review again heh | 17:57:13 |
Ari Lotter | huh. nixpkgs at b3d51a0365f6695e7dd5cdf3e180604530ed33b4 (based off nixos-unstable) seems to have something broken with torch / cuda / cudnn... torch 2.8 worked fine, but 2.9 with my same application code throws
[cuda:0]: cuDNN Frontend error: No valid engine configs for Matmul_MUL_ADD_Reduction_SUB_EXP_Reduction_LOG_ADD_DIV_Matmul_
{"engineId":2,"smVersion":900,"knobChoices":{"CUDNN_KNOB_TYPE_KERNEL_CFG":7}}
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: Could not open libnvrtc at: this->libnvrtc == nullptr
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: compiler.load()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: rtk(kernelNumRunning)->loadDLL()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: status == CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: ptr->isSupported()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: finalize_internal()
{"engineId":1,"smVersion":900,"knobChoices":{"CUDNN_KNOB_TYPE_KERNEL_CFG":3}}
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: Could not open libnvrtc at: this->libnvrtc == nullptr
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: compiler.load()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: rtk(kernelNumRunning)->loadDLL()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: status == CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: ptr->isSupported()
Warning: CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING; Reason: finalize_internal()
| 17:58:30 |
Ari Lotter | missing nvrtc i guess.. | 17:58:40 |
Ari Lotter | gonna compare against my weird 2.9.0-nightly setup i was using before.. | 18:00:37 |
Daniel Fahey | Would need a reprex I think with the smallest bit of your application code that can trigger this | 18:40:46 |
Daniel Fahey | For the package maintainer(s) to look into it | 18:41:02 |
Ari Lotter | yeah workin on it 😓 | 18:41:34 |
Ari Lotter | it's ~100k lines of code i gotta narrow down heh | 18:41:46 |
connor (burnt/out) (UTC-8) | SomeoneSerge (back on matrix), Gaétan Lepage: I added you both as collaborators on my fork of Nixpkgs. I had a death in the family so I’m going to be in and out. We need to get this PR fixed up and merged: https://github.com/NixOS/nixpkgs/pull/456510
We also need to get this PR merged: https://github.com/NixOS/nixpkgs/pull/459416
I likely won’t have time to do that over the next few days. Can you two do that for me? | 19:12:06 |
leona | take care <3 | 20:40:59 |
Daniel Fahey | Condolences | 21:24:01 |
Ari Lotter | yeah, givin up - jax dies every time :( | 21:34:25 |
Daniel Fahey | haha, well done for trying | 22:19:37 |
Gaétan Lepage | I'm so sorry Connor. Take your time. Focus on what matters the most for you right now (i.e. not CUDA).
I'll look at those two PRs. Please message me if I can do anything else. | 23:04:38 |
Ari Lotter | yeaaah, this is somethin' real broken with the torch 2.9.0 update :/
gonna see if i can figure it out, but it just doesn't seem that nvrtc is in the run path. looking for a way to repro it without my whole binary 😠 | 23:17:18 |
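A rough first check for the "Could not open libnvrtc" warnings above is to ask the dynamic loader directly whether it can open the library from the same Python environment. The sketch below is only an approximation: the sonames in `CANDIDATES` are assumptions (the version suffix depends on the CUDA release), and cuDNN resolves libnvrtc through its own run path, so a failure here is a hint rather than proof of what cuDNN sees.

```python
import ctypes


def can_dlopen(name: str) -> bool:
    """Return True if the dynamic loader can open `name`, else False."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is not found or cannot be loaded.
        return False
    return True


def report(candidates: list[str]) -> dict[str, bool]:
    """Map each candidate soname to whether it could be opened."""
    return {soname: can_dlopen(soname) for soname in candidates}


# Assumed sonames, for illustration only; adjust to the CUDA version in use.
CANDIDATES = ["libnvrtc.so", "libnvrtc.so.12", "libnvrtc.so.13"]

for soname, ok in report(CANDIDATES).items():
    print(f"{soname}: {'loadable' if ok else 'not loadable'}")
```

Running it once in a plain interpreter and once after `import torch` may help tell apart "not on the search path at all" from "only reachable once torch has loaded its own libraries".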
Gaétan Lepage | torch 2.9.1 is out. And triton 3.5.1. And both staging-next and staging-nixos were merged into master a few hours ago.
CPUs will have to work hard for the next few days... | 23:29:21 |
| 13 Nov 2025 |
SomeoneSerge (back on matrix) | They come not single spies... really sorry to hear this, Connor. Take care | 14:07:48 |
Ari Lotter | oh woof, but torch-bin is 2.9.1 and torch is still 2.9.0 | 15:10:16 |
Ari Lotter | aha, repro'd :D | 16:15:41 |
Ari Lotter | https://github.com/NixOS/nixpkgs/issues/461334
issue opened :) | 18:54:11 |
| 14 Nov 2025 |
hexa (UTC+1) | https://hydra.nixos-cuda.org/build/14219 magma runs into the output limit | 04:50:01 |
hexa (UTC+1) | and https://hydra.nixos-cuda.org/jobset/nixos-cuda/cuda-packages-v2#tabs-jobs has no torch package 🤔 | 04:50:51 |
Gaétan Lepage | I increased it from 4GB (what nix-community has I think) to 8GB. And it seems to still be broken... | 08:53:41 |
Gaétan Lepage | This is very weird. It ends up being built anyway as a dependency. I'll try to investigate... | 08:55:38 |