| 1 Nov 2025 |
@eveeifyeve:matrix.org | It's complex.h sorry.. | 06:00:31 |
@eveeifyeve:matrix.org | It's there, but idk, something weird about the pytorch package. | 06:01:30 |
@eveeifyeve:matrix.org | https://github.com/NixOS/nixpkgs/issues/457456 | 06:02:36 |
Gaétan Lepage | In reply to @eveeifyeve:matrix.org: "A good question: does cuda_cccl include thrust/context.h? On hydra, pytorch is getting a lot of errors saying it's missing. https://hydra.nixos.org/build/311450648/log"
I am aware that torchWithRocm is broken. I mentioned it to @lt1379:matrix.org a few days ago. | 09:02:40 |
Gaétan Lepage | The regular pytorch builds fine AFAIK | 09:03:36 |
| leona joined the room. | 09:15:42 |
Lun | Tracked it down to USE_FBGEMM_GENAI doing some hacky things with hip flags, will have a PR up soon if that was the only build issue.
(Just turning it off, it's some specialized quantized kernels for one ISA only right now so really doesn't seem worth trying to fix the jank) | 18:55:34 |
| @eveeifyeve:matrix.org left the room. | 23:19:24 |
| 2 Nov 2025 |
Gaétan Lepage | RE cuda_nvcc leaking into nccl:
As a sidenote, I realized that one could remove cuda_nvcc from nativeBuildInputs and (getInclude cuda_nvcc) from buildInputs without breaking nccl's build.
This most probably works because of the makeFlags.
Unfortunately, this does not help with the leakage. | 10:58:02 |
| felix joined the room. | 14:30:39 |
connor (burnt/out) (UTC-8) | Gaétan Lepage, is https://github.com/NixOS/nixpkgs/pull/457803 ready to merge? I’ll approve and merge if so. | 15:15:24 |
Gaétan Lepage | It fixes the leak for nccl, but firefox gets gcc-wrapper from onnxruntime too. | 16:42:16 |
Gaétan Lepage | I'm about to push a commit that handles that too. I'm compiling rn. | 16:42:28 |
Gaétan Lepage | Rebuilt onnxruntime. It now doesn't depend on cuda_nvcc at runtime.
I'm now rebuilding firefox which should not have cuda_nvcc in its closure anymore. | 17:07:36 |
Gaétan Lepage | 😭 cudaPackages.cuda_cudart depends on cudaPackages.cuda_nvcc at runtime too!!!
Not because of a path leak in the binary this time, just because nvcc is in cudart's propagatedBuildInputs (I think?)
❯ nix why-depends --precise $(nom-build --arg config '{ allowUnfree = true; cudaSupport = true; }' -A firefox-unwrapped) $(nom-build --arg config '{ allowUnfree = true; cudaSupport = true; }' -A cudaPackages.cuda_nvcc)
Finished at 18:16:53 after 1s
Finished at 18:16:53 after 0s
/nix/store/yy1z5y3iql9r4kpslxnjdwcygx52ssl8-firefox-unwrapped-144.0.2
└───lib/firefox/libonnxruntime.so: …st be specified....../nix/store/jk4a7v44fc83ykc15b31r4m21yqc92sp-onnxruntime-1.22.2/lib/.....onn…
    → /nix/store/jk4a7v44fc83ykc15b31r4m21yqc92sp-onnxruntime-1.22.2
    └───lib/libonnxruntime_providers_cuda.so: …nn-9.13.0.50-lib/lib:/nix/store/80x699lyc99dahf85iqdv6z1f0vv6vz2-cuda12.8-cuda_cudart-12.8.90/li…
        → /nix/store/80x699lyc99dahf85iqdv6z1f0vv6vz2-cuda12.8-cuda_cudart-12.8.90
        └───nix-support/propagated-build-inputs: …fhjm-setup-cuda-hook /nix/store/ygd3s9zm1pf77n3q3ac63v58www5scbc-cuda12.8-cuda_nvcc-12.8.93 /nix…
            → /nix/store/ygd3s9zm1pf77n3q3ac63v58www5scbc-cuda12.8-cuda_nvcc-12.8.93
| 18:19:31 |
Gaétan Lepage | Actually, rebasing my PR on top of [SomeoneSerge (back on matrix)'s](https://github.com/NixOS/nixpkgs/pull/457424) worked! | 20:15:56 |
| 3 Nov 2025 |
connor (burnt/out) (UTC-8) | Are they good to go or do they need more testing? | 00:25:39 |
Gaétan Lepage | As far as I can tell, they are both good to go.
Let's wait for SomeoneSerge (back on matrix)'s ACK just to be sure. | 00:26:12 |
connor (burnt/out) (UTC-8) | Thank you both for working on that | 00:26:26 |
Gaétan Lepage | But I confirm that firefox builds fine (no gcc-wrapper triggering disallowedRequisites) with both PRs applied. | 00:26:58 |
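For context on the check mentioned above: `disallowedRequisites` is a derivation attribute that makes a build fail if any listed store path ends up anywhere in the output's runtime closure. A minimal sketch, assuming a plain `<nixpkgs>` import and a throwaway derivation (the name `no-cc-in-closure` is hypothetical and this is not firefox's actual expression):

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "no-cc-in-closure"
  {
    # Fail the build if the compiler wrapper is reachable from $out,
    # either directly or through any runtime dependency.
    disallowedRequisites = [ pkgs.stdenv.cc ];
  }
  ''
    mkdir -p "$out"
    echo "hello" > "$out/greeting"
  ''
```

A store path leaked into a shared library (as with nccl and onnxruntime here) counts as a runtime reference, which is why a dependency several levels down can trip this check in firefox.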
Daniel Fahey | CUDA refactor victim fix https://github.com/NixOS/nixpkgs/pull/457870 ready to merge | 13:09:11 |
Ari Lotter | is this a horrible idea, if i need cuda support and don't want to wait hours for builds? :)
(final: prev: {
  python312Packages = prev.python312Packages.override {
    overrides = pyfinal: pyprev: {
      torch = pyfinal.torch-bin;
    };
  };
})
| 21:28:33 |
Gaétan Lepage | RE {cudaPackages.nccl, onnxruntime}: remove reference to nvcc in binary:
We need to patch both nccl's libnccl.so and onnxruntime's libonnxruntime_providers_cuda.so for the fix to actually work. | 23:10:06 |
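One common nixpkgs technique for dropping an unwanted store path from a built binary is `remove-references-to`. The overlay below is only a sketch of that technique, not necessarily what the PR does. It assumes that both libraries are installed under `$out/lib`, that the leaked path is the default output of `cudaPackages.cuda_nvcc`, and that onnxruntime is built against the default `cudaPackages` set. The helper name `scrubNvcc` is hypothetical.

```nix
# Hypothetical overlay; the actual fix lives in the nixpkgs PR and may differ.
final: prev:
let
  # Hypothetical helper: rewrite nvcc's store hash in every regular file
  # under $out/lib whose name matches `pattern`, so Nix no longer records
  # nvcc as a runtime reference of that output.
  scrubNvcc = pattern: pkg:
    pkg.overrideAttrs (old: {
      nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [ final.removeReferencesTo ];
      postFixup = (old.postFixup or "") + ''
        find "$out/lib" -type f -name "${pattern}" \
          -exec remove-references-to -t ${final.cudaPackages.cuda_nvcc} {} \;
      '';
    });
in
{
  onnxruntime = scrubNvcc "libonnxruntime_providers_cuda.so*" prev.onnxruntime;

  cudaPackages = prev.cudaPackages.overrideScope (cfinal: cprev: {
    nccl = scrubNvcc "libnccl.so*" cprev.nccl;
  });
}
```

After rebuilding, `nix why-depends` (as used earlier in this log) is a reasonable way to confirm that the reference is gone.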
| 4 Nov 2025 |
connor (burnt/out) (UTC-8) | should be fine, but I'd always recommend using pythonPackagesExtensions since it's a little nicer to use | 06:38:57 |
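A minimal sketch of the `pythonPackagesExtensions` variant of the overlay asked about above, assuming the goal is still to substitute the prebuilt `torch-bin` for the source-built `torch`:

```nix
final: prev: {
  # Applied to every Python package set (python311Packages, python312Packages, ...),
  # so the override does not have to be repeated per interpreter version.
  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (pyfinal: pyprev: {
      torch = pyfinal.torch-bin;
    })
  ];
}
```

Compared with overriding `python312Packages` directly, this composes with other overlays that also extend the Python package sets, since each one appends to the list instead of replacing the set.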
SomeoneSerge (back on matrix) | I still have no explanation for why we cannot seem to reproduce the nvcc reference with saxpy | 15:07:47 |