| 1 Nov 2025 |
connor (burnt/out) (UTC-8) | I just looked through the commits for the CUDA 13 PR; I didn't see removal of nvcc from buildInputs anywhere | 04:13:19 |
SomeoneSerge (back on matrix) | Aight this may well be my favourite Halloween story so far, and still open-ended: https://github.com/NixOS/nixpkgs/pull/457424#issuecomment-3475742510 | 04:17:14 |
SomeoneSerge (back on matrix) | I can't explain this | 04:19:45 |
SomeoneSerge (back on matrix) | No way | 04:19:48 |
connor (burnt/out) (UTC-8) | can you imagine if it's non-deterministic | 04:24:09 |
connor (burnt/out) (UTC-8) | or only occurs if there are multiple capabilities | 04:24:16 |
connor (burnt/out) (UTC-8) | I mean I was able to reproduce with a single capability | 04:24:54 |
SomeoneSerge (back on matrix) | Yeah. If I were to try to explain it, I'd start checking if they e.g. try to copy CUDAToolkit_INCLUDE_DIRS or any other FindCUDAToolkit's targets' properties into their own targets | 04:31:09 |
connor (burnt/out) (UTC-8) | For what it's worth, I looked through the tree and here are the packages I found where cuda_nvcc ends up in buildInputs in some way. This list doesn't include other ways it could end up in the closure -- string interpolation, environment variables, cmakeFlags, etc.
- ./pkgs/by-name/ba/basalt-monado/package.nix
- ./pkgs/by-name/dl/dlib/package.nix
- ./pkgs/by-name/fr/frei0r/package.nix
- ./pkgs/by-name/gp/gpu-burn/package.nix
- ./pkgs/by-name/ka/katago/package.nix
- ./pkgs/by-name/ko/koboldcpp/package.nix
- ./pkgs/by-name/ma/mathematica/generic.nix
- ./pkgs/by-name/mf/mfaktc/package.nix
- ./pkgs/by-name/mo/monado/package.nix
- ./pkgs/by-name/xm/xmrig-cuda/package.nix
- ./pkgs/by-name/xm/xmrig-cuda-mo/package.nix
- ./pkgs/development/cuda-modules/packages/cuda_cudart.nix
- ./pkgs/development/cuda-modules/packages/nccl.nix
- ./pkgs/development/libraries/ffmpeg/generic.nix
- ./pkgs/development/python-modules/causal-conv1d/default.nix
- ./pkgs/development/python-modules/cupy/default.nix
- ./pkgs/development/python-modules/jaxlib/default.nix
- ./pkgs/development/python-modules/lightgbm/default.nix
- ./pkgs/development/python-modules/mamba-ssm/default.nix
- ./pkgs/development/python-modules/tensorflow/default.nix
- ./pkgs/development/python-modules/torch/source/default.nix
- ./pkgs/development/python-modules/warp-lang/default.nix
| 04:32:50 |
| @eveeifyeve:matrix.org joined the room. | 04:47:07 |
@eveeifyeve:matrix.org | A good question: does cuda_cccl include thrust/context.h? On hydra, the pytorch build is getting a lot of errors saying it's missing. https://hydra.nixos.org/build/311450648/log | 04:47:31 |
@eveeifyeve:matrix.org | I am trying to build pytorch right now. | 04:47:44 |
SomeoneSerge (back on matrix) | $ nix-build '<nixpkgs>' --arg config '{allowUnfree = true;}' -A cudaPackages.cuda_cccl.dev
$ readlink result-dev
/nix/store/yvvs83nys6i78fq1p5dgliqlhlk2svq0-cuda_cccl-12.8.90-dev
$ ls result-dev/include/thrust/context.h
ls: cannot access 'result-dev/include/thrust/context.h': No such file or directory
| 05:37:53 |
@eveeifyeve:matrix.org | This affects building pytorch. | 05:39:49 |
@eveeifyeve:matrix.org | * This affects building pytorch, which will be a major blocker for zero hydra failures for 25.11. | 05:41:36 |
SomeoneSerge (back on matrix) | Perhaps Gaétan Lepage's looked into this? | 05:46:57 |
SomeoneSerge (back on matrix) | In any case, see you tomorrow | 05:47:21 |
@eveeifyeve:matrix.org | I might try to debug this. Because I am happy to. | 05:53:52 |
@eveeifyeve:matrix.org | It's complex.h, sorry. | 06:00:31 |
@eveeifyeve:matrix.org | It's there, but idk, something weird about the pytorch package. | 06:01:30 |
@eveeifyeve:matrix.org | https://github.com/NixOS/nixpkgs/issues/457456 | 06:02:36 |
Gaétan Lepage | In reply to @eveeifyeve:matrix.org:
"A good question: does cuda_cccl include thrust/context.h? On hydra, the pytorch build is getting a lot of errors saying it's missing. https://hydra.nixos.org/build/311450648/log"
I am aware that torchWithRocm is broken. I mentioned it to @lt1379:matrix.org a few days ago. | 09:02:40 |
Gaétan Lepage | The regular pytorch builds fine AFAIK | 09:03:36 |
| leona joined the room. | 09:15:42 |
Lun | Tracked it down to USE_FBGEMM_GENAI doing some hacky things with hip flags, will have a PR up soon if that was the only build issue.
(Just turning it off, it's some specialized quantized kernels for one ISA only right now so really doesn't seem worth trying to fix the jank) | 18:55:34 |
| @eveeifyeve:matrix.org left the room. | 23:19:24 |
| 2 Nov 2025 |
| connor (burnt/out) (UTC-8) changed their display name from connor (burnt/out) (UTC-7) to connor (burnt/out) (UTC-8). | 08:13:06 |
Gaétan Lepage | RE cuda_nvcc leaking into nccl:
As a side note, I realized that one could remove cuda_nvcc from nativeBuildInputs and (getInclude cuda_nvcc) from buildInputs without breaking nccl's build.
This most probably works because of the makeFlags.
Unfortunately, this does not help with the leakage. | 10:58:02 |