| 1 Nov 2025 |
SomeoneSerge (back on matrix) | Ok, but the onnxruntime part I didn't look at, don't know, and don't understand | 02:23:57 |
SomeoneSerge (back on matrix) | My best guess is it's because an extraneous propagatedBuildInputs = [ setupCudaHook ] also slipped into buildRedist.nix, but tbh I'm still struggling to explain how this leads to the extra reference | 02:51:56 |
SomeoneSerge (back on matrix) | Another person on gh saying they found a legit nvcc reference in libonnxruntime_providers_cuda, so I think this can't be explained by hooks alone | 02:56:09 |
connor (burnt/out) (UTC-8) | Is it possible onnxruntime does JIT compilation | 02:57:40 |
SomeoneSerge (back on matrix) | Yeah, that's what I'm wondering | 03:10:52 |
SomeoneSerge (back on matrix) | Nope | 03:12:13 |
SomeoneSerge (back on matrix) | More mysteries! https://github.com/NixOS/nixpkgs/pull/457424#issuecomment-3475537174 | 03:15:11 |
SomeoneSerge (back on matrix) | connor (burnt/out) (UTC-7): when you started propagating crt/host_config.h from cudart, did you also drop the manual nvcc from any leaf packages? Figure you might be quicker to answer than me reading or running tests | 03:58:51 |
connor (burnt/out) (UTC-8) | Manual NVCC? | 04:00:59 |
SomeoneSerge (back on matrix) | Like we had buildInputs = [ nvcc ] in a bunch of places because of the host_config.h dependency | 04:03:40 |
SomeoneSerge (back on matrix) | Which would no longer be needed if cudart propagates it | 04:03:57 |
connor (burnt/out) (UTC-8) | I don’t remember honestly
I think I also had it propagate because headers need to be in buildInputs to be discovered | 04:05:08 |
connor (burnt/out) (UTC-8) | I just looked through the commits for the CUDA 13 PR; I didn't see removal of nvcc from buildInputs anywhere | 04:13:19 |
SomeoneSerge (back on matrix) | Aight this may well be my favourite Halloween story so far, and still open-ended: https://github.com/NixOS/nixpkgs/pull/457424#issuecomment-3475742510 | 04:17:14 |
SomeoneSerge (back on matrix) | I can't explain this | 04:19:45 |
SomeoneSerge (back on matrix) | No way | 04:19:48 |
connor (burnt/out) (UTC-8) | can you imagine if it's non-deterministic | 04:24:09 |
connor (burnt/out) (UTC-8) | or only occurs if there are multiple capabilities | 04:24:16 |
connor (burnt/out) (UTC-8) | I mean I was able to reproduce with a single capability | 04:24:54 |
SomeoneSerge (back on matrix) | Yeah. If I were to try to explain it, I'd start checking if they e.g. try to copy CUDAToolkit_INCLUDE_DIRS or any other FindCUDAToolkit's targets' properties into their own targets | 04:31:09 |
connor (burnt/out) (UTC-8) | For what it's worth, I looked through the tree and here are the packages I found where cuda_nvcc ends up in buildInputs in some way. This list doesn't include the other ways it could end up in the closure: string interpolation, environment variables, cmakeFlags, etc.
- ./pkgs/by-name/ba/basalt-monado/package.nix
- ./pkgs/by-name/dl/dlib/package.nix
- ./pkgs/by-name/fr/frei0r/package.nix
- ./pkgs/by-name/gp/gpu-burn/package.nix
- ./pkgs/by-name/ka/katago/package.nix
- ./pkgs/by-name/ko/koboldcpp/package.nix
- ./pkgs/by-name/ma/mathematica/generic.nix
- ./pkgs/by-name/mf/mfaktc/package.nix
- ./pkgs/by-name/mo/monado/package.nix
- ./pkgs/by-name/xm/xmrig-cuda/package.nix
- ./pkgs/by-name/xm/xmrig-cuda-mo/package.nix
- ./pkgs/development/cuda-modules/packages/cuda_cudart.nix
- ./pkgs/development/cuda-modules/packages/nccl.nix
- ./pkgs/development/libraries/ffmpeg/generic.nix
- ./pkgs/development/python-modules/causal-conv1d/default.nix
- ./pkgs/development/python-modules/cupy/default.nix
- ./pkgs/development/python-modules/jaxlib/default.nix
- ./pkgs/development/python-modules/lightgbm/default.nix
- ./pkgs/development/python-modules/mamba-ssm/default.nix
- ./pkgs/development/python-modules/tensorflow/default.nix
- ./pkgs/development/python-modules/torch/source/default.nix
- ./pkgs/development/python-modules/warp-lang/default.nix
| 04:32:50 |
| @eveeifyeve:matrix.org joined the room. | 04:47:07 |
@eveeifyeve:matrix.org | A good question: does cuda_cccl include thrust/context.h? On Hydra, the pytorch build is getting a lot of errors saying it's missing. https://hydra.nixos.org/build/311450648/log | 04:47:31 |
@eveeifyeve:matrix.org | I am trying to build pytorch right now. | 04:47:44 |
SomeoneSerge (back on matrix) | $ nix-build '<nixpkgs>' --arg config '{allowUnfree = true;}' -A cudaPackages.cuda_cccl.dev
$ readlink result-dev
/nix/store/yvvs83nys6i78fq1p5dgliqlhlk2svq0-cuda_cccl-12.8.90-dev
$ ls result-dev/include/thrust/context.h
ls: cannot access 'result-dev/include/thrust/context.h': No such file or directory
| 05:37:53 |
@eveeifyeve:matrix.org | This affects building pytorch. | 05:39:49 |
@eveeifyeve:matrix.org | * This affects building pytorch, which will be a major blocker for zero Hydra failures for 25.11. | 05:41:36 |
SomeoneSerge (back on matrix) | Perhaps Gaétan Lepage's looked into this? | 05:46:57 |
SomeoneSerge (back on matrix) | In any case, see you tomorrow | 05:47:21 |
@eveeifyeve:matrix.org | I might try to debug this; I'm happy to. | 05:53:52 |