| 30 Sep 2025 |
Winter | 2025-09-30 15:11:37.818161: W external/local_xla/xla/service/gpu/llvm_gpu_backend/default/nvptx_libdevice_path.cc:40] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
...
/usr/local/cuda
/opt/cuda
/nix/store/d2b95k4ysi7822hnxq72np5vvfq7wbbp-python3.12-tensorflow-gpu-2.19.0/lib/python3.12/site-packages/tensorflow/../nvidia/cuda_nvcc
/nix/store/d2b95k4ysi7822hnxq72np5vvfq7wbbp-python3.12-tensorflow-gpu-2.19.0/lib/python3.12/site-packages/tensorflow/../../nvidia/cuda_nvcc
/nix/store/d2b95k4ysi7822hnxq72np5vvfq7wbbp-python3.12-tensorflow-gpu-2.19.0/lib/python3.12/site-packages/tensorflow/cuda
/nix/store/d2b95k4ysi7822hnxq72np5vvfq7wbbp-python3.12-tensorflow-gpu-2.19.0/lib/python3.12/site-packages/tensorflow/../../../..
/nix/store/d2b95k4ysi7822hnxq72np5vvfq7wbbp-python3.12-tensorflow-gpu-2.19.0/lib/python3.12/site-packages/tensorflow/../../../../..
anyone ever see smth like this before? | 19:14:29 |
Winter | grepped around existing issues/prs a bit but no dice | 19:14:57 |
| 1 Oct 2025 |
connor (burnt/out) (UTC-8) | nvvm is a subdirectory of cuda_nvcc pre-CUDA 13.0; I don’t remember which output it’s in though. Seems like the error is mostly about being unable to find that. | 01:03:41 |
SomeoneSerge (back on matrix) | I think we put some config next to bin/nvcc that points at the correct libdevice location? Used to be in the overrides | 15:46:38 |
Kevin Mittman (UTC-7) | In CUDA 13, there was a split from cuda_nvcc to cuda_crt, libnvvm, and libnvptxcompiler components | 15:47:11 |
Winter | this is just using pythonPackages.tensorflow w/ config.cudaSupport on 25.05 -- so that's CUDA 12, right? | 19:12:26 |
Winter | dunno why this would be sad then | 19:12:38 |
SomeoneSerge (back on matrix) | At what stage do you get that error? | 19:20:29 |
Winter | runtime, after i've successfully imported tf -- i then get some JIT compilation error | 20:15:31 |
Winter | seems like XLA_FLAGS may be my friend here | 20:16:43 |
Winter | ok yeah setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/nix/store/y16bl5h9nxdbyfs922x4bz9lkk51kx1d-cuda_nvcc-12.8.93 fixed it! | 20:20:09 |
Winter | i think this is just a simple fix in tensorflow-bin | 20:20:19 |
Winter | i don’t particularly want to build tf-bin again but i’m going to get a PR up to fix this :-) | 21:57:28 |
Winter | i assume we have the concept of a check for whatever cuda version is in use — connor (he/him) (UTC-7) do you want a check for <13 before 13 is actually in, as a reminder of sorts to test this out with 13? (still need an MRE tho) | 21:58:24 |
Winter | for clarity this is some JIT’d XLA stuff | 21:59:17 |
connor (burnt/out) (UTC-8) | IIRC on 13 I symlink it so it should still be available in NVCC’s bin output | 22:38:35 |
Winter | awesome | 22:39:52 |
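A minimal Python sketch of the XLA_FLAGS workaround Winter reported above, for anyone who would rather set it from inside a script than in the shell. The helper names `has_libdevice` and `with_cuda_data_dir` are hypothetical, the store path is the one from Winter's message and will differ on other machines, and setting the variable before importing TensorFlow is a conservative assumption rather than something confirmed in this thread.

```python
import os
from pathlib import Path


def has_libdevice(cuda_dir):
    """Return True if cuda_dir contains nvvm/libdevice, the layout named in the XLA warning."""
    return (Path(cuda_dir) / "nvvm" / "libdevice").is_dir()


def with_cuda_data_dir(cuda_dir, existing=""):
    """Return an XLA_FLAGS value pointing --xla_gpu_cuda_data_dir at cuda_dir.

    Other flags already present in `existing` are kept; an older
    --xla_gpu_cuda_data_dir entry is replaced rather than duplicated.
    """
    prefix = "--xla_gpu_cuda_data_dir="
    kept = [flag for flag in existing.split() if not flag.startswith(prefix)]
    return " ".join(kept + [prefix + str(cuda_dir)])


# Path taken from the chat above; it is machine-specific (assumption: yours differs).
cuda_dir = "/nix/store/y16bl5h9nxdbyfs922x4bz9lkk51kx1d-cuda_nvcc-12.8.93"

if not has_libdevice(cuda_dir):
    # Not fatal here: this only reports that the expected layout was not found.
    print(f"note: {cuda_dir}/nvvm/libdevice not found on this machine")

os.environ["XLA_FLAGS"] = with_cuda_data_dir(cuda_dir, os.environ.get("XLA_FLAGS", ""))

# Import TensorFlow only after XLA_FLAGS is set (left commented so the sketch
# runs without TensorFlow installed).
# import tensorflow as tf
```

The `has_libdevice` check mirrors connor's remark that `nvvm` lives under `cuda_nvcc` before CUDA 13.0; it is only a sanity check and does not change what XLA searches.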
| 2 Oct 2025 |
Daniel Fahey | Hello there. Anyone got any spare time (and compute, lol) to help me upgrade vLLM in Nixpkgs? https://github.com/NixOS/nixpkgs/pull/447722 | 09:46:34 |
Gaétan Lepage | Sure! | 12:22:51 |
Gaétan Lepage | I pushed to the branch and fixed the CUDA build (hopefully) | 12:37:39 |
Gaétan Lepage | Well, unfortunately:
/nix/store/ycslb9zwv05yjp79ysvbwflwa9s9ffa8-source/include/cutlass/epilogue/collective/detail.hpp(719): error: no instance of function template "cutlass::epilogue::collective::detail::Sm100TmaWarpSpecializedAdapter<EpilogueOp>::operator()
| 12:41:39 |
Daniel Fahey | Legend! Beware I just force pushed | 12:56:59 |
Daniel Fahey | (to fix a typo in the commit message) | 12:57:07 |
Gaétan Lepage | Pushed new fixes. I initially had the wrong version of cutlass | 12:57:12 |
Daniel Fahey | woops, could you add your co-authorship again please | 12:58:19 |
Daniel Fahey | lost it when I did a git commit --amend | 12:58:37 |
Gaétan Lepage | I might have force-pushed right after you | 13:04:07 |