| 28 May 2026 |
Gaétan Lepage | I warned you ;) | 11:44:54 |
Gaétan Lepage | Haha nice! Feel free to ping me for review or if you need some help | 11:45:58 |
BerriJ | It's still building, should I let it run, despite that error message?
Now it's using 1 CPU core, but I'm not constantly watching it; I just log in occasionally to see if it's finished. I can leave it running if it's of any use to you, no worries :) | 11:59:55 |
Gaétan Lepage | If you don't mind, I'd be curious to see the final log, so yes. | 13:02:55 |
BerriJ | Here you go :)
error: Cannot build '/nix/store/dgxvh459r8lqlcpli60vjdahqm03ybn6-python3.13-flash-attention-2.8.3.drv'.
Reason: builder failed with exit code 1.
Output paths:
/nix/store/13nyvan7jp2abknfr56plx3fyn89zp7f-python3.13-flash-attention-2.8.3
/nix/store/f0z9q35rz8xh68q6qcnfiz6mjla8jdzl-python3.13-flash-attention-2.8.3-dist
Last 25 log lines:
> Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
> FAILED tests/test_rotary.py::test_rotary_emb_varlen_func[True-True-0.5-int-dtype1] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
> Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
> FAILED tests/test_rotary.py::test_rotary_emb_varlen_func[True-True-0.5-Tensor-dtype0] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
> Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
> FAILED tests/test_rotary.py::test_rotary_emb_varlen_func[True-True-0.5-Tensor-dtype1] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
> Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
> FAILED tests/test_rotary.py::test_compilation_count - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
> Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
> CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
> For debugging consider passing CUDA_LAUNCH_BLOCKING=1
> Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
> = 197981 failed, 155167 passed, 213305 skipped, 19 warnings in 72020.06s (20:00:20) =
For full logs, run:
nix log /nix/store/dgxvh459r8lqlcpli60vjdahqm03ybn6-python3.13-flash-attention-2.8.3.drv
The full log is 1.9 GB; I can also share it with you if it's useful
| 15:28:58 |
Gaétan Lepage | Wow. Interesting. Yes, I would be curious to browse the full log if you don't mind sending it. | 16:05:55 |
BerriJ | I'll upload it tomorrow, I just turned off the PC 😇 | 16:11:09 |
Gaétan Lepage | I realized both onnxruntime and opencv fail to build with cudaPackages_13_2. | 19:44:38 |
Robbie Buxton | (In reply to Gaétan Lepage: "I realized both onnxruntime and opencv fail to build with cudaPackages_13_2.") I think the onnxruntime tests are just bad | 19:52:04 |
Robbie Buxton | Unless you’re hitting a different issue | 19:52:21 |
Gaétan Lepage | No, it's a compilation error:
/nix/store/9bvirxizyiq3pg5rmgpz2f6aw6y7c1fm-cuda13.2-cuda_cccl-13.2.27/include/cub/device/device_transform.cuh:44:111: error: global qualification of class name is invalid before ':' token
44 | struct ::cuda::proclaims_copyable_arguments<CUB_NS_QUALIFIER::detail::__return_constant<T>> : ::cuda::std::true_type
| ^
/nix/store/9bvirxizyiq3pg5rmgpz2f6aw6y7c1fm-cuda13.2-cuda_cccl-13.2.27/include/cub/device/device_transform.cuh:44:111: error: expected '{' before ':' token
make[2]: *** [CMakeFiles/onnxruntime_providers_cuda.dir/build.make:8382: CMakeFiles/onnxruntime_providers_cuda.dir/build/source/onnxruntime/contrib_ops/cuda/math/bias_softmax_impl.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /nix/store/54yzskzwmapls2awvih149wyzci717gw-onnx-src/onnx/version_converter/convert.cc:5:
/nix/store/54yzskzwmapls2awvih149wyzci717gw-onnx-src/onnx/version_converter/convert.h: In constructor 'onnx::version_conversion::DefaultVersionConverter::DefaultVersionConverter()':
/nix/store/54yzskzwmapls2awvih149wyzci717gw-onnx-src/onnx/version_converter/convert.h:110:3: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
110 | DefaultVersionConverter() {
| ^~~~~~~~~~~~~~~~~~~~~~~
/nix/store/9bvirxizyiq3pg5rmgpz2f6aw6y7c1fm-cuda13.2-cuda_cccl-13.2.27/include/cub/device/device_transform.cuh:44:111: error: global qualification of class name is invalid before ':' token
44 | struct ::cuda::proclaims_copyable_arguments<CUB_NS_QUALIFIER::detail::__return_constant<T>> : ::cuda::std::true_type
| ^
/nix/store/9bvirxizyiq3pg5rmgpz2f6aw6y7c1fm-cuda13.2-cuda_cccl-13.2.27/include/cub/device/device_transform.cuh:44:111: error: expected '{' before ':' token
make[2]: *** [CMakeFiles/onnxruntime_providers_cuda.dir/build.make:8562: CMakeFiles/onnxruntime_providers_cuda.dir/build/source/onnxruntime/contrib_ops/cuda/moe/ft_moe/moe_kernel.cu.o] Error 1
| 19:57:26 |
Robbie Buxton | Oh, I think I fixed this actually, one sec | 20:01:17 |
Robbie Buxton | Try adding opencv/opencv_contrib pull request 4097 as a patch | 20:03:22 |
Robbie Buxton | Sorry for not sending a link, I'm on mobile | 20:03:32 |
Gaétan Lepage | Thanks so much. | 20:16:00 |
Robbie Buxton | I didn't need to patch onnxruntime though, just disable the tests | 20:17:09 |
Robbie Buxton | I was running 1.24.4, which is in master | 20:17:25 |
Gaétan Lepage | It does not apply on 4.13.0 though... modules/cudev/include/opencv2/cudev/ptr2d/zip.hpp does not exist | 20:23:01 |
Robbie Buxton | Weird, it applies for me | 20:26:07 |
Gaétan Lepage | This one?
https://github.com/opencv/opencv_contrib/commit/f2854f4f5e7b67d4e073ea002ae0174d437e2962 | 20:27:11 |
Robbie Buxton | Yup. In an overlay, we add it to the patch list if CUDA support is enabled | 20:28:04 |
Robbie Buxton | Using fetchpatch with stripLen = 2 and extraPrefix = "opencv_contrib/" | 20:29:08 |
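[Editor's sketch] The overlay described in the messages above might look roughly like the following. This is a minimal sketch under stated assumptions, not the overlay actually used in the conversation: the patch `name` is illustrative, the `.patch` URL is derived from the commit link quoted earlier in the thread, `prev.config.cudaSupport` is assumed to be the relevant CUDA toggle, and the hash is a placeholder that Nix will ask you to replace on the first build.

```nix
# Hypothetical overlay: append the opencv_contrib fix to opencv's patches
# only when CUDA support is enabled.
final: prev: {
  opencv = prev.opencv.overrideAttrs (old: {
    patches =
      (old.patches or [ ])
      # Assumption: the global nixpkgs `cudaSupport` option is what gates CUDA here.
      ++ prev.lib.optionals prev.config.cudaSupport [
        (prev.fetchpatch {
          # Illustrative name, not taken from the conversation.
          name = "opencv-contrib-cuda-13.2-fix.patch";
          # GitHub serves any commit as a patch when ".patch" is appended.
          url = "https://github.com/opencv/opencv_contrib/commit/f2854f4f5e7b67d4e073ea002ae0174d437e2962.patch";
          # Drop the two leading path components (e.g. "a/modules") ...
          stripLen = 2;
          # ... and re-root the remaining paths under opencv_contrib/.
          extraPrefix = "opencv_contrib/";
          # Placeholder; the first build reports the real hash.
          hash = prev.lib.fakeHash;
        })
      ];
  });
}
```

The path rewriting is presumably needed because the patch is written against the opencv_contrib repository (paths under `modules/...`), while the opencv derivation applies patches from the opencv source root, where the contrib modules appear to live under `opencv_contrib/`.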
Gaétan Lepage | Indeed. You rock! | 20:31:40 |
Gaétan Lepage | https://github.com/NixOS/nixpkgs/pull/525342 | 20:38:41 |
Gaétan Lepage | Fixes onnxruntime too: https://github.com/NixOS/nixpkgs/pull/525369 | 22:11:56 |
| 29 May 2026 |
BerriJ | Here you go: https://uni-duisburg-essen.sciebo.de/s/r44i7DSJJWBoBag
The password is: 1234 | 04:11:59 |
Gaétan Lepage | Here's a good one 🥲🐸
https://github.com/NixOS/nixpkgs/pull/525504
cc Kevin Mittman (UTC-7)
(Maybe I'm just holding it wrong and I badly packaged apex) | 09:39:24 |
| 30 May 2026 |
Kevin Mittman (UTC-7) | Yeah, I wish it was a consistent interface; I have run into similar issues elsewhere | 02:13:22 |