| 24 Sep 2025 |
Robbie Buxton | How are you providing your cuda kernel libraries, are you on NixOS or a different distribution? | 20:15:01 |
Robbie Buxton | I.e where are you getting libcuda.so from? | 20:15:42 |
apyh | I'm in a docker container 😅 | 21:17:32 |
apyh | so i just point to /lib64/libcuda.so | 21:17:44 |
Robbie Buxton | Nix expects it in /run/opengl-driver/lib | 21:18:49 |
apyh | ah yeah I use nix-gl-host | 21:20:30 |
apyh | for all that | 21:20:31 |
Robbie Buxton | I’m not sure what the recommended way of doing that is these days but I symlink in all the required libraries to that path | 21:20:32 |
Robbie Buxton | I’m confused why triton is struggling to find cuda tho | 21:21:01 |
apyh | ok cool yeah then if you PR the missing gcc I'll see what else is missing in my setup :) | 21:21:09 |
apyh | In reply to @sporeray:matrix.org: "I’m confused why triton is struggling to find cuda tho" triton runs /sbin/ldconfig to find it | 21:21:32 |
apyh | which doesn't exist under nixos / nix built docker images | 21:21:47 |
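The lookup problem described above can be illustrated with a small sketch. This is not Triton's actual code: `find_libcuda_dirs` and `DEFAULT_DIRS` are hypothetical names, and the only paths used are the two mentioned in the chat (`/run/opengl-driver/lib` and `/lib64`). It shows one way to locate `libcuda.so` without calling `/sbin/ldconfig`, by checking `LD_LIBRARY_PATH` entries and a list of known directories.

```python
import os
from pathlib import Path

# Hypothetical default search list; both paths are the ones mentioned in the chat.
DEFAULT_DIRS = ("/run/opengl-driver/lib", "/lib64")


def find_libcuda_dirs(extra_dirs=DEFAULT_DIRS, ld_library_path=None):
    """Return the directories that contain a libcuda.so* file, in search order.

    LD_LIBRARY_PATH entries are checked first, then extra_dirs.
    No external binary such as ldconfig is invoked.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")

    # Drop empty entries produced by leading/trailing/double separators.
    candidates = [d for d in ld_library_path.split(os.pathsep) if d]
    candidates += list(extra_dirs)

    found = []
    for d in candidates:
        p = Path(d)
        if p.is_dir() and any(p.glob("libcuda.so*")) and d not in found:
            found.append(d)
    return found


if __name__ == "__main__":
    # An empty list means no driver library was found in the searched places.
    print(find_libcuda_dirs())
```

Running this inside the container gives a quick check of whether the driver library is visible at any of the searched locations before debugging `torch.compile` itself.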
Robbie Buxton | I run triton on non NixOS and haven’t had issues | 21:22:07 |
Robbie Buxton | Do you have a minimum repro or is it literally failing on import | 21:22:22 |
apyh | like, calling torch.compile() fails | 21:22:49 |
apyh | sec | 21:22:50 |
Robbie Buxton | Maybe we should start a private chat to avoid spamming | 21:23:17 |
SomeoneSerge (back on matrix) | I presume it's the -bin version of tensorflow, which means it's patchelfed to use whatever versions of things we've got and inshallah it happens to work? | 22:43:11 |
SomeoneSerge (back on matrix) | Didn't we recently change triton so it doesn't retain the reference to gcc? | 22:44:50 |
SomeoneSerge (back on matrix) | all of this should be patched out, maybe it slipped back in during updates | 22:46:17 |
Robbie Buxton | In reply to @ss:someonex.net: "Didn't we recently change triton so it doesn't retain the reference to gcc?" This is a different flow that goes via _inductor/config.py, I’ll put the PR in tomorrow once torch finishes building | 22:47:41 |
| 25 Sep 2025 |
Winter | this is python3Packages.tensorflow | 14:51:03 |
Winter | but it appears that the _pywrap_... it's looking at is different because I have some stuff in LD_LIBRARY_PATH lol | 14:51:49 |
Winter | i should dig into it at some point | 14:52:06 |
Winter | out of curiosity/insanity? (both?) | 14:52:23 |
connor (he/him) | To clarify, are you building PyTorch and Triton with Nix and producing a container, or is this something else? I’d also like to see a reproducer if you have one handy | 15:07:20 |
connor (he/him) | here be the path to (more) madness | 15:08:12 |
apyh | In reply to @connorbaker:matrix.org: "To clarify, are you building PyTorch and Triton with Nix and producing a container, or is this something else? I’d also like to see a reproducer if you have one handy" I'm building a container that includes PyTorch as a runtime dependency, and that PyTorch code calls torch.compile. Let me try to create a minimal repro :) | 15:30:19 |