| 24 Sep 2025 |
Winter | disregard | 16:06:29 |
Winter | computers are downright evil | 16:06:34 |
Winter | (the library it's pointing to isn't actually the one it's loading!) | 16:06:50 |
apyh | is there a server for pytorch stuff specifically, or is this as close as it gets? really struggling to get torch.compile working :/ | 16:38:53 |
Robbie Buxton | What error are you running into apyh? | 16:42:44 |
Duncan Gammie | apyh: you'll probably get the fastest answer to that if you post specific error messages here: https://discuss.pytorch.org/c/compile/41 | 18:30:20 |
apyh | In reply to @sporeray:matrix.org ("What error are you running into apyh?"): well, torch's .compile functionality requires a bunch of stuff that isn't provided in its nix derivation - it needs gcc at runtime, it reads an /etc/passwd file to pick a cache directory, etc - so it doesn't work out of the box through its nixpkgs stuff | 18:50:26 |
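The two gaps described above (a compiler at runtime and a cache directory chosen from the user name) can often be worked around with environment variables set before `import torch`. This is a minimal sketch, not the nixpkgs fix. It assumes that Triton honours `CC`, that Inductor honours `CXX`, and that `TORCHINDUCTOR_CACHE_DIR` overrides the default per-user cache path. The helper name `compile_env` is hypothetical.

```python
import os
import shutil
import tempfile


def compile_env(cache_root=None):
    """Return environment overrides for torch.compile in a minimal environment.

    Assumptions (not confirmed by the chat): Triton reads CC, Inductor reads
    CXX, and TORCHINDUCTOR_CACHE_DIR replaces the default per-user cache path.
    """
    env = {}

    # Look for a C and a C++ compiler on PATH; leave the variables unset if
    # none is found so the libraries fall back to their own defaults.
    cc = shutil.which("gcc") or shutil.which("cc")
    cxx = shutil.which("g++") or shutil.which("c++")
    if cc:
        env["CC"] = cc
    if cxx:
        env["CXX"] = cxx

    # An explicit cache directory avoids deriving one from the user name.
    root = cache_root or tempfile.gettempdir()
    env["TORCHINDUCTOR_CACHE_DIR"] = os.path.join(root, "torchinductor-cache")
    return env
```

To use it, call `os.environ.update(compile_env())` before the first `import torch`, with gcc available on `PATH` (for example from a `nix shell`).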
apyh | was just wondering if there was like a torch-nix chat outside here | 18:51:40 |
Robbie Buxton | Ah, I’ve recently fixed the gcc issue locally; I was planning to put a PR in upstream this week. | 18:58:56 |
apyh | you will, for CUDA, also need to set TRITON_LIBCUDA_PATH - it normally tries to find it with ldconfig | 20:09:52 |
Robbie Buxton | How are you providing your cuda kernel libraries, are you on NixOS or a different distribution? | 20:15:01 |
Robbie Buxton | I.e. where are you getting libcuda.so from? | 20:15:42 |
apyh | I'm in a docker container 😅 | 21:17:32 |
apyh | so i just point to /lib64/libcuda.so | 21:17:44 |
Robbie Buxton | Nix expects it in /run/opengl-driver/lib | 21:18:49 |
apyh | ah yeah I use nix-gl-host | 21:20:30 |
apyh | for all that | 21:20:31 |
Robbie Buxton | I’m not sure what the recommended way of doing that is these days but I symlink in all the required libraries to that path | 21:20:32 |
Robbie Buxton | I’m confused why triton is struggling to find cuda tho | 21:21:01 |
apyh | ok cool yeah then if you PR the missing gcc I'll see what else is missing in my setup :) | 21:21:09 |
apyh | In reply to @sporeray:matrix.org ("I’m confused why triton is struggling to find cuda tho"): triton runs /sbin/ldconfig to find it | 21:21:32 |
apyh | which doesn't exist under nixos / nix built docker images | 21:21:47 |
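Where `/sbin/ldconfig` is missing, the `TRITON_LIBCUDA_PATH` workaround mentioned earlier can be scripted. This sketch checks the two locations named in this conversation (`/run/opengl-driver/lib` and `/lib64`) and exports the first one that contains a `libcuda.so*` file. It assumes the variable expects a directory rather than a file path. The helper names are hypothetical.

```python
import os

# Locations mentioned in the conversation; adjust for your own setup.
CANDIDATE_DIRS = ("/run/opengl-driver/lib", "/lib64")


def find_libcuda_dir(candidates=CANDIDATE_DIRS):
    """Return the first candidate directory holding a libcuda.so* file, else None."""
    for directory in candidates:
        if not os.path.isdir(directory):
            continue
        if any(name.startswith("libcuda.so") for name in os.listdir(directory)):
            return directory
    return None


def export_libcuda_path(candidates=CANDIDATE_DIRS, environ=os.environ):
    """Set TRITON_LIBCUDA_PATH unless it is already set; return the directory found."""
    found = find_libcuda_dir(candidates)
    if found is not None:
        # Assumption: the variable names a directory, not the library file itself.
        environ.setdefault("TRITON_LIBCUDA_PATH", found)
    return found
```

Call `export_libcuda_path()` before the first `torch.compile()` call. An existing value is left untouched, so a manual override still wins.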
Robbie Buxton | I run triton on non NixOS and haven’t had issues | 21:22:07 |
Robbie Buxton | Do you have a minimum repro or is it literally failing on import | 21:22:22 |
apyh | like, calling torch.compile() fails | 21:22:49 |
apyh | sec | 21:22:50 |
Robbie Buxton | Maybe we should start a private chat to avoid spamming | 21:23:17 |
SomeoneSerge (back on matrix) | I presume it's the -bin version of tensorflow, which means it's patchelfed to use whatever versions of things we've got and inshallah it happens to work? | 22:43:11 |
SomeoneSerge (back on matrix) | Didn't we recently change triton so it doesn't retain the reference to gcc? | 22:44:50 |