| 30 Dec 2024 |
connor (he/him) | Well, messing around with a Triton compiler failure in PyTorch; you know the good ol' error:
torch._inductor.exc.InductorError: FileNotFoundError: [Errno 2] No such file or directory: '/sbin/ldconfig'
seems similar to what Serge pointed out here https://github.com/NixOS/nixpkgs/pull/278969/files#diff-289748b7fbff3ff07ecd17030035a7e7aa78b21e882a549900885e6bc5030973
| 18:41:06 |
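A minimal sketch of the lookup that fails here, assuming Triton resolves libcuda.so by shelling out to a hard-coded /sbin/ldconfig (the path the traceback points at); on NixOS that binary lives elsewhere, so the subprocess call raises the FileNotFoundError above before any kernel gets compiled:

```python
import subprocess

# Roughly what the Triton driver setup does when torch.compile triggers an
# Inductor build: ask ldconfig where libcuda.so lives. The hard-coded path to
# ldconfig is the assumption that breaks on NixOS.
def libcuda_dirs() -> list[str]:
    # Raises FileNotFoundError: [Errno 2] ... '/sbin/ldconfig' on NixOS.
    libs = subprocess.check_output(["/sbin/ldconfig", "-p"]).decode()
    return sorted({
        line.split("=>")[-1].strip().rsplit("/", 1)[0]
        for line in libs.splitlines()
        if "libcuda.so" in line
    })

if __name__ == "__main__":
    print(libcuda_dirs())
```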
connor (he/him) | Oh I got torch.compile working! | 21:02:45 |
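For reference, a minimal smoke test in the spirit of "torch.compile working" (a sketch, not the exact check used): compile a trivial function with the default Inductor backend and compare it against eager mode on CUDA.

```python
import torch

def f(x: torch.Tensor) -> torch.Tensor:
    return torch.sin(x) + torch.cos(x)

compiled = torch.compile(f)  # default backend is Inductor, which drives Triton

x = torch.randn(1024, device="cuda")
# The first call triggers the Inductor/Triton compile; this is where the
# ldconfig failure above used to surface.
torch.testing.assert_close(compiled(x), f(x))
print("torch.compile OK")
```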
connor (he/him) | I'll submit a PR for Nixpkgs | 21:12:21 |
connor (he/him) | Oh wait, what's the proper way to expose a runtime dependency on libcuda.so? Is it enough to point it to the stub? Because (as I understand it) that's only for linking, not for runtime use (because it's a stub). And since libcuda.so is provided by the driver, the library's location depends on the host OS... | 21:15:03 |
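One way to see the stub-vs-runtime distinction from Python (a sketch, not the packaging answer): dlopen whatever libcuda.so.1 the loader finds and call cuInit. A real driver library returns CUDA_SUCCESS (0); the stub exists only to satisfy the linker, so calling into it at runtime is expected to fail with a non-zero error code (the exact code depends on the CUDA version).

```python
import ctypes

# Load whatever libcuda.so.1 the dynamic loader can find at runtime. On a
# typical distro this is the driver-provided library; for Nix-built packages
# this is exactly the path that has to be wired up somehow.
libcuda = ctypes.CDLL("libcuda.so.1")

rc = libcuda.cuInit(0)
if rc == 0:
    print("real driver: cuInit returned CUDA_SUCCESS")
else:
    # A stub (or a broken driver setup) will not give a working driver;
    # the exact error code varies by CUDA version.
    print(f"cuInit failed with error {rc} -- likely a stub or missing driver")
```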
connor (he/him) | I guess if we knew ahead of time where libcuda.so and the like were, we wouldn't need nixGL or nix-gl-host because we could package everything in a platform-agnostic way, huh... | 21:23:41 |
connor (he/him) | At any rate, here's https://github.com/NixOS/nixpkgs/pull/369495 | 21:24:07 |
| 31 Dec 2024 |
connor (he/him) | Well, I was able to package https://github.com/NVIDIA/TransformerEngine for PyTorch.
Updated (locally) https://github.com/ConnorBaker/nix-cuda-test to verify I could train an FP8 model on my 4090 using the work I've done in https://github.com/connorbaker/cuda-packages, and it seems to work | 09:49:28 |
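Roughly the shape of that FP8 check (a sketch assuming the usual Transformer Engine PyTorch API; the sizes are made up): wrap a te.Linear in fp8_autocast with a DelayedScaling recipe and run a forward/backward pass on the GPU.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Arbitrary sizes; the point is only that the layer runs under fp8_autocast.
layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(16, 1024, device="cuda", requires_grad=True)

recipe = DelayedScaling(fp8_format=Format.HYBRID)  # E4M3 forward, E5M2 backward
with te.fp8_autocast(enabled=True, fp8_recipe=recipe):
    y = layer(x)

y.sum().backward()
print("FP8 forward/backward OK")
```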
| 🐰 xiaoxiangmoe joined the room. | 10:44:26 |
connor (he/him) | Also packaging Flash Attention now, because hopefully it supports FP8 training where PyTorch's implementation does not. Why does it require so much memory to build? What is NVCC doing? | 16:09:28 |
connor (he/him) | [attachment: Screenshot 2024-12-31 at 11.07.56 AM.png] | 16:09:33 |
connor (he/him) | Well, that didn't work
[DEBUG | DotProductAttention]: Disabling FlashAttention as FlashAttention 2 does not support FP8
[DEBUG | DotProductAttention]: Disabling UnfusedDotProductAttention as it does not support FP8
[DEBUG | DotProductAttention]: Disabling FusedAttention as no backend supports the provided input
[DEBUG | DotProductAttention]: Available backends = {FlashAttention=False, FusedAttention=False, UnfusedDotProductAttention=False}
[DEBUG | DotProductAttention]: Selected backend = NoBackend
| 16:40:57 |
connor (he/him) | Looks like the authors of Flash Attention are looking at FP8 support (which comes with v3, currently only available on Hopper) for the Ada series, per the rebuttals on their paper: https://openreview.net/forum?id=tVConYid20&referrer=%5Bthe+profile+of+Tri+Dao%5D%28%2Fprofile%3Fid%3D~Tri_Dao1%29 | 16:53:06 |
| kaya 𖤐 changed their profile picture. | 21:48:16 |
| 1 Jan 2025 |
connor (he/him) | pushed the changes I had locally for nix-cuda-test (https://github.com/ConnorBaker/nix-cuda-test), if anyone wants to play with transformer engine or flash attention (both for PyTorch). I'll probably work on upstreaming those at some indeterminate point in time, but I don't know if they'll work with what's in-tree right now | 03:59:24 |
connor (he/him) | SomeoneSerge (utc+3): are you aware of a clean, cross-platform way to handle patching the path to libcuda.so (as needed in https://github.com/NixOS/nixpkgs/pull/369495#issuecomment-2566002172)? Is it fair to assume that on non-NixOS systems, whatever wrapper people use (like nixGL or nixglhost) will add the directory containing libcuda.so to LD_LIBRARY_PATH? | 04:00:43 |
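To make the assumption in that question concrete (a sketch of the expectation, not what the PR does): if a wrapper like nixGL or nix-gl-host puts the host driver's library directory on LD_LIBRARY_PATH, then finding libcuda.so.1 at runtime reduces to scanning those entries. The helper below is illustrative, not an existing API.

```python
import os
from pathlib import Path

def find_libcuda_via_ld_library_path() -> Path | None:
    """Return the first libcuda.so.1 reachable through LD_LIBRARY_PATH, if any."""
    for entry in os.environ.get("LD_LIBRARY_PATH", "").split(":"):
        if not entry:
            continue
        candidate = Path(entry) / "libcuda.so.1"
        if candidate.exists():
            return candidate
    return None

print(find_libcuda_via_ld_library_path())
```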
connor (he/him) | A new hope? https://www.phoronix.com/news/ZLUDA-v4-Released | 11:14:50 |
connor (he/him) | https://github.com/NixOS/nixpkgs/pull/369956 | 13:22:40 |
| NixOS Moderation Bot changed room power levels. | 14:26:32 |