| 7 Oct 2024 |
aidalgol | Does whatever NVIDIA compiler it's using have an equivalent of gcc's -isystem? | 21:16:18 |
Gaétan Lepage | Yes, you are right. In the meantime, I found out that cuda_fp16.h is provided by cudaPackages.cuda_cudart.dev | 21:18:16 |
Gaétan Lepage | The issue is that the way they invoke the compiler is a bit obscure: https://github.com/search?q=repo%3Atinygrad%2Ftinygrad%20nvrtcGetCUBIN&type=code | 21:19:30 |
Gaétan Lepage | I think that the closest examples within nixpkgs are cupy and numba.
But both of them handle this a bit differently. | 21:20:06 |
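On the -isystem question above: NVRTC takes its compile options as a list of strings, and its documentation lists `--include-path=<dir>` (short form `-I`) for adding header search directories. Below is a minimal sketch of building such options, assuming a hypothetical environment variable `CUDA_INCLUDE_DIRS` (separated by `os.pathsep`, a colon on Linux) that a wrapper could point at the `include` directory of the cuda_cudart dev output. The variable and the helper name are made up for illustration; this is not how tinygrad currently locates headers.

```python
import os

def nvrtc_include_options(env_var="CUDA_INCLUDE_DIRS", environ=None):
    """Turn a path-separated list of directories into NVRTC include options.

    CUDA_INCLUDE_DIRS is a hypothetical variable name, not something
    tinygrad or nixpkgs defines.
    """
    environ = os.environ if environ is None else environ
    # Split on the platform path separator and drop empty entries.
    dirs = [d for d in environ.get(env_var, "").split(os.pathsep) if d]
    # NVRTC accepts --include-path=<dir> (short form: -I) for header lookup.
    return [f"--include-path={d}" for d in dirs]
```

The resulting strings would still need to be encoded and handed to whatever call wraps nvrtcCompileProgram, which is the part the linked tinygrad code search covers.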
| 8 Oct 2024 |
| @pascal.grosmann:scs.ems.host left the room. | 10:55:06 |
connor (he/him) | From what I’ve seen in the Python ecosystem, compiling kernels at runtime is becoming more commonplace because it reduces the size of the binaries you ship and allows optimizing for the specific hardware you’re running on. For example, JAX (via XLA) supports autotuning via Triton by compiling and running a number of different kernels. | 15:46:17 |
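The autotuning idea mentioned above (try several kernel variants, keep the fastest) can be sketched in plain Python. This is only an illustration of the general technique, not how XLA or Triton implement it: the `autotune` helper is hypothetical, and ordinary Python callables stand in for kernels compiled at runtime.

```python
import time

def autotune(candidates, args, repeats=5):
    """Time each candidate and return (name, mean_seconds) of the fastest.

    `candidates` maps a label to a callable. In a real system each callable
    would be a kernel compiled at runtime with different parameters.
    """
    best_name, best_time = None, float("inf")
    for name, fn in candidates.items():
        fn(*args)  # warm-up run, excluded from the timing
        start = time.perf_counter()
        for _ in range(repeats):
            fn(*args)
        elapsed = (time.perf_counter() - start) / repeats
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, best_time
```

Because the measurement happens on the machine that will run the workload, the chosen variant reflects that specific hardware, which is the benefit described in the message above.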
Gaétan Lepage | Yes, compiling on the fly is the core spirit of tinygrad. | 15:47:06 |
SomeoneSerge (matrix works sometimes) | Trying to compose backendStdenv with ccacheStdenv 🙃 | 17:07:51 |
SomeoneSerge (matrix works sometimes) | callPackage is a blessing and a curse | 17:50:29 |
SomeoneSerge (matrix works sometimes) | It works with a bit of copypaste though | 17:50:43 |
SomeoneSerge (matrix works sometimes) | But has anyone run into weird PermissionDenied errors with ccache? The directory is visible in the sandbox and owned by the nixbld group, and the id seems to match... | 17:57:47 |
| kaya 𖤐 changed their profile picture. | 19:36:06 |
| 9 Oct 2024 |
| john joined the room. | 01:20:41 |
| 10 Oct 2024 |
SomeoneSerge (matrix works sometimes) | Iterating on triton with ccache is so much faster lmao | 16:12:34 |