| 7 Oct 2024 |
aidalgol | Does whatever nvidia compiler it's using have an equivalent to -isystem for gcc? | 21:16:18 |
Gaétan Lepage | Yes, you are right. In the meantime, I found out that cuda_fp16.h is provided by cudaPackages.cuda_cudart.dev | 21:18:16 |
Gaétan Lepage | The issue is that the way they invoke the compiler is a bit obscure: https://github.com/search?q=repo%3Atinygrad%2Ftinygrad%20nvrtcGetCUBIN&type=code | 21:19:30 |
Gaétan Lepage | I think that the closest examples within nixpkgs are cupy and numba.
But both of them handle this a bit differently. | 21:20:06 |
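(For context on the -isystem question: NVRTC has no -isystem equivalent, but `nvrtcCompileProgram` accepts `--include-path=<dir>` / `-I` compile options, so the missing `cuda_fp16.h` header can be supplied by pointing an option at the cudart dev output's include directory. A minimal sketch of building such an option list — the store path and helper name here are hypothetical, not tinygrad's actual code:)

```python
import os

def nvrtc_options(include_dirs, arch="compute_80"):
    # NVRTC has no -isystem; extra header locations are passed as
    # --include-path=<dir> (short form -I) compile options.
    opts = [f"--gpu-architecture={arch}"]
    for d in include_dirs:
        opts.append(f"--include-path={d}")
    return opts

# Hypothetical store path standing in for cudaPackages.cuda_cudart.dev:
cudart_dev = "/nix/store/xxxx-cuda_cudart-dev"  # placeholder, not a real path
opts = nvrtc_options([os.path.join(cudart_dev, "include")])
```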
| 8 Oct 2024 |
| @pascal.grosmann:scs.ems.host left the room. | 10:55:06 |
connor (he/him) | From what I’ve seen in the Python ecosystem, compiling kernels at runtime is becoming more commonplace because it reduces the size of the binaries you ship and allows optimizing for the hardware you’re specifically running on. For example, JAX (via XLA) supports autotuning via Triton by compiling and running a number of different kernels. | 15:46:17 |
Gaétan Lepage | Yes, compiling on the fly is the core spirit of tinygrad. | 15:47:06 |
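(The autotuning idea mentioned above can be sketched without a GPU: benchmark several candidate implementations of the same operation and keep the fastest. Triton/XLA do this with real compiled kernels; the "kernels" below are plain Python stand-ins, and all names are illustrative:)

```python
import time

def autotune(candidates, args, repeats=20):
    # Time each candidate callable on the same inputs and keep the
    # fastest -- the same idea Triton/XLA apply to compiled kernels.
    best, best_time = None, float("inf")
    for fn in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            fn(*args)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = fn, elapsed
    return best

def k_loop(xs):      # naive "kernel"
    total = 0
    for x in xs:
        total += x
    return total

def k_builtin(xs):   # alternative "kernel" for the same op
    return sum(xs)

winner = autotune([k_loop, k_builtin], ([1] * 100_000,))
```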
SomeoneSerge (matrix works sometimes) | Trying to compose backendStdenv with ccacheStdenv 🙃 | 17:07:51 |
SomeoneSerge (matrix works sometimes) | In reply to @ss:someonex.net Trying to compose backendStdenv with ccacheStdenv 🙃 callPackage is a blessing and a curse | 17:50:29 |
SomeoneSerge (matrix works sometimes) | It works with a bit of copypaste though | 17:50:43 |
SomeoneSerge (matrix works sometimes) | But has anyone run into weird PermissionDenied errors with ccache? The directory is visible in the sandbox and owned by the nixbld group, and the id seems to match... | 17:57:47 |
| kaya 𖤐 changed their profile picture. | 19:36:06 |
| 9 Oct 2024 |
| john joined the room. | 01:20:41 |
| 10 Oct 2024 |
SomeoneSerge (matrix works sometimes) | Iterating on triton with ccache is so much faster lmao | 16:12:34 |
| 11 Oct 2024 |
Moritz Sanft | Hey folks! I tried to update libnvidia-container, as it was lagging quite a few versions (including security releases) behind. We use it in a work scenario for GPU containers in legacy mode, where we tested it to generally "work". The only thing that doesn't work is the binary resolution (e.g. nvidia-smi, nvidia-persistenced, ...). I just adapted the patches from the old version so that they apply to the new one. I tried dropping the patch that replaces PATH-based binary lookup with a hardcoded /run/nvidia-docker directory, as this seems to be an artifact of older times, I believe? At least, the path doesn't exist in a legacy-mode container nor on the host. I think the binaries should really be looked up through PATH, which should be set accordingly when calling nvidia-container-cli? What do the experts think?
CDI containers work, as the binary paths are resolved correctly through the CDI config generated at boot.
Find my draft PR here: https://github.com/NixOS/nixpkgs/pull/347867 | 07:53:31 |
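(The PATH-based lookup being argued for here can be illustrated with a small sketch: resolve the binary through PATH first, and only fall back to the legacy fixed directory. The directory name is taken from the message above; the helper function itself is hypothetical, not libnvidia-container code:)

```python
import os
import shutil

LEGACY_DIR = "/run/nvidia-docker"  # the old hardcoded lookup location

def resolve_binary(name, search_path=None):
    # Prefer a normal PATH lookup (shutil.which honours os.environ["PATH"]
    # when search_path is None), then fall back to the legacy directory.
    found = shutil.which(name, path=search_path)
    if found is not None:
        return found
    legacy = os.path.join(LEGACY_DIR, name)
    return legacy if os.access(legacy, os.X_OK) else None
```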
SomeoneSerge (matrix works sometimes) | * Iterating on triton with ccache is so much faster lmao EDIT: triton+torch in half an hour on a single node; this is not perfect, but it is an improvement | 11:41:55 |
SomeoneSerge (matrix works sometimes) | In reply to @msanft:matrix.org Hey folks! I tried to update libnvidia-container, as it was lagging quite a few versions (including security releases) behind. We use it in a work scenario for GPU containers in legacy mode, where we tested it to generally "work". The only thing that doesn't work is the binary resolution (e.g. nvidia-smi, nvidia-persistenced, ...). I just adapted the patches from the old version so that they apply to the new one. I tried dropping the patch that replaces PATH-based binary lookup with a hardcoded /run/nvidia-docker directory, as this seems to be an artifact of older times, I believe? At least, the path doesn't exist in a legacy-mode container nor on the host. I think the binaries should really be looked up through PATH, which should be set accordingly when calling nvidia-container-cli? What do the experts think?
CDI containers work, as the binary paths are resolved correctly through the CDI config generated at boot.
Find my draft PR here: https://github.com/NixOS/nixpkgs/pull/347867 What'd be a reasonable way to test this, now that our docker/podman flows have all migrated to CDI and our singularity IIRC uses a plain-text file with the library paths? | 11:44:30 |
Moritz Sanft | I tested it with an "OCI Hook", like so:
https://github.com/confidential-containers/cloud-api-adaptor/blob/191ec51f6245a1a475c15312d354efaf07ff64de/src/cloud-api-adaptor/podvm/addons/nvidia_gpu/setup.sh#L11C1-L17C4
Getting that to work was also the reason why I got around to updating this package in the first place. | 12:21:24 |