| 19 Nov 2024 |
hexa | sowwy | 00:31:40 |
connor (he/him) | In reply to @ss:someonex.net: "Should just work, what is the error?" Curl threw connection refused or something similar; I’ll try to get the log tomorrow | 06:34:11 |
| 20 Nov 2024 |
| Conroy joined the room. | 04:47:44 |
connor (he/him) | I did not get a chance; rip | 07:22:37 |
| Daniel joined the room. | 18:53:01 |
| 22 Nov 2024 |
| deng23fdsafgea joined the room. | 06:27:37 |
| Morgan (@numinit) joined the room. | 17:52:10 |
| 24 Nov 2024 |
sielicki | https://negativo17.org/nvidia-driver/ pretty good read | 21:49:05 |
sielicki | most of this is stuff that nixos gets right, but it's a nice collection of gotchas and solutions | 22:01:49 |
sielicki | anyone have strong opinions on moving nccl and nccl-tests out of cudaModules? Rationale on moving them out: neither one is distributed as a part of the cuda toolkit and they release on an entirely separate cadence, so there's no real reason for them to be in there. It's no different than e.g. torch in terms of the cuda dependency. | 22:16:05 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org: "anyone have strong opinions on moving nccl and nccl-tests out of cudaModules? Rationale on moving them out: neither one is distributed as a part of the cuda toolkit and they release on an entirely separate cadence, so there's no real reason for them to be in there. It's no different than e.g. torch in terms of the cuda dependency." iirc we put it in there because if you set tensorflow = ...callPackage ... { cudaPackages = cudaPackages_XX_y; } you'll need to also pass a compatible nccl | 22:17:33 |
SomeoneSerge (back on matrix) | so it's just easier to instantiate each cudaPackages variant with its own nccl and pass it along | 22:17:55 |
sielicki | I guess that's fair, and there is a pretty strong coupling of cuda versions and nccl versions... eg: https://github.com/pytorch/pytorch/pull/133593 has been stalled for some time due to nvidia dropping the pypi cu11 package for nccl, so there's reason to keep them consistent even if they technically release separately. | 22:20:12 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org: "https://negativo17.org/nvidia-driver/ pretty good read" Any highlights, what we might be missing? | 22:22:09 |
sielicki | honestly I am not sure there's anything, I just like the thought that went into it | 22:27:21 |
sielicki | the special softdep for nvidia-uvm etc | 22:27:48 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org: "the special softdep for nvidia-uvm etc" yeah we have that, and iirc a special-case for the datacenter driver where it's not a softdep anymore (not sure what the exact situation is) | 22:29:12 |
| 25 Nov 2024 |
sielicki | is this useful? https://gist.github.com/sielicki/2601de3ad8d8c732af80b12e36d326aa | 04:31:08 |
sielicki | example of its output: https://gist.github.com/sielicki/2601de3ad8d8c732af80b12e36d326aa/24c08bb29f1397c7d006b01f7afddd5cb06e90a5 | 04:31:38 |
connor (he/him) | You can see what I eventually hope to move in-tree here: https://github.com/ConnorBaker/cuda-packages
Here’s the update script I’ve made for the different redists: https://github.com/ConnorBaker/cuda-packages/tree/main/scripts/cuda-redist | 07:01:12 |