| 2 Nov 2024 |
sielicki | this is his first and only post on the cmake forums and it's <3'd by rob maynard who works at nvidia | 23:27:24 |
sielicki | i just kinda don't get the motivation, not really clear why the FHS steam shrinking needed this | 23:33:25 |
sielicki | but who knows | 23:33:40 |
sielicki | as long as i'm stuck on my computer on a saturday, qq: is there a strong reason to hold cudart at 12.4 in master, or just nobody raised a PR for it? | 23:39:13 |
sielicki | https://github.com/NixOS/nixpkgs/pull/322075 looks like he just never got back to handling review comments :\ | 23:40:51 |
SomeoneSerge (back on matrix) | ❯ readelf -d result-stubs/lib/stubs/libcuda.so
...
0x000000000000000e (SONAME) Library soname: [libcuda.so.1]
...
❯ readelf -d /run/opengl-driver/lib/libcuda.so
...
0x000000000000000e (SONAME) Library soname: [libcuda.so.1]
| 23:40:52 |
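A minimal sketch of the comparison in the readelf output above: pull the SONAME out of `readelf -d` text and check that the stub and the driver library agree. The helper names `parse_soname` and `same_soname` are hypothetical, and the sample text is copied from the output quoted above with the elided lines dropped.

```python
import re

# Sample modeled on the `readelf -d` output quoted in the chat (elided lines dropped).
SAMPLE = "0x000000000000000e (SONAME) Library soname: [libcuda.so.1]\n"

# Matches the SONAME row of the dynamic section as printed by readelf.
SONAME_RE = re.compile(r"\(SONAME\)\s+Library soname: \[(?P<soname>[^\]]+)\]")


def parse_soname(readelf_dynamic_output):
    """Return the SONAME found in `readelf -d` text, or None if there is none."""
    match = SONAME_RE.search(readelf_dynamic_output)
    return match.group("soname") if match else None


def same_soname(output_a, output_b):
    """True when both outputs carry a SONAME and the two are identical."""
    a, b = parse_soname(output_a), parse_soname(output_b)
    return a is not None and a == b
```

In practice the two inputs would be the captured output of `readelf -d` on the stub and on the driver library.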
SomeoneSerge (back on matrix) | I think the last comment in that discourse might be misleading | 23:41:13 |
sielicki | maybe I did a poor job of summarizing it here in riot but that's what I expected to see | 23:42:13 |
SomeoneSerge (back on matrix) | Nvidia ships stubs without the .1s which is why we have https://github.com/NixOS/nixpkgs/blob/a8ffc2295c358629bc1bda569bf8b3bbb21aa1be/pkgs/development/cuda-modules/cuda/overrides.nix#L124-L129 | 23:42:54 |
sielicki | The problem I'm wondering about is what actually enforces that ld.so prefers /run/opengl-driver/lib to /usr/local/nvidia/lib64/stubs? or potentially someone's conda env or virtualenv | 23:42:59 |
sielicki | with RPATH'ing all the things, it's probably fine | 23:43:31 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org ("The problem I'm wondering about is what actually enforces that ld.so prefers /run/opengl-driver/lib to /usr/local/nvidia/lib64/stubs? or potentially someone's conda env or virtualenv"): That executables from Nixpkgs use their own ld.so, which ignores /usr stuff | 23:43:42 |
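A rough model of the lookup order under discussion, assuming glibc's documented order: DT_RPATH (only when DT_RUNPATH is absent), then LD_LIBRARY_PATH, then DT_RUNPATH, then the default directories. The ld.so cache is left out for brevity. The `resolve` helper and the fake `EXISTING` set are hypothetical and only illustrate the ordering. Passing no default directories stands in for the claim above that the Nixpkgs ld.so ignores /usr.

```python
def resolve(soname, *, rpath=(), runpath=(), ld_library_path=(),
            default_dirs=(), existing=frozenset()):
    """Return the first path where `soname` is found, following a simplified
    glibc search order, or None if no directory contains it."""
    search = []
    if not runpath:
        # DT_RPATH is only consulted when the object has no DT_RUNPATH.
        search.extend(rpath)
    search.extend(ld_library_path)
    search.extend(runpath)
    search.extend(default_dirs)
    for directory in search:
        candidate = f"{directory.rstrip('/')}/{soname}"
        if candidate in existing:  # stand-in for a real filesystem lookup
            return candidate
    return None


# Fake filesystem: both the driver library and a stub exist (assumed paths).
EXISTING = frozenset({
    "/run/opengl-driver/lib/libcuda.so.1",
    "/usr/local/nvidia/lib64/stubs/libcuda.so.1",
})
```

With a runpath of `/run/opengl-driver/lib` and no default directories, only the driver library can be found. If LD_LIBRARY_PATH pointed at a stubs directory, it would still win over DT_RUNPATH in this model.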
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org ("as long as i'm stuck on my computer on a saturday, qq: is there a strong reason to hold cudart at 12.4 in master, or just nobody raised a PR for it?"): Yes, it's just that it's toil (and review roundtrip times aren't helping) | 23:45:28 |
sielicki | let me know if I can pick up any slack or what you guys need | 23:46:03 |
SomeoneSerge (back on matrix) | There's lots and the linked PR is one candidate 😍 | 23:46:50 |
sielicki | I just raised an issue earlier today about some of the driver hashes missing for some of the releases, it feels to me like we really need a solid cuda json scraper to prefetch script | 23:47:33 |
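A small sketch of the kind of check such a scraper could run: walk a release manifest and report packages that ship for a platform but have no hash recorded. The manifest layout here (top-level metadata keys, then one object per package with per-platform `relative_path` and `sha256`) is an assumption, and the sample data, package names, and `missing_hashes` helper are made up for illustration. This is not how cuda-redist-find-features or nixpkgs does it.

```python
import json

# Hypothetical manifest; the package names, paths, and hash value are invented.
SAMPLE_MANIFEST = json.loads("""
{
  "release_date": "2024-01-01",
  "hypothetical_cudart": {
    "name": "Example runtime",
    "version": "0.0.1",
    "linux-x86_64": {"relative_path": "a/b.tar.xz", "sha256": "deadbeef"}
  },
  "hypothetical_driver": {
    "name": "Example driver",
    "version": "0.0.1",
    "linux-x86_64": {"relative_path": "c/d.tar.xz"}
  }
}
""")


def missing_hashes(manifest, platform):
    """List packages that have an artifact for `platform` but no sha256."""
    missing = []
    for name, entry in manifest.items():
        if not isinstance(entry, dict):
            continue  # top-level metadata such as release_date
        artifact = entry.get(platform)
        if artifact is None:
            continue  # package not shipped for this platform
        if not artifact.get("sha256"):
            missing.append(name)
    return sorted(missing)
```

Running it over every fetched manifest would flag the releases that still need a prefetch.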
SomeoneSerge (back on matrix) | In reply to @ss:someonex.net ("Nvidia ships stubs without the .1s which is why we have https://github.com/NixOS/nixpkgs/blob/a8ffc2295c358629bc1bda569bf8b3bbb21aa1be/pkgs/development/cuda-modules/cuda/overrides.nix#L124-L129"): I'm wondering what we can do to remove this hack | 23:53:53 |
sielicki | I have no idea how it works in the first place, let alone how to remove it | 23:56:04 |
sielicki | i guess it's just consistently the case that ld.so believes there's a better match than that one. | 23:57:48 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org ("I just raised an issue earlier today about some of the driver hashes missing for some of the releases, it feels to me like we really need a solid cuda json scraper to prefetch script"): Yes, there's https://github.com/ConnorBaker/cuda-redist-find-features/ but afaik Connor's developing this alone, and there's stuff to be improved with the way we parse the results on the nixpkgs side too | 23:58:38 |
| 3 Nov 2024 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org ("i guess it's just consistently the case that ld.so believes there's a better match than that one."): It's not about ld.so in this case, it's that iirc CUDA::cuda_driver in CMake somehow wanted to see the .1 during the build, which I suspect is wrong | 00:01:09 |
sielicki | do you remember what issue/bug it was? | 00:04:40 |
sielicki | In reply to @ss:someonex.net ("Yes, there's https://github.com/ConnorBaker/cuda-redist-find-features/ but afaik Connor's developing this alone, and there stuff to be improved with the way we parse the results on nixpkgs side too"): really nice, I was just gonna propose we run wget --recursive on https://developer.download.nvidia.com/compute/cuda/repos/runfile/x86_64/ | 00:16:04 |
sielicki | much prefer the feature extraction stuff, that's wicked | 00:16:23 |
sielicki | one problem with the runfile json is it excludes certain components, eg: nccl | 00:17:14 |
| 4 Nov 2024 |
Snektron | Is there a reason why https://github.com/NixOS/nixpkgs/pull/291471 is not merged? | 13:46:49 |
SomeoneSerge (back on matrix) | In reply to @snektron:matrix.org ("Is there a reason why https://github.com/NixOS/nixpkgs/pull/291471 is not merged?"): Stuck on unvendoring qt libraries, it seems | 15:58:17 |