| 2 Nov 2024 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org I just raised an issue earlier today about some of the driver hashes missing for some of the releases, it feels to me like we really need a solid cuda json scraper to prefetch script Yes, there's https://github.com/ConnorBaker/cuda-redist-find-features/ but afaik Connor's developing this alone, and there's stuff to be improved with the way we parse the results on the nixpkgs side too | 23:58:38 |
| 3 Nov 2024 |
SomeoneSerge (back on matrix) | In reply to @sielicki:matrix.org i guess it's just consistently the case that ld.so believes there's a better match than that one. It's not about ld.so in this case, it's that iirc CUDA::cuda_driver in CMake somehow wanted to see the .1 (libcuda.so.1) during the build, which I suspect is wrong | 00:01:09 |
sielicki | do you remember what issue/bug it was? | 00:04:40 |
sielicki | In reply to @ss:someonex.net Yes, there's https://github.com/ConnorBaker/cuda-redist-find-features/ but afaik Connor's developing this alone, and there's stuff to be improved with the way we parse the results on the nixpkgs side too Really nice, I was just gonna propose we run wget --recursive on https://developer.download.nvidia.com/compute/cuda/repos/runfile/x86_64/ | 00:16:04 |
sielicki | much prefer the feature extraction stuff, that's wicked | 00:16:23 |
sielicki | one problem with the runfile json is it excludes certain components, eg: nccl | 00:17:14 |
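[editor's illustration] A scraper along the lines discussed could walk NVIDIA's redistrib-style manifests and collect per-package hashes. A minimal sketch, assuming the usual manifest shape (top-level metadata strings plus per-package dicts keyed by platform); the sample manifest and its all-zero sha256 below are placeholders, not real data:

```python
import json

# Illustrative stand-in for a redistrib manifest; real ones live under
# developer.download.nvidia.com/compute/cuda/redist/ (placeholder hash).
SAMPLE_MANIFEST = json.dumps({
    "release_date": "2024-01-01",
    "cuda_cudart": {
        "name": "CUDA Runtime (cudart)",
        "version": "12.4.99",
        "linux-x86_64": {
            "relative_path": "cuda_cudart/linux-x86_64/"
                             "cuda_cudart-linux-x86_64-12.4.99-archive.tar.xz",
            "sha256": "0" * 64,
        },
    },
})

def collect_hashes(manifest: dict):
    """Yield (package, platform, sha256) for every platform entry."""
    for pkg, info in manifest.items():
        if not isinstance(info, dict):
            continue  # skip top-level metadata like release_date
        for platform, entry in info.items():
            if isinstance(entry, dict) and "sha256" in entry:
                yield pkg, platform, entry["sha256"]

hashes = list(collect_hashes(json.loads(SAMPLE_MANIFEST)))
```

Components shipped outside the manifests (e.g. nccl, per the point above) would still need a separate listing pass.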
| 4 Nov 2024 |
Snektron | Is there a reason why https://github.com/NixOS/nixpkgs/pull/291471 is not merged? | 13:46:49 |
SomeoneSerge (back on matrix) | In reply to @snektron:matrix.org Is there a reason why https://github.com/NixOS/nixpkgs/pull/291471 is not merged? Stuck on unvendoring qt libraries it seems | 15:58:17 |
connor (burnt/out) (UTC-8) | What are your thoughts on nixGL and nix-gl-host? Specifically as it relates to ensuring people use the same driver version on both NixOS and non-NixOS host machines | 22:43:59 |
SomeoneSerge (back on matrix) | * It's a disagreeable opinion, but the whole "inspect the system at eval time" business in nixGL makes me want to scream "why". Even the idea of requiring a running nix-daemon in order to load the drivers at the program's startup... I think it's very backwards.
I think nix-gl-host decomposes tasks better, but
- It doesn't handle the "nixos but a different release (libc)" situation (nixGL... well it does, just under some set of assumptions)
- I'd much rather have a mode where nix-gl-host is used to export the /run/opengl-driver/lib prefix, rather than construct LD_LIBRARY_PATH. Not only because the latter screws up search path priorities, but because the workflow I'd rather see is: you run a non-nixos, you drop a systemd unit, and voilà your drivers can be consumed exactly the way they do on nixos
| 23:08:04 |
SomeoneSerge (back on matrix) | FWIW I see no reason not to implement both approaches (host's libs vs. nixpkgs' libs that match the running kernel) in one tool | 23:09:19 |
SomeoneSerge (back on matrix) | That could be just two flags: --drivers-from={HOST|<nix expr or a flake ref>} and --export=/run/opengl-driver/lib as the alternative to --run <CMD> | 23:12:23 |
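[editor's illustration] The two-flag interface proposed above could be sketched as an argument parser. This is hypothetical, not the actual nix-gl-host CLI; the tool name, flag semantics, and defaults are assumptions for illustration:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI for the two modes discussed above: choose where
    driver libraries come from, then either export them to a fixed
    prefix or wrap a single command."""
    p = argparse.ArgumentParser(prog="nix-gl-host")
    p.add_argument(
        "--drivers-from",
        default="HOST",
        help="HOST to use the host's driver libraries, or a Nix "
             "expression / flake ref providing drivers matching the kernel",
    )
    mode = p.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "--export",
        metavar="PREFIX",
        help="expose the driver libraries under PREFIX, e.g. "
             "/run/opengl-driver/lib, instead of setting LD_LIBRARY_PATH",
    )
    mode.add_argument(
        "--run",
        nargs=argparse.REMAINDER,
        metavar="CMD",
        help="run CMD with the drivers made visible",
    )
    return p

# Example invocation in the "export" mode described above:
args = build_parser().parse_args(
    ["--drivers-from", "HOST", "--export", "/run/opengl-driver/lib"]
)
```

With `--export`, a systemd unit running the tool at boot would reproduce the NixOS-style fixed prefix on a foreign distro.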
SomeoneSerge (back on matrix) | (and then of course there's this whole libcapsule business that's waiting to be tried) | 23:13:40 |
SomeoneSerge (back on matrix) | Note there was recently somebody in nix-gl-host issues advertising their Rust rewrite | 23:29:51 |
| 5 Nov 2024 |
aidalgol | Has anyone successfully built blender on unstable with CUDA support enabled recently? | 02:31:45 |
ˈt͡sɛːzaɐ̯ | 🤔 Currently, openusd fails in patch phase. | 08:39:43 |
Jonas Chevalier | In reply to @aidalgol:matrix.org Has anyone successfully built blender on unstable with CUDA support enabled recently? looks like it's broken since end of October: https://hydra.nix-community.org/build/1720981 | 15:13:24 |
Jonas Chevalier | In reply to @ss:someonex.net Note there was recently somebody in nix-gl-host issues advertising their Rust rewrite Did you take a look at https://github.com/NVIDIA/nvidia-container-toolkit already? I think they are battling with similar issues, plus sandboxing on top. It seems like they are using ld.so.cache as a loading mechanism instead of the LD_* env var and that might be more robust? | 15:16:02 |
SomeoneSerge (back on matrix) | In reply to @zimbatm:numtide.com Did you take a look at https://github.com/NVIDIA/nvidia-container-toolkit already? I think they are battling with similar issues, plus sandboxing on top. It seems like they are using ld.so.cache as a loading mechanism instead of the LD_* env var and that might be more robust? Yes that's what we use for docker/podman on nixos | 15:16:44 |
SomeoneSerge (back on matrix) | I'm not sure if it's worth reusing because nixGL and nix-gl-host are solving a more general problem; the ctk just assumes an FHS environment, we have to patch it and we have to patch its outputs to make them usable on nixos | 15:19:32 |
SomeoneSerge (back on matrix) | But yes we should keep in mind the general idea of exporting ld.so.cache. We actually used it at least for some time for the singularity containers | 15:21:25 |
SomeoneSerge (back on matrix) | In reply to @zimbatm:numtide.com looks like it's broken since end of October: https://hydra.nix-community.org/build/1720981 In file included from /build/source/intern/cycles/scene/image_vdb.cpp:5:
/build/source/intern/cycles/scene/../scene/image_vdb.h:12:12: fatal error: nanovdb/util/GridHandle.h: No such file or directory
12 | # include <nanovdb/util/GridHandle.h>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [intern/cycles/scene/CMakeFiles/cycles_scene.dir/build.make:328: intern/cycles/scene/CMakeFiles/cycles_scene.dir/image_vdb.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 56%] Building CXX object source/blender/makesrna/intern/CMakeFiles/bf_rna.dir/rna_access_compare_override.cc.o
In file included from /build/source/intern/cycles/scene/image.cpp:9:
/build/source/intern/cycles/scene/../scene/image_vdb.h:12:12: fatal error: nanovdb/util/GridHandle.h: No such file or directory
12 | # include <nanovdb/util/GridHandle.h>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
Well here's the first offender
| 15:22:59 |
SomeoneSerge (back on matrix) | Ah isn't it nice to have a per-attribute build history | 15:23:36 |
SomeoneSerge (back on matrix) | So odd that wasn't a thing with hercules | 15:24:15 |
| 6 Nov 2024 |
connor (burnt/out) (UTC-8) | Oh my god another TensorRT release | 17:06:11 |
connor (burnt/out) (UTC-8) | I really gotta finish up stuff I’m working on | 17:06:49 |
| 7 Nov 2024 |
connor (burnt/out) (UTC-8) | wanting to figure out the cost of evaluating a single attribute led me to make https://github.com/ConnorBaker/nix-eval-jobs-python | 06:06:20 |
connor (burnt/out) (UTC-8) | wanting to figure out the impact malloc replacements have on nix eval led me to make https://github.com/ConnorBaker/tune-nix-eval still excited about this one -- plan to add support for optimizing zram parameters, and at some point to see if it's worthwhile to just always disable GC so long as there's RAM (or compressed RAM) available | 06:07:36 |
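[editor's illustration] The core of that kind of experiment is just timing a child process and recording its peak RSS under different allocators. A generic sketch, not taken from nix-eval-jobs-python or tune-nix-eval; the `nix eval` command and LD_PRELOAD path in the comment are placeholders:

```python
import resource
import subprocess
import time

def measure(cmd, env=None):
    """Run cmd once, returning (wall_seconds, peak_rss) of the child.

    ru_maxrss is KiB on Linux, bytes on macOS.  RUSAGE_CHILDREN
    reports the maximum over all waited-for children, so compare
    runs from fresh parent processes to avoid a large earlier run
    masking a smaller later one."""
    start = time.monotonic()
    subprocess.run(cmd, check=True, env=env,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.monotonic() - start
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak

# Hypothetical usage: run e.g. ["nix", "eval", ".#some-attr"] twice,
# once with env including LD_PRELOAD=/path/to/libmimalloc.so (merged
# into os.environ), and compare the two (elapsed, peak) pairs.
elapsed, peak_rss = measure(["true"])
```

The interesting comparison is the delta between allocators on the same attribute, not the absolute numbers, since eval time is dominated by cache state.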