!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers



Sender Message Time
17 Jun 2024
@grw00:matrix.org grw00 interesting they have nvidia-smi mount when my nix container does not 🤔 12:50:32
@9hp71n:matrix.org ghpzin joined the room. 13:05:27
@gsaurel:laas.fr nim65s joined the room. 13:36:09
@ss:someonex.net SomeoneSerge (back on matrix) H'm. Maybe they really don't mount the userspace driver o_0. I suppose images derived from NVC do contain a compat driver, but it's kind of weird of them to expect that 14:14:23
@ss:someonex.net SomeoneSerge (back on matrix) You still could use NixGL then 14:14:39
@ss:someonex.net SomeoneSerge (back on matrix) NixGL will look at /proc (I think) and choose the correct linuxPackages 14:15:22
@ss:someonex.net SomeoneSerge (back on matrix) I'd suggest getting an MWE based on that and also reaching out to runpod's support, asking why they won't mount a driver compatible with the host's kernel 14:16:36
@ss:someonex.net SomeoneSerge (back on matrix) (this conversation has happened here before: neither putting drivers into an image nor mounting the host's drivers is "correct": the driver in the image might not be compatible with the kernel running on the host, and the driver from the host might not be compatible e.g. with the libc in the image, et cetera) 14:18:14
@grw00:matrix.org grw00

great thanks, will try this and get back to you. it's

            Cmd = [ "${inputs.nix-gl-host.defaultPackage.x86_64-linux}/bin/nixglhost" "${my-bin}/bin/executor" ];

?

14:37:50
@grw00:matrix.org grw00 i guess i need to build some matrix of images with compat versions and choose which one based on cuda/kernel version in instance metadata 14:39:17
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @grw00:matrix.org
i guess i need to build some matrix of images with compat versions and choose which one based on cuda/kernel version in instance metadata
You can build a single image with NixGL
14:39:55
@ss:someonex.net SomeoneSerge (back on matrix) Note: NixGL and nixglhost are different tools 🙃 14:40:03
@ss:someonex.net SomeoneSerge (back on matrix) * You can build a single image with NixGL (and multiple drivers) 14:40:37
@grw00:matrix.org grw00 ah 😓 14:41:32
19 Jun 2024
@hexa:lossy.network hexa python312 default migration has started 14:04:34
@ss:someonex.net SomeoneSerge (back on matrix) stupid new faiss not building with cuda =\ 14:19:56
21 Jun 2024
@search-sense:matrix.org search-sense

Hello, NixOS community, I want to install python311Packages.tensorrt

TensorRT> command, and try building this derivation again.
TensorRT> $ nix-store --add-fixed sha256 TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
TensorRT> ***
error: builder for '/nix/store/140c5c8lpa30r3jrxxbw74631831prrw-TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz.drv' failed with exit code 1;

but the cuda is 12.2 on my system, is it compatible?

> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
04:53:46
@search-sense:matrix.org search-sense

Is anyone interested in adding the latest tensorrt-10.1.0 to NixOS?

searching for dependencies of /nix/store/gknr686xg6ggafkdfy5323bc7f1m5yf7-tensorrt-10.1.0.27-lib/lib/stubs/libnvinfer_vc_plugin.so
    libstdc++.so.6 -> found: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib
    libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
auto-patchelf: 1 dependencies could not be satisfied
error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
auto-patchelf failed to find all the required dependencies.
Add the missing dependencies to --libs or use `--ignore-missing="foo.so.1 bar.so etc.so"`.
error: builder for '/nix/store/7rqkwg91vnk5d3p4vaym0z0pskkmj4r8-tensorrt-10.1.0.27.drv' failed with exit code 1;
       last 10 log lines:
       >     libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
       > setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
       > searching for dependencies of /nix/store/gknr686xg6ggafkdfy5323bc7f1m5yf7-tensorrt-10.1.0.27-lib/lib/stubs/libnvinfer_vc_plugin.so
       >     libstdc++.so.6 -> found: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib
       >     libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
       > setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
       > auto-patchelf: 1 dependencies could not be satisfied
       > error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
       > auto-patchelf failed to find all the required dependencies.
       > Add the missing dependencies to --libs or use `--ignore-missing="foo.so.1 bar.so etc.so"`.
       For full logs, run 'nix log /nix/store/7rqkwg91vnk5d3p4vaym0z0pskkmj4r8-tensorrt-10.1.0.27.drv'.
07:59:22



Room Version: 9