| 17 Jun 2024 |
SomeoneSerge (back on matrix) | I'd suggest getting an MWE based on that and also reaching out to runpod's support to ask why they won't mount a driver compatible with the host's kernel | 14:16:36 |
SomeoneSerge (back on matrix) | (this conversation has happened here before: neither putting drivers into an image nor mounting the host's drivers is "correct": the driver in the image might not be compatible with the kernel running on the host, and the driver from the host might not be compatible e.g. with the libc in the image, et cetera) | 14:18:14 |
grw00 | great thanks, will try this and get back to you. it's
Cmd = [ "${inputs.nix-gl-host.defaultPackage.x86_64-linux}/bin/nixglhost" "${my-bin}/bin/executor" ];
?
| 14:37:50 |
grw00 | i guess i need to build some matrix of images with compat versions and choose which one based on cuda/kernel version in instance metadata | 14:39:17 |
SomeoneSerge (back on matrix) | In reply to @grw00:matrix.org
i guess i need to build some matrix of images with compat versions and choose which one based on cuda/kernel version in instance metadata
You can build a single image with NixGL (and multiple drivers) | 14:39:55 |
SomeoneSerge (back on matrix) | Note: NixGL and nixglhost are different tools 🙃 | 14:40:03 |
grw00 | ah 😓 | 14:41:32 |
| 19 Jun 2024 |
hexa | python312 default migration has started | 14:04:34 |
SomeoneSerge (back on matrix) | stupid new faiss not building with cuda =\ | 14:19:56 |
| 21 Jun 2024 |
search-sense | Hello, NixOS community, I want to install python311Packages.tensorrt
TensorRT> command, and try building this derivation again.
TensorRT> $ nix-store --add-fixed sha256 TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
TensorRT> ***
error: builder for '/nix/store/140c5c8lpa30r3jrxxbw74631831prrw-TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz.drv' failed with exit code 1;
but the cuda is 12.2 on my system, is it compatible?
> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
| 04:53:46 |
search-sense | Is anyone interested in adding the latest tensorrt-10.1.0 to NixOS?
searching for dependencies of /nix/store/gknr686xg6ggafkdfy5323bc7f1m5yf7-tensorrt-10.1.0.27-lib/lib/stubs/libnvinfer_vc_plugin.so
libstdc++.so.6 -> found: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib
libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
auto-patchelf: 1 dependencies could not be satisfied
error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
auto-patchelf failed to find all the required dependencies.
Add the missing dependencies to --libs or use `--ignore-missing="foo.so.1 bar.so etc.so"`.
error: builder for '/nix/store/7rqkwg91vnk5d3p4vaym0z0pskkmj4r8-tensorrt-10.1.0.27.drv' failed with exit code 1;
last 10 log lines:
> libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
> setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
> searching for dependencies of /nix/store/gknr686xg6ggafkdfy5323bc7f1m5yf7-tensorrt-10.1.0.27-lib/lib/stubs/libnvinfer_vc_plugin.so
> libstdc++.so.6 -> found: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib
> libgcc_s.so.1 -> found: /nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib
> setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
> auto-patchelf: 1 dependencies could not be satisfied
> error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
> auto-patchelf failed to find all the required dependencies.
> Add the missing dependencies to --libs or use `--ignore-missing="foo.so.1 bar.so etc.so"`.
For full logs, run 'nix log /nix/store/7rqkwg91vnk5d3p4vaym0z0pskkmj4r8-tensorrt-10.1.0.27.drv'.
| 07:59:22 |
search-sense | export NIXPKGS_ALLOW_UNFREE=1 && nix-build -A cudaPackages.tensorrt
> setting RPATH to: /nix/store/bn7pnigb0f8874m6riiw6dngsmdyic1g-gcc-13.3.0-lib/lib:/nix/store/pd8xxiyn2xi21fgg9qm7r0qghsk8715k-gcc-13.3.0-libgcc/lib:$ORIGIN
> auto-patchelf: 1 dependencies could not be satisfied
> error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
> auto-patchelf failed to find all the required dependencies.
> Add the missing dependencies to --libs or use `--ignore-missing="foo.so.1 bar.so etc.so"`.
| 11:03:14 |
SomeoneSerge (back on matrix) | In reply to @search-sense:matrix.org
Hello, NixOS community, I want to install python311Packages.tensorrt
TensorRT> command, and try building this derivation again.
TensorRT> $ nix-store --add-fixed sha256 TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
TensorRT> ***
error: builder for '/nix/store/140c5c8lpa30r3jrxxbw74631831prrw-TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz.drv' failed with exit code 1;
but the cuda is 12.2 on my system, is it compatible?
> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
You can use cudaPackages.overrideScope to plug in the trt release compatible with your cuda, but I also think trt was originally introduced in Nixpkgs with logic to select the compatible release in each cuda package set automatically. Evidently, that must have broken | 15:51:43 |
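A minimal sketch of the `overrideScope` suggestion above, not a confirmed fix. The attribute names `cudaPackages_12_2` and `tensorrt_8_6` are assumptions for illustration; check which CUDA package sets and TensorRT releases your Nixpkgs revision actually exposes. The NVIDIA tarball still has to be added to the store by hand with `nix-store --add-fixed`, as the build message says.

```nix
# Sketch only: pin a specific TensorRT release inside one CUDA package set.
# `cudaPackages_12_2` and `tensorrt_8_6` are assumed attribute names.
let
  pkgs = import <nixpkgs> { config.allowUnfree = true; };

  cudaPackages = pkgs.cudaPackages_12_2.overrideScope (final: prev: {
    # Every package in this scope that refers to `tensorrt` now sees this one.
    tensorrt = prev.tensorrt_8_6;
  });
in
cudaPackages.tensorrt
```

Overriding inside the scope, rather than overriding a single package, keeps dependents within the same CUDA package set consistent with the chosen release.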
SomeoneSerge (back on matrix) | In reply to @ss:someonex.net
Nvidia prevents unattended downloads, of course it broke
...primarily because of^^^ and because no one seems to be actively using Nixpkgs' in-tree trt expression? | 15:52:47 |
| Lucas joined the room. | 17:13:01 |
Lucas | Does anyone have nsight_systems working?
I am using CUDA to develop programs on NixOS 24.05 and it is working great. Now I want to profile my code. Using the following flake
```
{
  description = "nsight_systems";
  inputs = {
    # nixpkgs.url = "github:NixOS/nixpkgs/release-24.05";
    # nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
    nixpkgs.url = "github:ConnorBaker/nixpkgs/feat/cudaPackages-fixed-output-derivations";
  };
  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs { system = system; config.allowUnfree = true; };
    in
    {
      devShells.${system}.default = pkgs.mkShell {
        nativeBuildInputs = [
          pkgs.cudaPackages.nsight_systems
          pkgs.cudaPackages.nsight_compute
        ];
      };
    };
}
```
I was able to get `ncu` working.
| 19:47:05 |
Lucas | But when I try to run nsys-ui I get a dialogue box with the error message
Failed to load plugin: QuadDPlugin
Cannot load library /nix/store/hzp2wmqbqihx4slp353ixs405ry6li4f-cuda12.5-nsight_systems-2024.2.3.38-bin/nsight-systems/2024.2.3/host-linux-x64/Plugins/QuadDPlugin/libQuadDPlugin.so: /nix/store/hzp2wmqbqihx4slp353ixs405ry6li4f-cuda12.5-nsight_systems-2024.2.3.38-bin/nsight-systems/2024.2.3/host-linux-x64/Plugins/QuadDPlugin/libQuadDPlugin.so: undefined symbol: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev, version Qt_6
Some functionality will be disabled
| 19:50:21 |
Lucas | I do have to launch ncu and ncu-ui via their full paths /nix/store/0v8ydp1hq7ynncwir4hv5hkpna629iw0-cuda12.5-nsight_compute-2024.2.0.16/nsight-compute/2024.2.0/ncu and /nix/store/0v8ydp1hq7ynncwir4hv5hkpna629iw0-cuda12.5-nsight_compute-2024.2.0.16/nsight-compute/2024.2.0/ncu-ui, respectively. | 19:52:26 |
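One possible way to avoid typing the full store paths, sketched under assumptions: the `shellHook` below prepends the directory that holds `ncu` and `ncu-ui` to `PATH`. The `nsight-compute/2024.2.0` subdirectory is copied from the store path quoted above and will differ for other package versions; this is a workaround sketch, not how the Nixpkgs package is meant to expose its binaries.

```nix
# Sketch only: put the Nsight Compute binaries on PATH inside the dev shell.
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } }:

pkgs.mkShell {
  nativeBuildInputs = [ pkgs.cudaPackages.nsight_compute ];

  # Assumption: the binaries live under nsight-compute/<version>/, as in the
  # store path quoted above. Adjust the version directory to your package.
  shellHook = ''
    export PATH="${pkgs.cudaPackages.nsight_compute}/nsight-compute/2024.2.0:$PATH"
  '';
}
```

The same `shellHook` attribute can be added to the `pkgs.mkShell` call in the flake's `devShells.${system}.default`.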
Gaétan Lepage | connor (he/him) (UTC-5) could you please give the following a shot?
nix build github:GaetanLepage/nixpkgs/jaxlib#python311Packages.jaxlibWithCuda
| 20:48:50 |
aidalgol | In reply to @ss:someonex.net
...primarily because of^^^ and because no one seems to be actively using Nixpkgs' in-tree trt expression?
Sorry, I have not been using TensorRT in a while, so I'm not catching these, even though I put it in nixpkgs to begin with. :S | 21:19:55 |
Lucas | In reply to @lcw:matrix.org
Does anyone have nsight_systems working?
I am using CUDA to develop programs on NixOS 24.05 and it is working great. Now I want to profile my code. Using the following flake
{
description = "nsight_systems";
inputs = {
# nixpkgs.url = "github:NixOS/nixpkgs/release-24.05";
# nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
nixpkgs.url = "github:ConnorBaker/nixpkgs/feat/cudaPackages-fixed-output-derivations";
};
outputs = { self, nixpkgs }:
let
system = "x86_64-linux";
pkgs = import nixpkgs { system = system; config.allowUnfree = true; };
in
{
devShells.${system}.default = pkgs.mkShell {
nativeBuildInputs = [
pkgs.cudaPackages.nsight_systems
pkgs.cudaPackages.nsight_compute
];
};
};
}
I was able to get ncu working.
I was able to get nsight_systems working from nixpkgs.url = "github:mcwitt/nixpkgs/fix/nsight_systems";. | 21:26:13 |
SomeoneSerge (back on matrix) | In reply to @lcw:matrix.org
But when I try to run nsys-ui I get a dialogue box with the error message
Failed to load plugin: QuadDPlugin
Cannot load library /nix/store/hzp2wmqbqihx4slp353ixs405ry6li4f-cuda12.5-nsight_systems-2024.2.3.38-bin/nsight-systems/2024.2.3/host-linux-x64/Plugins/QuadDPlugin/libQuadDPlugin.so: /nix/store/hzp2wmqbqihx4slp353ixs405ry6li4f-cuda12.5-nsight_systems-2024.2.3.38-bin/nsight-systems/2024.2.3/host-linux-x64/Plugins/QuadDPlugin/libQuadDPlugin.so: undefined symbol: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev, version Qt_6
Some functionality will be disabled
Looks like a leftover vendored Qt library that we oughtta relink | 21:45:46 |
Lucas | In reply to @ss:someonex.net
Looks like a leftover vendored Qt library that we oughtta relink
Oh cool. Is there an example of relinking the libraries that I can follow? Is it this: https://github.com/ConnorBaker/nixpkgs/blob/9ee229fe705580b62fc9011f5d8cc78e87f85971/pkgs/development/cuda-modules/overrides/cuda/nsight_systems.nix#L102-L121 ? | 22:19:56 |
| 22 Jun 2024 |
search-sense | In reply to @ss:someonex.net
...primarily because of^^^ and because no one seems to be actively using Nixpkgs' in-tree trt expression?
the essence of the problem is this:
> error: auto-patchelf could not satisfy dependency libcudart.so.12 wanted by /nix/store/799sv915xqi5b8n14hdkbbp6h06rrjz7-tensorrt-10.1.0.27-bin/bin/trtexec
> auto-patchelf failed to find all the required dependencies.
| 16:12:11 |
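A hedged sketch of one way to approach the auto-patchelf error above: make a package that ships `libcudart.so.12` visible to the hook by adding it to `buildInputs`. That `cuda_cudart` from the same CUDA package set provides this library is an assumption here, and this has not been verified against the tensorrt-10.1.0.27 derivation from the log.

```nix
# Sketch only: give auto-patchelf a place to find libcudart.so.12.
let
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
  inherit (pkgs) cudaPackages lib;
in
cudaPackages.tensorrt.overrideAttrs (old: {
  # Assumption: cuda_cudart's library output contains libcudart.so.12
  # for this CUDA package set.
  buildInputs = (old.buildInputs or [ ]) ++ [
    (lib.getLib cudaPackages.cuda_cudart)
  ];
})
```

The hook's own message also offers `--ignore-missing`, but that only silences the check and leaves `trtexec` without a resolvable `libcudart.so.12` at run time.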