| 5 Jul 2024 |
Jonas Chevalier | A very specific example: one customer is using nvcr.io/nvidia/pytorch:23.08-py3 (CUDA 12.2.1, cuDNN 8.9.4, Python 3.10, PyTorch 2.1.0) and is looking to try out Nix to fix their reproducibility issues | 07:29:17 |
SomeoneSerge (back on matrix) | And in this case you'd suggest we provide cudaPackages'.cuda_12_21_cudnn_8_9_4? | 07:44:11 |
SomeoneSerge (back on matrix) | ...instead of referring to the manual and cudaPackages.overrideScope' (...)? | 07:44:49 |
Jonas Chevalier | I haven't thought about this deeply. One possibility is to maintain a package set like cudaPackages.pytorch_23_08 | 07:57:44 |
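For context, a minimal sketch of the cudaPackages.overrideScope' route mentioned above; the versioned cudnn attribute used here is an assumption and may differ per nixpkgs revision:
let
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
  # Pin cuDNN inside the CUDA 12.2 package set; cudnn_8_9 is a hypothetical attribute name.
  cudaPackagesPinned = pkgs.cudaPackages_12_2.overrideScope' (final: prev: {
    cudnn = prev.cudnn_8_9;
  });
in
  cudaPackagesPinned.cudnn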
SomeoneSerge (back on matrix) | I think an out-of-tree collection of buildLayeredImage expressions reproducing nvcr images would make sense | 08:08:38 |
SomeoneSerge (back on matrix) | In-tree, maybe not so much because these sound like finalized compositions of packages | 08:09:10 |
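A rough sketch of what one such out-of-tree expression could look like; the package selection is an illustrative assumption, not a faithful reproduction of nvcr.io/nvidia/pytorch:23.08-py3:
{ pkgs ? import <nixpkgs> { config = { allowUnfree = true; cudaSupport = true; }; } }:
# Build an OCI image with dockerTools.buildLayeredImage; name, tag, and contents are illustrative.
pkgs.dockerTools.buildLayeredImage {
  name = "pytorch-cuda";
  tag = "23.08-ish";
  contents = [
    pkgs.bashInteractive
    pkgs.coreutils
    (pkgs.python310.withPackages (ps: [ ps.torch ]))
  ];
  config.Cmd = [ "python3" ];
}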
hexa | and also Python 3.10 might be hit or miss these days | 10:52:20 |
SomeoneSerge (back on matrix) | In reply to @hexa:lossy.network and also Python 3.10 might be hit or miss these days I thought we can ignore this without a disclaimer xD | 12:14:24 |
| Sami Liedes joined the room. | 23:03:28 |
| 6 Jul 2024 |
SomeoneSerge (back on matrix) | In reply to @hexa:lossy.network faissWithCuda pls 😄 Oh, broken on x86_64 https://hydra.nix-community.org/build/68172 | 16:53:02 |
hexa | Looks like oom to me | 16:54:55 |
hexa | sigkill | 16:55:19 |
hexa | maybe limit number of parallel ptxas instances? | 16:56:48 |
hexa | Or maybe just an outlier | 16:57:10 |
SomeoneSerge (back on matrix) | Wondering what is the generic way to do that | 17:43:02 |
SomeoneSerge (back on matrix) | from nvcc docs | 17:43:06 |
SomeoneSerge (back on matrix) | (attached screenshot from the nvcc documentation) | 17:43:13 |
hexa | https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#nvcc-environment-variables | 18:09:41 |
hexa | can also be passed via an env var | 18:09:50 |
SomeoneSerge (back on matrix) | In reply to @hexa:lossy.network https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#nvcc-environment-variables Yes, we use that in setupCudaHook | 18:24:20 |
SomeoneSerge (back on matrix) | In a limited way... | 18:24:31 |
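One hedged way to cap the number of parallel device-compilation processes for a single package, via the NVCC_APPEND_FLAGS variable from the docs linked above (the overrideAttrs placement and the faiss attribute are assumptions, and --threads requires a reasonably recent nvcc):
# Append --threads to every nvcc invocation of this build via the documented env var.
faiss.overrideAttrs (old: {
  preConfigure = (old.preConfigure or "") + ''
    export NVCC_APPEND_FLAGS="''${NVCC_APPEND_FLAGS:-} --threads 2"
  '';
})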
connor (burnt/out) (UTC-8) | Oh that reminds me: At least one Python package with CUDA support in Nixpkgs invokes NVCC from within Python scripts as part of its setup, so the NVCC_PREPEND_FLAGS (or whatever the name is) set by our setup hook is ignored
Don’t think that’s relevant, but it’s something irritating I found a few weeks ago (I believe I patched it in the PR I have to update the CUDA packaging) | 23:36:00 |
| 7 Jul 2024 |
| ornx joined the room. | 19:45:55 |
ornx | this doesn't work (nvidia-smi says my card is there etc). am i holding it wrong or is it broken?
$ nix develop
[snip]
$ nvcc foo.cu
In file included from /nix/store/fydjj6z3nyi1ywqbzzw7ai12ncjx9kwy-cuda-merged-12.2/include/cuda_runtime.h:82,
from <command-line>:
/nix/store/fydjj6z3nyi1ywqbzzw7ai12ncjx9kwy-cuda-merged-12.2/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
143 | #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
| ^~~~~
flake.nix i am using as shell:
{
  description = "cuda development environment";
  inputs = {
    nixpkgs = {
      url = "github:NixOS/nixpkgs/4284c2b73c8bce4b46a6adf23e16d9e2ec8da4bb";
    };
  };
  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true;
        config.cudaSupport = true;
      };
    in {
      devShells.${system}.default = pkgs.mkShell {
        buildInputs = with pkgs; [
          cudatoolkit linuxPackages.nvidia_x11
          cudaPackages.cudnn
          libGLU libGL
          xorg.libXi xorg.libXmu freeglut
          xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
          ncurses5 stdenv.cc binutils
        ];
        shellHook = ''
          export LD_LIBRARY_PATH="${pkgs.linuxPackages.nvidia_x11}/lib"
        '';
      };
    };
}
| 19:47:46 |
aidalgol | Maybe leave out stdenv.cc from the mkShell inputs? | 19:50:57 |
ornx | no such luck, same error | 19:53:36 |
ornx | even NIXPKGS_ALLOW_UNFREE=1 nix-shell -p cudaPackages_12.cudatoolkit gives me that error | 19:54:13 |
ornx | although i'm not sure what commit that's on | 19:54:29 |
ornx | $ nix-shell -p gcc12 --run 'gcc --version'
gcc (GCC) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
| 19:55:37 |
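For the record, a hedged sketch of one way out of the gcc-13-vs-CUDA-12.2 mismatch above: build the shell with the compiler stack the CUDA package set expects. cudaPackages.backendStdenv is assumed to be present in the pinned revision; gcc12Stdenv would be an alternative.
# Sketch: give mkShell a stdenv whose gcc is supported by CUDA 12.2.
devShells.${system}.default = (pkgs.mkShell.override {
  stdenv = pkgs.cudaPackages.backendStdenv;
}) {
  buildInputs = with pkgs; [
    cudaPackages.cudatoolkit
    cudaPackages.cudnn
  ];
};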