!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

289 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers


Sender Message Time
5 Jul 2024
@zimbatm:numtide.com Jonas Chevalier: A very specific example is: one customer is using nvcr.io/nvidia/pytorch:23.08-py3 (CUDA 12.21, cuDNN 8.9.4, Python 3.10, PyTorch 2.1.0) and looking to try out Nix to fix their reproducibility issues 07:29:17
@ss:someonex.net SomeoneSerge (back on matrix): And in this case you'd suggest we provide cudaPackages'.cuda_12_21_cudnn_8_9_4? 07:44:11
@ss:someonex.net SomeoneSerge (back on matrix): ...instead of referring to the manual and cudaPackages.overrideScope' (...) 07:44:51
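For reference, the overrideScope approach mentioned above could look roughly like the overlay below. This is a minimal sketch, not the maintainers' recommended recipe: the attribute names cudaPackages_12_2 and cudnn_8_9 are assumptions that may not exist in every nixpkgs revision, and recent revisions spell the function overrideScope rather than overrideScope'.

```nix
# overlay.nix: sketch only; attribute names are assumptions.
final: prev: {
  # Make the default CUDA package set a customized 12.2 set.
  cudaPackages = prev.cudaPackages_12_2.overrideScope (cudaFinal: cudaPrev: {
    # Pin cuDNN to the 8.9 series for every package in this scope.
    cudnn = cudaPrev.cudnn_8_9;
  });
}
```

It would be applied with something like `import nixpkgs { overlays = [ (import ./overlay.nix) ]; config = { allowUnfree = true; cudaSupport = true; }; }`.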
@zimbatm:numtide.com Jonas Chevalier: I haven't thought about this deeply. One possibility is to maintain a package set like cudaPackages.pytorch_23_08 07:57:44
@ss:someonex.net SomeoneSerge (back on matrix): I think an out-of-tree collection of buildLayeredImage expressions reproducing nvcr images would make sense 08:08:38
@ss:someonex.net SomeoneSerge (back on matrix): In-tree, maybe not so much because these sound like finalized compositions of packages 08:09:10
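An out-of-tree buildLayeredImage expression of the kind suggested here might start from something like the following. It is a hedged sketch: the file name, image name, and tag are hypothetical, and the package versions come from whatever nixpkgs revision is pinned, so it does not by itself match the versions in the nvcr image.

```nix
# image.nix: hypothetical out-of-tree expression, not an existing project.
{ pkgs ? import <nixpkgs> {
    config = { allowUnfree = true; cudaSupport = true; };
  }
}:

pkgs.dockerTools.buildLayeredImage {
  name = "pytorch-cuda"; # hypothetical image name
  tag = "example";       # hypothetical tag

  # A Python interpreter with PyTorch; CUDA support follows from
  # config.cudaSupport above.
  contents = [
    (pkgs.python3.withPackages (ps: [ ps.torch ]))
  ];

  config = {
    Cmd = [ "python3" ];
  };
}
```

Building it with `nix-build image.nix` produces an image tarball that `docker load` can import.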
@hexa:lossy.network hexa: and also Python 3.10 might be hit or miss these days 10:52:20
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @hexa:lossy.network
and also Python 3.10 might be hit or miss these days
I thought we can ignore this without a disclaimer xD
12:14:24
@sliedes:hacklab.fi Sami Liedes joined the room. 23:03:28
6 Jul 2024
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @hexa:lossy.network
faissWithCuda pls 😄
Oh, broken on x86_64 https://hydra.nix-community.org/build/68172
16:53:02
@hexa:lossy.network hexa: Looks like oom to me 16:54:55
@hexa:lossy.network hexa: sigkill 16:55:19
@hexa:lossy.network hexa: maybe limit number of parallel ptxas instances? 16:56:48
@hexa:lossy.network hexa: Or maybe just an outlier 16:57:10
@ss:someonex.net SomeoneSerge (back on matrix): Wondering what is the generic way to do that 17:43:02
@ss:someonex.net SomeoneSerge (back on matrix): from nvcc docs 17:43:06
@ss:someonex.net SomeoneSerge (back on matrix): [image: clipboard.png] 17:43:13
@hexa:lossy.network hexa: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#nvcc-environment-variables 18:09:41
@hexa:lossy.network hexa: can also be passed via an env var 18:09:50
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @hexa:lossy.network
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#nvcc-environment-variables
Yes, we use that in setupCudaHook
18:24:20
@ss:someonex.net SomeoneSerge (back on matrix): In a limited way... 18:24:31
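One blunt way to act on the suggestion to limit parallel ptxas instances is to turn off parallel building for the affected derivation, so that fewer nvcc processes run at once. The sketch below assumes the out-of-memory kill really comes from build-level parallelism, which the chat does not confirm, and that faissWithCuda is available as a top-level attribute as the message above implies.

```nix
# Sketch: trade build time for lower peak memory in a CUDA build.
{ pkgs }:

pkgs.faissWithCuda.overrideAttrs (old: {
  # stdenv then runs the build serially instead of passing
  # -j$NIX_BUILD_CORES to the build tool.
  enableParallelBuilding = false;
})
```

A less invasive alternative is to lower the core count for a single build from the command line, for example `nix-build --cores 2 ...`, which reduces NIX_BUILD_CORES without changing the derivation.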
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
Oh that reminds me: at least one Python package with CUDA support in Nixpkgs invokes NVCC from within Python scripts as part of its setup, and so our NVCC_PREPEND_FLAGS (or whatever the name is) in our setup hook is ignored
Don’t think that’s relevant but just something irritating I found some number of weeks ago (I believe I patched it in the PR I have to update the CUDA packaging)
23:36:00
7 Jul 2024
@ornx:littledevil.club ornx joined the room. 19:45:55
@ornx:littledevil.club ornx

this doesn't work (nvidia-smi says my card is there etc). am i holding it wrong or is it broken?

$ nix develop
[snip]
$ nvcc foo.cu
In file included from /nix/store/fydjj6z3nyi1ywqbzzw7ai12ncjx9kwy-cuda-merged-12.2/include/cuda_runtime.h:82,
                 from <command-line>:
/nix/store/fydjj6z3nyi1ywqbzzw7ai12ncjx9kwy-cuda-merged-12.2/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
  143 | #error -- unsupported GNU version! gcc versions later than 12 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
      |  ^~~~~

flake.nix i am using as shell:

{
  description = "cuda development environment";
  inputs = {
    nixpkgs = {
      url = "github:NixOS/nixpkgs/4284c2b73c8bce4b46a6adf23e16d9e2ec8da4bb";
    };
  };
  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true;
        config.cudaSupport = true;
      };
    in {
      devShells.${system}.default = pkgs.mkShell {
        buildInputs = with pkgs; [
          cudatoolkit linuxPackages.nvidia_x11
          cudaPackages.cudnn
          libGLU libGL
          xorg.libXi xorg.libXmu freeglut
          xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib 
          ncurses5 stdenv.cc binutils
        ];

        shellHook = ''
              export LD_LIBRARY_PATH="${pkgs.linuxPackages.nvidia_x11}/lib"
          '';          
      };
    };
}
19:47:46
@aidalgol:matrix.org aidalgol: Maybe leave out stdenv.cc from the mkShell inputs? 19:50:57
@ornx:littledevil.club ornx: no such luck, same error 19:53:36
@ornx:littledevil.club ornx: even NIXPKGS_ALLOW_UNFREE=1 nix-shell -p cudaPackages_12.cudatoolkit gives me that error 19:54:13
@ornx:littledevil.club ornx: although i'm not sure what commit that's on 19:54:29
@ornx:littledevil.club ornx
$ nix-shell -p gcc12 --run 'gcc --version'
gcc (GCC) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
19:55:37
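The log ends before anyone answers, so what follows is only a hedged sketch of one common approach, not the resolution from the room. The error says nvcc from CUDA 12.2 rejects host GCC versions newer than 12, and the last command shows that adding gcc12 with -p still leaves GCC 13 first on the PATH, probably because the shell's default stdenv compiler takes precedence. Building the shell on the CUDA package set's own stdenv should give nvcc a compatible host compiler. The sketch assumes cudaPackages.backendStdenv, cuda_nvcc, and cuda_cudart exist at the pinned revision.

```nix
{
  description = "cuda development environment (sketch)";
  inputs.nixpkgs.url =
    "github:NixOS/nixpkgs/4284c2b73c8bce4b46a6adf23e16d9e2ec8da4bb";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true;
        config.cudaSupport = true;
      };
      # Assumption: backendStdenv wraps a GCC that this CUDA release accepts.
      mkCudaShell = pkgs.mkShell.override {
        stdenv = pkgs.cudaPackages.backendStdenv;
      };
    in {
      devShells.${system}.default = mkCudaShell {
        # Build-time tools: the CUDA compiler and its setup hook.
        packages = [ pkgs.cudaPackages.cuda_nvcc ];
        # Libraries to link against.
        buildInputs = [ pkgs.cudaPackages.cuda_cudart ];
      };
    };
}
```

Inside `nix develop`, `gcc --version` should then report the compiler that backendStdenv selected rather than GCC 13.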

Room Version: 9