!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

310 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

60 Servers



Sender | Message | Time
27 Jun 2024
@ss:someonex.netSomeoneSerge (matrix works sometimes) * They don't seem to specify any constraints: https://github.com/archibate/OpenSYCL/blob/b919667ea53f99dbc55a9832f297cf0cb689034e/cmake/FindCUDA.cmake#L31 (oh, this is some fork) 11:02:44
@coruscate:matrix.orgcoruscateMy issue seems to be the packaged clang version in the nixpkgs opensycl package; I'll probably simply repackage it.11:16:02
@matthewcroughan:defenestrate.itmatthewcroughan Does anybody get "cicc died due to signal 9 (Kill signal)" 12:19:18
@matthewcroughan:defenestrate.itmatthewcroughanWhen trying to build onnxruntime with cuda support?12:19:24
@matthewcroughan:defenestrate.itmatthewcroughan
In reply to @connorbaker:matrix.org
I remember enabling ZRAM partly because of that. And I’m pretty sure they were all zero pages too, because they compressed to absolutely nothing lmao
Same.. That sounds very familiar
12:21:21
@matthewcroughan:defenestrate.itmatthewcroughanhttps://github.com/rapidsai/cudf/issues/801812:21:36
@matthewcroughan:defenestrate.itmatthewcroughanSeems like the solution to cicc getting killed is to also limit parallelism 12:21:48
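
One way to cap parallelism for a single derivation, as a hedged sketch: it assumes onnxruntime is built with nixpkgs' ninja setup hook (which reads ninjaFlags); the -j4 value and the overlay shape are illustrative only.

final: prev: {
  onnxruntime = prev.onnxruntime.overrideAttrs (old: {
    # append a lower job count; ninja honours the last -j it is given
    ninjaFlags = (old.ninjaFlags or [ ]) ++ [ "-j4" ];
  });
}

Setting enableParallelBuilding = false on the derivation, or lowering the cores setting in nix.conf (which caps NIX_BUILD_CORES), are blunter alternatives.
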
@coruscate:matrix.orgcoruscate
In reply to @ss:someonex.net
Just use cudaPackages_11_5. Start ad hoc, building only opensycl against it. If you wish to rebuild the whole package set, use an overlay.

I seem to be too dumb to figure this out.

I use this flake:

{
  description = "Needleman Wunsch Sycl implementation";
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs";
    systems.url = "github:nix-systems/default";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs =
    { self
    , systems
    , nixpkgs
    , flake-utils
    }:
    flake-utils.lib.eachDefaultSystem
      (system:
      let
        pkgs = (import nixpkgs {
          system = system;
          config = {
            cudaPackages = pkgs.cudaPackages_11_5;
            cudaForwardCompat = true;
            cudaCapabilities = [ "7.5" ];
            cudaSupport = true;
            allowUnfree = true; # Enable unfree software
          };
        });
      in
      {
        # overlay = overlay;
        devShells.default = import ./shell.nix { inherit pkgs; };
        packages = {
          default = pkgs.callPackage ./package.nix { inherit pkgs; };
          sycl = pkgs.callPackage ./opensycl.nix { inherit pkgs; };
        };
      });
}

and expect both the shell and the sycl package to use the correct version, but this is clearly not the case. For the sycl package I expect I can set it the way the documentation leads me to believe, since I use the callPackage function, but how would I do the same for the shell?

{ pkgs ? import <nixpkgs> }:

with pkgs;
mkShell {
  buildInputs = [
    (callPackage ./opensycl.nix { })
    gnumake
    gdb

    lld
    cudaPackages.cudatoolkit
    linuxPackages.nvidia_x11
    opencl-headers
    pkg-config
  ];
...

sorry if this is too incoherent or stupid to begin with

13:17:07
@ss:someonex.netSomeoneSerge (matrix works sometimes) There's no config.cudaPackages option; cudaPackages is an attribute in an evaluated pkgs instance; it can be overridden using overlays 13:41:48
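
A minimal sketch of that overlay approach, reusing the flake layout from the message above (system, opensycl.nix and shell.nix are the asker's own names):

pkgs = import nixpkgs {
  inherit system;
  config = {
    allowUnfree = true;
    cudaSupport = true;
    cudaCapabilities = [ "7.5" ];
  };
  overlays = [
    # pin the package set's default cudaPackages to the 11.5 release
    (final: prev: { cudaPackages = final.cudaPackages_11_5; })
  ];
};

With pkgs built this way, both pkgs.callPackage ./opensycl.nix { } and import ./shell.nix { inherit pkgs; } see the pinned set, so cudaPackages.cudatoolkit in the shell also refers to 11.5.
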
@ss:someonex.netSomeoneSerge (matrix works sometimes) coruscate is there a public repo with the flake and the opensycl.nix? 13:42:37
@ss:someonex.netSomeoneSerge (matrix works sometimes) Looking at how setup-cuda-hook.sh propagates itself and once again I cannot remember where the extra offset of (0, 1) is coming from... 21:28:55
@ss:someonex.netSomeoneSerge (matrix works sometimes) Say someone puts a cuda_cudart in buildInputs (that's (0, 1)), and it has the hook in propagatedNativeBuildInputs (that's (-1, 0), right?). The expectation (confirmed by a successful build) is that we arrive at (-1, 0) = nativeBuildInputs again, but the arithmetic says (0, 0) 21:31:06
@ss:someonex.netSomeoneSerge (matrix works sometimes) * Looking at how setup-cuda-hook.sh propagates itself and once again I cannot remember where the extra offset of (1, 0) is coming from... 21:31:28
@ss:someonex.netSomeoneSerge (matrix works sometimes) Is it added manually? Hmm, now come to think of it: if a is in buildInputs, and has b in propagatedBuildInputs, which has c in propagatedBuildInputs - c should end up at the same offsets, i.e. in buildInputs 21:32:31
@ss:someonex.netSomeoneSerge (matrix works sometimes) * Is it added manually? Hmm, now come to think of it: if a is in buildInputs, and has b in propagatedBuildInputs, which has c in propagatedBuildInputs - we want c to end up at the same offsets, i.e. in buildInputs 21:32:46
28 Jun 2024
@ss:someonex.netSomeoneSerge (matrix works sometimes) Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bum a nixpkgs-review? xD
01:43:56
@ss:someonex.netSomeoneSerge (matrix works sometimes)
In reply to @ss:someonex.net
Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bum a nixpkgs-review? xD
Omg, nevermind... I checked that magma still builds after the first commit, then did something in the second and now it doesn't
01:49:25
@howird:matrix.orgHoward Nguyen-Huu joined the room.02:44:51
@search-sense:matrix.orgsearch-sense
In reply to @matthewcroughan:defenestrate.it
they removed the .6 from the release
I know that it's broken... actually it would be good if someone upgraded it to the current version, TensorRT-10.1.0.27
11:00:16
@ss:someonex.netSomeoneSerge (matrix works sometimes) Shoot, I think propagatedBuildOutputs are broken with __structuredAttrs 11:08:29
@ss:someonex.netSomeoneSerge (matrix works sometimes) The hook loops over $propagatedBuildOutputs, but __structuredAttrs makes it into an array, so that expression resolves to the value of the first element only 🤡 11:09:03
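
The pitfall in a toy bash snippet (not the actual hook code, just the expansion behaviour being described):

# with __structuredAttrs = true the attribute arrives as a bash array
propagatedBuildOutputs=(out dev lib)
echo $propagatedBuildOutputs             # expands to "out" only (the first element)
echo "${propagatedBuildOutputs[@]}"      # expands to all elements: out dev lib
for o in "${propagatedBuildOutputs[@]}"; do echo "$o"; done
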
@ss:someonex.netSomeoneSerge (matrix works sometimes)
In reply to @search-sense:matrix.org
I know, that it's broken ... actually it would be good if someone upgrade it to the current version TensorRT-10.1.0.27
Would you like to just take over tensorrt in Nixpkgs?
11:28:58
@ss:someonex.netSomeoneSerge (matrix works sometimes)
In reply to @ss:someonex.net
Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bum a nixpkgs-review? xD
Yay
12:07:56
@ss:someonex.netSomeoneSerge (matrix works sometimes)clipboard.png
12:08:03
@titus-von-koeller:matrix.orgTitus joined the room.12:52:05
@matthewcroughan:defenestrate.itmatthewcroughan
In reply to @ss:someonex.net
Would you like to just take over tensorrt in Nixpkgs?
I wouldn't wish that on my worst enemy
13:00:27
@titus-von-koeller:matrix.orgTitus

Hey! I just started using NixOS and I love it, but I have a MAJOR blocker: I maintain a FOSS deep learning package and can't get CUDA to work :( I would really love to continue on this journey and eventually contribute to this community, but right now it feels like I've badly shot myself in the foot. I've spent the last days exclusively configuring NixOS, only to reach a point that seems insurmountable for me. The issue seems to be that PyTorch doesn't find the CUDA driver, and what's also weird is that nvidia-smi seems to work fine but shows CUDA Version: ERR!

The thing is that in order to work with my collaborators, I need to work in a non-NixOS way; in my case I would like to use pixi, which is very much like conda/micromamba, just better. Therefore, I'm trying to get things working in an FHS shell. Does one of you have an idea? Am I doing anything obviously wrong?

from my configuration.nix

  hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
  };


  # Allow unfree packages
  nixpkgs.config.allowUnfree = true;
  services.xserver.videoDrivers = ["nvidia"];

  hardware.nvidia = {
    modesetting.enable = true;
    powerManagement.enable = false;
    powerManagement.finegrained = false;
    open = false;
    package = config.boot.kernelPackages.nvidiaPackages.beta;
  };

pixi-fhs.nix

{ pkgs, unstable }:

let
  cudatoolkit = pkgs.cudaPackages.cudatoolkit_12_1;
  nvidia_x11 = pkgs.nvidia_x11;
in
pkgs.buildFHSUserEnv {
  name = "pixi-env";
  targetPkgs = pkgs: with pkgs; [
    unstable.pixi
    cudatoolkit
    nvidia_x11
    # bashInteractive
    # bash-completion
    # complete-alias
  ];
  runScript = "bash";
  profile = ''
    export NVIDIA_DRIVER_CAPABILITIES=compute,utility
    export XDG_CONFIG_DIRS=${nvidia_x11}/share/X11/xorg.conf.d''${XDG_CONFIG_DIRS:+:}$XDG_CONFIG_DIRS
    export XDG_DATA_DIRS=${nvidia_x11}/share''${XDG_DATA_DIRS:+:}$XDG_DATA_DIRS

    export LD_LIBRARY_PATH=${cudatoolkit}/lib:${cudatoolkit}/lib64:${cudatoolkit}/lib64/stubs''${LD_LIBRARY_PATH:+:}$LD_LIBRARY_PATH
    export CUDA_PATH=${cudatoolkit}
    export PATH=${cudatoolkit}/bin:$PATH
    export LIBRARY_PATH=${cudatoolkit}/lib:${cudatoolkit}/lib64:$LIBRARY_PATH

    export CPLUS_INCLUDE_PATH="${cudatoolkit}/include''${CPLUS_INCLUDE_PATH:+:$CPLUS_INCLUDE_PATH}"
    export C_INCLUDE_PATH="${cudatoolkit}/include''${C_INCLUDE_PATH:+:$C_INCLUDE_PATH}"

    # Pixi completion -- not working yet, due to missing `complete` command
    eval "$(pixi completion --shell bash 2>/dev/null)"

    echo "*** Pixi environment activated, using $(which pixi). ***"
  '';
}


Thanks in advance <3
13:00:34
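
For reference, a hedged sketch of the usual fix for "PyTorch can't find the CUDA driver" inside an FHS env on NixOS: libcuda.so ships with the kernel driver rather than with cudatoolkit, and NixOS exposes the driver's userspace libraries under /run/opengl-driver/lib, so the env's profile needs that directory on LD_LIBRARY_PATH. Package names (pixi, cudaPackages.cudatoolkit) follow the message above; everything else is illustrative.

{ pkgs }:

let
  cudatoolkit = pkgs.cudaPackages.cudatoolkit;
in
pkgs.buildFHSUserEnv {
  name = "pixi-env";
  targetPkgs = fhsPkgs: [ fhsPkgs.pixi cudatoolkit ];
  runScript = "bash";
  profile = ''
    export CUDA_PATH=${cudatoolkit}
    # the driver's userspace libraries (libcuda.so, libnvidia-ml.so) come from the
    # running kernel driver, not from cudatoolkit; NixOS publishes them here:
    export LD_LIBRARY_PATH=/run/opengl-driver/lib''${LD_LIBRARY_PATH:+:}$LD_LIBRARY_PATH
  '';
}
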
