| 27 Jun 2024 |
matthewcroughan | https://github.com/rapidsai/cudf/issues/8018 | 12:21:36 |
matthewcroughan | seems like the solution to cicc getting killed is to limit parallelism also | 12:21:48 |
coruscate | In reply to @ss:someonex.net Just use cudaPackages_11_5. Start ad hoc, building only opensycl against it. If you wish to rebuild the whole package set, use an overlay. I seem to be too dumb to figure this out.
I use this flake:
{
  description = "Needleman Wunsch Sycl implementation";
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs";
    systems.url = "github:nix-systems/default";
    flake-utils.url = "github:numtide/flake-utils";
  };
  outputs =
    { self
    , systems
    , nixpkgs
    , flake-utils
    }:
    flake-utils.lib.eachDefaultSystem
      (system:
        let
          pkgs = (import nixpkgs {
            system = system;
            config = {
              cudaPackages = pkgs.cudaPackages_11_5;
              cudaForwardCompat = true;
              cudaCapabilities = [ "7.5" ];
              cudaSupport = true;
              allowUnfree = true; # Enable unfree software
            };
          });
        in
        {
          # overlay = overlay;
          devShells.default = import ./shell.nix { inherit pkgs; };
          packages = {
            default = pkgs.callPackage ./package.nix { inherit pkgs; };
            sycl = pkgs.callPackage ./opensycl.nix { inherit pkgs; };
          };
        });
}
and I expect both the shell and the sycl package to use the correct version, but this is clearly not the case. The documentation leads me to believe I can set it this way for the sycl package, since I use the callPackage function, but how would I do the same for the shell?
{ pkgs ? import <nixpkgs> { } }:
with pkgs;
mkShell {
  buildInputs = [
    (callPackage ./opensycl.nix { })
    gnumake
    gdb
    lld
    cudaPackages.cudatoolkit
    linuxPackages.nvidia_x11
    opencl-headers
    pkg-config
  ];
  ...
sorry if this is too incoherent or stupid to begin with
| 13:17:07 |
SomeoneSerge (matrix works sometimes) | There's no config.cudaPackages option; cudaPackages is an attribute in an evaluated pkgs instance; it can be overridden using overlays | 13:41:48 |
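(Concretely, the overlay approach suggested above might look like the following sketch against the flake earlier in this log; untested, but the attribute names follow current nixpkgs conventions:)

```nix
# Sketch: pin the whole package set's CUDA to 11.5 via an overlay,
# instead of the non-existent config.cudaPackages option.
pkgs = import nixpkgs {
  inherit system;
  config = {
    allowUnfree = true;
    cudaSupport = true;
    cudaCapabilities = [ "7.5" ];
  };
  overlays = [
    # Every consumer of cudaPackages (including the devShell) now sees 11.5.
    (final: prev: { cudaPackages = prev.cudaPackages_11_5; })
  ];
};
```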
SomeoneSerge (matrix works sometimes) | coruscate is there a public repo with the flake and the opensycl.nix? | 13:42:37 |
SomeoneSerge (matrix works sometimes) | Looking at how setup-cuda-hook.sh propagates itself, and once again I cannot remember where the extra offset of (1, 0) is coming from... | 21:28:55 |
SomeoneSerge (matrix works sometimes) | Say someone puts a cuda_cudart in buildInputs (that's (0, 1)); it has the hook in propagatedNativeBuildInputs (that's (-1, 0), right?). The expectation (confirmed by a successful build) is that we arrive at (-1, 0) = nativeBuildInputs again, but the arithmetic says (0, 0) | 21:31:06 |
SomeoneSerge (matrix works sometimes) | Is it added manually? Hmm, now that I come to think of it: if a is in buildInputs, and has b in propagatedBuildInputs, which has c in propagatedBuildInputs, then we want c to end up at the same offsets, i.e. in buildInputs | 21:32:31 |
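(A sketch of the offset arithmetic discussed above, as I read nixpkgs' stdenv setup.sh: offsets compose through a mapOffset function rather than by plain pairwise addition, which appears to resolve the (0, 0) discrepancy:)

```python
def map_offset(h, t, i):
    """Model of nixpkgs setup.sh mapOffset: a dependency found at
    (hostOffset=h, targetOffset=t) maps a propagated input's relative
    offset i so that non-positive offsets compose with the host offset
    and positive offsets compose with the target offset."""
    return i + h if i <= 0 else i - 1 + t

def propagate(dep, rel):
    """Where a dependency at offsets dep=(h, t) places an input it
    propagates at relative offsets rel=(h', t'). (Real setup.sh skips
    results outside [-1, 1]; not modeled here.)"""
    h, t = dep
    return (map_offset(h, t, rel[0]), map_offset(h, t, rel[1]))

# cuda_cudart in buildInputs = (0, 1), carrying the hook in
# propagatedNativeBuildInputs = (-1, 0):
print(propagate((0, 1), (-1, 0)))  # (-1, 0) -> nativeBuildInputs, as observed

# a in buildInputs, b in a's propagatedBuildInputs: b stays in buildInputs
print(propagate((0, 1), (0, 1)))   # (0, 1)
```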
| 28 Jun 2024 |
SomeoneSerge (matrix works sometimes) | Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bump a nixpkgs-review? xD | 01:43:56 |
SomeoneSerge (matrix works sometimes) | In reply to @ss:someonex.net Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bump a nixpkgs-review? xD Omg, never mind... I checked that magma still builds after the first commit, then did something in the second, and now it doesn't | 01:49:25 |
| Howard Nguyen-Huu joined the room. | 02:44:51 |
search-sense | In reply to @matthewcroughan:defenestrate.it they removed the .6 from the release I know that it's broken... actually it would be good if someone upgraded it to the current version, TensorRT-10.1.0.27 | 11:00:16 |
SomeoneSerge (matrix works sometimes) | Shoot, I think propagatedBuildOutputs are broken with __structuredAttrs | 11:08:29 |
SomeoneSerge (matrix works sometimes) | The hook loops over $propagatedBuildOutputs, but __structuredAttrs makes it into an array, so the first expression resolves to the value of the first element 🤡 | 11:09:03 |
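(The failure mode described above is a generic bash pitfall; a minimal sketch, with the variable name borrowed from the hook:)

```shell
# With __structuredAttrs, list-valued derivation attrs become bash arrays.
propagatedBuildOutputs=(out lib dev)

# Unsubscripted expansion of an array yields only element 0:
echo "$propagatedBuildOutputs"        # out
# Expanding the whole list requires [@]:
echo "${propagatedBuildOutputs[@]}"   # out lib dev
```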
SomeoneSerge (matrix works sometimes) | In reply to @search-sense:matrix.org I know, that it's broken ... actually it would be good if someone upgrade it to the current version TensorRT-10.1.0.27 Would you like to just take over tensorrt in Nixpkgs? | 11:28:58 |
SomeoneSerge (matrix works sometimes) | In reply to @ss:someonex.net Should work this time: https://github.com/NixOS/nixpkgs/pull/323056
Can I bump a nixpkgs-review? xD Yay | 12:07:56 |
SomeoneSerge (matrix works sometimes) | (attached image: clipboard.png) | 12:08:03 |
| Titus joined the room. | 12:52:05 |
matthewcroughan | In reply to @ss:someonex.net Would you like to just take over tensorrt in Nixpkgs? I wouldn't wish that on my worst enemy | 13:00:27 |
Titus | Hey! I just started using NixOS and I love it, but I have a MAJOR blocker: I maintain a FOSS deep learning package and can't get CUDA to work :( I would really love to continue on this journey and eventually contribute to this community, but right now it feels like I shot myself in the foot badly, as I've spent the last days exclusively configuring NixOS only to reach a point that seems insurmountable for me. The issue seems to be that PyTorch doesn't find the CUDA driver; what's also weird is that nvidia-smi seems to work fine, but shows CUDA Version: ERR!
The thing is that in order to work with my collaborators, I need to work in a non-NixOS way; in my case I would like to use pixi, which is very much like conda/micromamba, just better. Therefore, I'm trying to get things working in an FHS shell. Does one of you have an idea? Am I doing anything obviously wrong?
from my configuration.nix
hardware.opengl = {
  enable = true;
  driSupport = true;
  driSupport32Bit = true;
};
# Allow unfree packages
nixpkgs.config.allowUnfree = true;
services.xserver.videoDrivers = ["nvidia"];
hardware.nvidia = {
  modesetting.enable = true;
  powerManagement.enable = false;
  powerManagement.finegrained = false;
  open = false;
  package = config.boot.kernelPackages.nvidiaPackages.beta;
};
pixi-fhs.nix
{ pkgs, unstable }:
let
  cudatoolkit = pkgs.cudaPackages.cudatoolkit_12_1;
  nvidia_x11 = pkgs.nvidia_x11;
in
pkgs.buildFHSUserEnv {
  name = "pixi-env";
  targetPkgs = pkgs: with pkgs; [
    unstable.pixi
    cudatoolkit
    nvidia_x11
    # bashInteractive
    # bash-completion
    # complete-alias
  ];
  runScript = "bash";
  profile = ''
    export NVIDIA_DRIVER_CAPABILITIES=compute,utility
    export XDG_CONFIG_DIRS=${nvidia_x11}/share/X11/xorg.conf.d''${XDG_CONFIG_DIRS:+:}$XDG_CONFIG_DIRS
    export XDG_DATA_DIRS=${nvidia_x11}/share''${XDG_DATA_DIRS:+:}$XDG_DATA_DIRS
    export LD_LIBRARY_PATH=${cudatoolkit}/lib:${cudatoolkit}/lib64:${cudatoolkit}/lib64/stubs''${LD_LIBRARY_PATH:+:}$LD_LIBRARY_PATH
    export CUDA_PATH=${cudatoolkit}
    export PATH=${cudatoolkit}/bin:$PATH
    export LIBRARY_PATH=${cudatoolkit}/lib:${cudatoolkit}/lib64:$LIBRARY_PATH
    export CPLUS_INCLUDE_PATH="${cudatoolkit}/include''${CPLUS_INCLUDE_PATH:+:$CPLUS_INCLUDE_PATH}"
    export C_INCLUDE_PATH="${cudatoolkit}/include''${C_INCLUDE_PATH:+:$C_INCLUDE_PATH}"
    # Pixi completion -- not working yet, due to missing `complete` command
    eval "$(pixi completion --shell bash 2>/dev/null)"
    echo "*** Pixi environment activated, using $(which pixi). ***"
  '';
}
Thanks in advance <3
| 13:00:34 |
Titus | File "/home/titus/src/bnb/bitsandbytes/diagnostics/main.py", line 66, in main
sanity_check()
File "/home/titus/src/bnb/bitsandbytes/diagnostics/main.py", line 33, in sanity_check
p = torch.nn.Parameter(torch.rand(10, 10).cuda())
File "/home/titus/src/bnb/.pixi/envs/default/lib/python3.8/site-packages/torch/cuda/__init__.py", line 293, in _lazy_init
torch._C._cuda_init()
| 13:06:53 |
SomeoneSerge (matrix works sometimes) | pixi is one of the package managers that ship incomplete dependencies because they expect an FHS environment. If you want to use it on NixOS, I recommend nix-ld. You'll also need to ensure (using a shell, e.g.) that the ld.so is aware of ${addDriverRunpath.driverLink}/lib, which you can also do as part of the nix-ld configuration. E.g. you can deploy a NixOS with programs.nix-ld.enable and then, in your project tree, use a nix shell that looks something like the following: https://github.com/NixOS/nixpkgs/blob/48dbb2ae90be0ba21b44e77b8278fd7cefb4b75f/nixos/doc/manual/configuration/fhs.chapter.md?plain=1#L105-L113 | 13:09:09 |
SomeoneSerge (matrix works sometimes) | pkgs.buildFHSUserEnv {
  name = "pixi-env";
  targetPkgs = pkgs: with pkgs; [
    ...
    nvidia_x11
    ...
  ];
...this, if it had any effect on the dynamic loader (in this form it doesn't; instead it provides hints for the compiler), would conflict with the libcuda driver deployed by NixOS. NVidia makes it so that the driver has to be deployed impurely, because each libcuda only works with the corresponding kernel. TLDR: delete nvidia_x11 from that list | 13:12:14 |
SomeoneSerge (matrix works sometimes) | Also note that cudaPackages.cudatoolkit is a package for development; e.g. if pixi runs any builds (idk if it does) and you want it to use nixpkgs' cudatoolkit libraries instead of pixi libraries, that's when you include it in the shell | 13:13:27 |