!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

326 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
64 Servers

Sender, Message, Time
24 Feb 2023
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @ss:someonex.net
connor (he/him): is there currently an open issue tracking this stdenv/compiler compatibility problem specifically?
Maybe, rather than fixing cudaPackages.cudatoolkit.cc in the non-redist cudatoolkit's versions.toml, we should set a cudaPackages-wide default stdenv (e.g. cudaPackages.stdenv = gcc11Stdenv for pre-CUDA-12 releases)? It seems that downstream packages do have to use that stdenv if they build any CUDA kernels.
20:43:23
@ss:someonex.net SomeoneSerge (matrix works sometimes)

RE: opencv in BSRT as well as tensorflow and jax

Is there a chance we are misinterpreting the "wrong glibc" errors, given that these packages use the auto-patchelf-ed non-redist cudatoolkit?

20:46:07
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @mcwitt:matrix.org

Is there an issue with cudaPackages since the gcc version bump to 12? I'd expect the following to work:

let pkgs = import ./. { };
in pkgs.runCommandCC "test" { buildInputs = with pkgs.cudaPackages; [ cuda_nvcc cuda_cudart ]; } ''
  nvcc ${pkgs.writeText "test.cu" "int main() { return 0; }"} -o $out
''

but on master I see

error -- unsupported GNU version! gcc versions later than 11 are not supported!

(might be missing something because I'm not immediately finding workarounds that were necessary for other CUDA packages since gcc was bumped)

Just overriding the stdenv with gcc11Stdenv succeeds (the same applies, e.g., to the faiss attribute in nixpkgs)
20:47:22
@connorbaker:matrix.org connor (he/him)
A standard environment for CUDA would be really nice, given that NVCC always has version constraints on the compiler.
ALTERNATIVELY, if we didn't want to change anything else, we could add the NVCC flag --allow-unsupported-compiler (or something similar, I don't remember) and just build with whatever.
20:49:12
@mcwitt:matrix.org mcwitt
Just to close the loop: the fix in my case was to set cmakeFlags = [ "-DCMAKE_CUDA_HOST_COMPILER=${cudaPackages.cudatoolkit.cc}/bin/cc" ] (I eventually found many examples of this in nixpkgs). Thanks for pointing me in the right direction!
21:34:32
@connorbaker:matrix.org connor (he/him)
I made a helper for nixpkgs-review workflows! After you redirect all the output to a file, the script takes all of the failing derivations, makes a gist for each build log, and posts a little markdown table in a comment on the PR you were reviewing.
Script here: https://gist.github.com/ConnorBaker/b32a7f69d318e3f338b6b4fedeef37ef
Example comment here: https://github.com/NixOS/nixpkgs/pull/218035#issuecomment-1444682137
23:26:47
@connorbaker:matrix.org connor (he/him)
Although all these tools print color codes even when the output goes to a file, so there are escape characters in the logs :(
23:28:54
25 Feb 2023
@ss:someonex.net SomeoneSerge (matrix works sometimes)
Damn... I forgot again: how do I make a command run after autoPatchelfHook?
11:32:03
@ss:someonex.net SomeoneSerge (matrix works sometimes)
Appending to postFixup doesn't seem to do the trick.
11:32:19
@ss:someonex.net SomeoneSerge (matrix works sometimes)
~~Is there a reason we use glob in auto-patchelf.py? It seems it skips hidden files, including files renamed by wrapProgram~~ Nope, it doesn't skip anything, so idk why the renamed file doesn't get patched
11:42:15
@ss:someonex.net SomeoneSerge (matrix works sometimes)

RE: --allow-unsupported-compiler

OK, I tried building faiss with that flag on and got some gcc errors: /nix/store/9pgq84sf921xh97gjj2wh7a7clrcrh4m-gcc-12.2.0/include/c++/12.2.0/bits/random.h(104): error: expected a declaration

11:54:36
@ss:someonex.net SomeoneSerge (matrix works sometimes)
I wouldn't go any further into that; I think we should just default to building downstream packages with the gcc version dictated by nvcc.
11:56:02
@ss:someonex.net SomeoneSerge (matrix works sometimes)
This is a bit of a pickle, because downstream expressions expect stdenv as an argument, and whenever we set cudaSupport = true we should override it.
11:58:21
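@ss:someonex.net SomeoneSerge (matrix works sometimes)

So, the ugly and straightforward version could look like this:

{ config
, stdenv
# ... other arguments elided
, cudaSupport ? config.cudaSupport or false
, cudaPackages
}:

# cudaPackages.stdenv is the attribute proposed above, pinned to a gcc that nvcc accepts
(if cudaSupport then cudaPackages.stdenv else stdenv).mkDerivation {
  # ... the usual derivation attributes
}
12:02:22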
@ss:someonex.net SomeoneSerge (matrix works sometimes)
And I'm pretty sure nobody in nixpkgs would want to do that just because of CUDA.
12:03:19
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @mcwitt:matrix.org
Just to close the loop, the fix in my case was to set cmakeFlags = [ "-DCMAKE_CUDA_HOST_COMPILER=${cudaPackages.cudatoolkit.cc}/bin/cc" ] (and eventually found many examples of this in nixpkgs). Thanks for pointing me in the right direction!
Hm, I should try that. For whatever reason, I see that CUDA_HOST_COMPILER is set, not CMAKE_CUDA_HOST_COMPILER.
12:06:03
@ss:someonex.net SomeoneSerge (matrix works sometimes)
btw, looking at what the CMake reference says, it seems this variable should point to nvcc 🤔
12:16:01
@connorbaker:matrix.org connor (he/him)
For what it's worth, some CMake projects don't respect those arguments (CMake will also print, at the end of the configure phase, which arguments were not used).
I've had better luck setting CUDAHOSTCXX as an environment variable, because it's one CMake looks at specifically, unless the CMakeLists.txt is written in such a way as to prohibit it: https://cmake.org/cmake/help/latest/envvar/CUDAHOSTCXX.html?highlight=cudahostcxx
12:20:32
@ss:someonex.net SomeoneSerge (matrix works sometimes)
Yea, many projects haven't migrated to FindCUDAToolkit yet
12:21:20
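The two host-compiler approaches from this exchange (the CMAKE_CUDA_HOST_COMPILER flag and the CUDAHOSTCXX environment variable) could be sketched together roughly as follows. This is only an illustration under assumptions: the package name, version, and src are hypothetical placeholders, and it assumes that cudaPackages.cudatoolkit.cc exposes bin/c++ next to the bin/cc used above. A real package would normally need only one of the two settings.

```nix
{ stdenv, cmake, cudaPackages }:

stdenv.mkDerivation {
  pname = "example-cuda-pkg"; # hypothetical package, for illustration only
  version = "0.0.1";          # hypothetical
  src = ./src;                # hypothetical source tree with a CMakeLists.txt

  nativeBuildInputs = [ cmake cudaPackages.cuda_nvcc ];
  buildInputs = [ cudaPackages.cuda_cudart ];

  # Approach 1: tell CMake which host compiler nvcc should use.
  # Projects that ignore this variable will report it as unused
  # at the end of the configure phase.
  cmakeFlags = [
    "-DCMAKE_CUDA_HOST_COMPILER=${cudaPackages.cudatoolkit.cc}/bin/cc"
  ];

  # Approach 2: export CUDAHOSTCXX, which CMake reads from the environment.
  # Assumption: the compiler wrapper provides bin/c++.
  preConfigure = ''
    export CUDAHOSTCXX=${cudaPackages.cudatoolkit.cc}/bin/c++
  '';
}
```

Either way, the point is that nvcc gets a host compiler from the toolkit's supported range instead of the default gcc 12 from stdenv.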
@connorbaker:matrix.org connor (he/him)

Three more things that popped into my head (sorry, I am actively consuming coffee):

  1. When we do override the C/C++ compilers by setting the CC/CXX environment variables, that doesn't change binutils, so (in my case) I still see ar/ranlib/ld and friends from gcc12 being used. Is that a problem? I don't know if version bumps to those tools can cause as much damage as libraries compiled with different language standards.
  2. If a package needs to link against libcuda.so specifically, what's the best way to make the linker aware of those stubs? I set LIBRARY_PATH and that seemed to do the trick: https://github.com/NixOS/nixpkgs/pull/218166/files#diff-ab3fb67b115c350953951c7c5aa868e8dd9694460710d2a99b845e7704ce0cf5R76
  3. Is it better to set environment variables as env.BLAG = "blarg" (I saw a tree-wide change about using env because of "structuredAttrs") in the derivation or to export them in the shell, in something like preConfigure?
12:26:25
