| 16 Sep 2022 |
Suwon Park | Hello everyone! | 17:33:48 |
Suwon Park | Is there anyone who has tried building pytorch with cuda enabled? | 17:34:56 |
Suwon Park | When I tried to build pytorch, this error occurred in the cmake phase:
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "11.6")
| 17:37:25 |
Suwon Park | So I looked up the source of pytorch in nixpkgs and there was no cudaPackages.cuda_cudart in buildInputs. | 17:38:18 |
Suwon Park | Adding cudaPackages.cuda_cudart seems to fix the error, but is there a reason why it is not there? | 17:39:17 |
SomeoneSerge (back on matrix) | python3Packages.pytorch currently still uses the older cudaPackages.cudatoolkit expression, which ships a lot of stuff, including cuda_cudart | 17:39:58 |
SomeoneSerge (back on matrix) | cuda_* expressions are preferred | 17:40:20 |
SomeoneSerge (back on matrix) | I didn't catch this though. Are you getting this when manually building pytorch, or when running nix build? | 17:41:52 |
Suwon Park | I'm running nix develop and then genericBuild, with a flake.nix! | 17:49:47 |
Suwon Park | But if you check cudaPackages.cudatoolkit in the 22.05 version of nixpkgs, the following code already exists:
# Move some libraries to the lib output so that programs that
# depend on them don't pull in this entire monstrosity.
mkdir -p $lib/lib
mv -v $out/lib64/libcudart* $lib/lib/
which means that the cudaPackages.cudatoolkit expression does not ship with cudart!
| 17:56:20 |
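The effect of those two commands can be reproduced outside Nix. The sketch below is only a stand-in: `work`, `out`, and `lib` are temporary directories invented for illustration, and the file names are placeholders rather than the real toolkit contents. It shows that after the move, the `lib64` directory of the main output no longer contains any `libcudart*` file, while the separate `lib` output does.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical stand-ins for the two outputs of a split package.
work="$(mktemp -d)"
out="$work/out"
lib="$work/lib"

# Fake install step: everything lands in $out/lib64 first.
mkdir -p "$out/lib64"
touch "$out/lib64/libcudart.so" "$out/lib64/libcudart.so.11.0" "$out/lib64/libcublas.so"

# The same two commands quoted in the message above.
mkdir -p "$lib/lib"
mv -v "$out"/lib64/libcudart* "$lib/lib/"

# Count where the libcudart files ended up (tr strips padding some wc builds add).
cudart_in_out="$(find "$out" -name 'libcudart*' | wc -l | tr -d ' ')"
cudart_in_lib="$(find "$lib" -name 'libcudart*' | wc -l | tr -d ' ')"
echo "libcudart files in out: $cudart_in_out"
echo "libcudart files in lib: $cudart_in_lib"
```

Anything that searches only the main output for `libcudart.so` would come up empty in this layout, which matches the kind of failure described above, assuming the search really is limited to that one prefix.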
Suwon Park | If you unpack python39Packages.pytorch (current version: 1.11.0) and go to source/cmake/Modules_CUDA_fix/upstream/FindCUDA.cmake, line 1128, there is the following code block, which produces the -- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "11.6") error:
find_package_handle_standard_args(CUDA
  REQUIRED_VARS
    CUDA_TOOLKIT_ROOT_DIR
    CUDA_NVCC_EXECUTABLE
    CUDA_INCLUDE_DIRS
    ${CUDA_CUDART_LIBRARY_VAR}
  VERSION_VAR
    CUDA_VERSION
)
If I understood the code correctly, this block fails because ${CUDA_CUDART_LIBRARY_VAR} ends up looking for libcudart.so inside cudaPackages.cudatoolkit, which no longer has libcudart* because of the following code I just mentioned:
# Move some libraries to the lib output so that programs that
# depend on them don't pull in this entire monstrosity.
mkdir -p $lib/lib
mv -v $out/lib64/libcudart* $lib/lib/
Am I right..?🤔 But in the GitHub history, it seems there was no problem building the package without cuda_cudart, which means that I'm probably doing something wrong or unnecessary.
| 18:21:50 |
SomeoneSerge (back on matrix) | The pytorch derivation uses symlinkJoin, which includes the contents of cudatoolkit.out and cudatoolkit.lib | 20:54:46 |
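Conceptually, symlinkJoin builds one directory tree whose entries are symlinks into several input paths. The following is a rough shell approximation of that idea, not the nixpkgs implementation: the directories `out`, `lib`, and `joined` and all file names are made up for illustration. It sketches why a build pointed at the joined tree could find `libcudart.so` even though the main output alone does not contain it.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical stand-ins for cudatoolkit.out, cudatoolkit.lib and the joined tree.
work="$(mktemp -d)"
out="$work/out"
lib="$work/lib"
joined="$work/joined"

# Placeholder contents: compiler and headers in one output, runtime library in the other.
mkdir -p "$out/bin" "$out/include" "$lib/lib"
touch "$out/bin/nvcc" "$out/include/cuda_runtime.h" "$lib/lib/libcudart.so"

# Merge both roots: recreate the directory structure, then symlink every file.
# This simple loop assumes the roots have no conflicting file names.
mkdir -p "$joined"
for root in "$out" "$lib"; do
  (cd "$root" && find . -type d) | while read -r d; do
    mkdir -p "$joined/$d"
  done
  (cd "$root" && find . -type f) | while read -r f; do
    ln -s "$root/$f" "$joined/$f"
  done
done

# The joined tree now exposes files from both outputs under a single prefix.
ls -l "$joined/bin" "$joined/include" "$joined/lib"
```

With a layout like this, a single root directory contains `bin/nvcc`, the headers, and `lib/libcudart.so` together, which is the shape a CMake find module that takes one toolkit root would expect.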
Suwon Park | Someone S: Aha! Let me try some modifications! Thank you! | 21:10:06 |
SomeoneSerge (back on matrix) | cudatoolkit.{out,lib} bring in a lot (4-5GiB) of luggage; if you'd like to get rid of it later, maybe you could start with https://github.com/NixOS/nixpkgs/blob/befe56a1ee1d383fafaf9db41e3f4fc506578da1/pkgs/development/python-modules/pytorch/default.nix#L57 | 21:14:47 |