| 26 Nov 2022 |
hexa | can someone review this? | 16:56:59 |
hexa | working on torchaudio and it would be neat to have this | 16:57:15 |
| ahmed changed their display name from rh to ahmed. | 19:19:35 |
| 29 Nov 2022 |
tpw_rules | hexa: running a nixpkgs-review cycle including CUDA stuff. planning to try the test suite too. expect something by tomorrow evening but i don't expect any major issues | 05:05:55 |
tpw_rules | thank you for your patience | 05:06:53 |
Samuel Ainsworth | draft CUDA 11.8 upgrade: https://github.com/NixOS/nixpkgs/pull/203658 | 20:49:35 |
Samuel Ainsworth | for some reason jaxlib and tensorflow are not building with 11.8... anyone have any ideas? | 20:49:56 |
tpw_rules | i thought Someone S had a draft too and he got similar errors | 20:58:37 |
Samuel Ainsworth | oh that's entirely possible... i was not aware | 23:50:55 |
Samuel Ainsworth | yup, you're totally right: https://github.com/NixOS/nixpkgs/pull/200435 | 23:51:40 |
Samuel Ainsworth | i'm really confused why we're seeing these errors... they seem to indicate that the directory structure changed between 11.7 -> 11.8 | 23:52:23 |
| 1 Dec 2022 |
@box1:matrix.org | I'm trying to package dgl-cu116 (dgl with CUDA support) and it fails to find an rpath for libtorch_cuda_cpp.so and libtorch_cuda_cu.so.
After some searching, those files are generated under torch when it is built with BUILD_SPLIT_CUDA=ON or BUILD_SPLIT_CUDA=1 (https://discuss.pytorch.org/t/no-libtorch-cuda-cpp-so-available-when-build-pytorch-from-source/159864). That thread says BUILD_SPLIT_CUDA is not the default because:
"there may be other side effects (like increased binary size) that users might not be expecting, and it's only when we are compiling for many architectures where we run into these linker issues."
Currently, [torch](https://github.com/NixOS/nixpkgs/blob/nixos-22.11/pkgs/development/python-modules/torch/default.nix) doesn't have an option for it. An option like mklDnnSupport would be great, so that it can be turned on for packages like dgl-cu116 that need those files. Any thoughts on this?
| 11:44:06 |
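A minimal, untested sketch of what is being asked for above, assuming torch's build picks up `BUILD_SPLIT_CUDA` from the environment in the same way the existing `USE_MKLDNN` setting is passed. The overlay shape is illustrative only, not an existing nixpkgs option, and it assumes CUDA support is already enabled for torch elsewhere:

```nix
# Hypothetical overlay: rebuild torch with split CUDA libraries so that
# libtorch_cuda_cpp.so and libtorch_cuda_cu.so are produced.
final: prev: {
  python3 = prev.python3.override {
    packageOverrides = pyFinal: pyPrev: {
      torch = pyPrev.torch.overridePythonAttrs (old: {
        # Extra attributes become environment variables in the build.
        # Assumption: the packaged torch version still honours this flag.
        BUILD_SPLIT_CUDA = "ON";
      });
    };
  };
}
```

A package such as the dgl build mentioned above could then depend on this overridden torch; whether the larger binaries are acceptable is the trade-off noted in the linked thread.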
| hexa changed their display name from hexa to hexa (22.11 now). | 13:09:03 |
| hexa changed their display name from hexa (22.11 now) to hexa. | 14:38:53 |
danielrf | Hi, I have some recent work that might be of interest to the Nix CUDA community: jetpack-nixos (https://github.com/anduril/jetpack-nixos)
See also this announcement post on the discourse: https://discourse.nixos.org/t/jetpack-nixos-nixos-module-for-nvidia-jetson-devices/23632 | 19:50:11 |
danielrf | The CUDA version included with jetpack is apparently not the same as just the aarch64 CUDA for servers, but I've tried to repackage the debs from NVIDIA in a way similar to cudaPackages in nixpkgs: https://github.com/anduril/jetpack-nixos/blob/master/cuda-packages.nix | 19:50:23 |