8 Jul 2024 |
SomeoneSerge (utc+3) |
…CUDA_PATH, but it doesn't seem to help. http://numba.pydata.org/numba-doc/latest/cuda/overview.html#setting-cuda-installation-path talks…
Hmm, are those used at build time or at runtime for the JIT? | 16:05:39 |
SomeoneSerge (utc+3) | In reply to @jeroenvb3:matrix.org
Thank you very much. The previously mentioned file does indeed run successfully now. I did want to enable cudatoolkit, I'm pretty sure. This is what I am now trying to get running:
from numba import cuda
import numpy as np

@cuda.jit
def cudakernel0(array):
    for i in range(array.size):
        array[i] += 0.5

array = np.array([0, 1], np.float32)
print('Initial array:', array)
print('Kernel launch: cudakernel0[1, 1](array)')
cudakernel0[1, 1](array)
print('Updated array:', array)
Which has this as the first error:
Initial array: [0. 1.]
Kernel launch: cudakernel0[1, 1](array)
/nix/store/7m7c6crkdbzmzcrbwa4l4jqgnwj8m92b-python3.9-numba-0.59.1/lib/python3.9/site-packages/numba/cuda/dispatcher.py:536: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
Traceback (most recent call last):
  File "/nix/store/7m7c6crkdbzmzcrbwa4l4jqgnwj8m92b-python3.9-numba-0.59.1/lib/python3.9/site-packages/numba/cuda/cudadrv/nvvm.py", line 139, in __new__
    inst.driver = open_cudalib('nvvm')
  File "/nix/store/7m7c6crkdbzmzcrbwa4l4jqgnwj8m92b-python3.9-numba-0.59.1/lib/python3.9/site-packages/numba/cuda/cudadrv/libs.py", line 64, in open_cudalib
    return ctypes.CDLL(path)
  File "/nix/store/2j0l3b15gas78h9akrsfyx79q02i46hc-python3-3.9.19/lib/python3.9/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvvm.so: cannot open shared object file: No such file or directory
However, I do have the related env vars set:
[nix-shell:/tmp/cuda]$ echo $NUMBAPRO_NVVM
/nix/store/0c8nf26hx9x9jxgj0s9bq10xg75nbfv0-cuda-merged-12.2/nvvm/lib64/libnvvm.so
[nix-shell:/tmp/cuda]$ echo $NUMBAPRO_LIBDEVICE
/nix/store/0c8nf26hx9x9jxgj0s9bq10xg75nbfv0-cuda-merged-12.2/nvvm/libdevice
A Stack Overflow answer says those are outdated and that CUDA_HOME needs to be set instead. I set it to the same value as CUDA_PATH, but it doesn't seem to help. http://numba.pydata.org/numba-doc/latest/cuda/overview.html#setting-cuda-installation-path talks about ignoring minor version paths, but I don't think that applies when I set it directly. I also don't have any non-minor-version paths in /nix/store/
This is now my shell.nix:
with import <nixpkgs> {
  config = {
    allowUnfree = true;
    cudaSupport = true;
  };
};
pkgs.mkShell {
  name = "cuda-env-shell";
  buildInputs = [
    git gitRepo gnupg autoconf curl
    procps gnumake util-linux m4 gperf unzip
    libGLU libGL
    xorg.libXi xorg.libXmu freeglut
    xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
    ncurses5 stdenv.cc binutils
    python39
    python39Packages.numpy
    python39Packages.numba
    libstdcxx5
    cudaPackages.cudatoolkit
  ];
  shellHook = ''
    export CUDA_PATH=${pkgs.cudaPackages.cudatoolkit}
    export CUDA_HOME=${pkgs.cudatoolkit}
    export EXTRA_CCFLAGS="-I/usr/include"
    export NUMBAPRO_NVVM=${pkgs.cudatoolkit}/nvvm/lib64/libnvvm.so
    export NUMBAPRO_LIBDEVICE=${pkgs.cudatoolkit}/nvvm/libdevice
  '';
}
I'm sorry if it's a lot to ask, but I would really like to learn about this and get it working. Do you see anything wrong with my stuff still?
AFAIU it dlopen()s the soname rather than an absolute path, which is probably good for us | 16:06:41 |
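A minimal sketch of what the traceback above suggests numba's open_cudalib('nvvm') boils down to (the function name is taken from the traceback; the one-line body here is an assumption for illustration): it dlopen()s the bare soname, so the dynamic loader consults its usual search path, including LD_LIBRARY_PATH, and never looks at an absolute path stored in an env var like NUMBAPRO_NVVM.

```python
import ctypes

# Sketch of numba's open_cudalib (per the traceback): dlopen by bare
# soname, so resolution is entirely up to the dynamic loader's search path.
def open_cudalib(libname):
    return ctypes.CDLL(f"lib{libname}.so")

try:
    open_cudalib("nvvm")
    print("libnvvm.so: loaded")
except OSError as exc:
    # On a machine where the loader can't find libnvvm.so this is exactly
    # the "cannot open shared object file" error from the traceback.
    print("libnvvm.so not found:", exc)
```

This is why pointing LD_LIBRARY_PATH at the directory containing libnvvm.so can fix the error even when NUMBAPRO_NVVM is already set.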
SomeoneSerge (utc+3) | You could try exposing libnvvm in LD_LIBRARY_PATH | 16:06:49 |
SomeoneSerge (utc+3) | It would be valuable if we managed to prepare runtime tests for numba/numba+cuda | 16:07:51 |
SomeoneSerge (utc+3) |
Do you see anything wrong with my stuff still?
I'm still unsure if all of the buildInputs are actually required for whatever your task is. Aside from that, it's probably best that you create a Python wrapper (python3.withPackages (ps: with ps; [ numba ... ])) rather than rely on mkShell or whatever | 16:09:39 |
SomeoneSerge (utc+3) | If you build cudaPackages.cudatoolkit you'll see that it's actually a symlink farm which includes a lot of stuff you don't need. In particular:
❯ readlink -f result/nvvm/lib64/libnvvm.so.4.0.0
/nix/store/fby2d6b4jgfb8awwjhzdrd13r8vx7ilw-cuda_nvcc-12.2.140-bin/nvvm/lib64/libnvvm.so.4.0.0
...it would've been enough for you to use ${getBin cudaPackages.cuda_nvcc}/nvvm/lib64/libnvvm.so, which is much smaller (whoops, that's a bug; we definitely didn't want this to end up in .bin, so expect this to change) | 16:12:09 |
9 Jul 2024 |
jeroenvb3 | I would think runtime for the JIT, since there is a @cuda.jit decorator on my Python code. This is now my shell.nix; I removed the extra buildInputs and used the wrapper, which had to be python39 to even get back to where I was:
with import <nixpkgs> {
  config = {
    allowUnfree = true;
    cudaSupport = true;
  };
};
let
  pythonEnv = pkgs.python39.withPackages (ps: with ps; [ numba numpy ]);
in
pkgs.mkShell {
  name = "cuda-env-shell";
  buildInputs = [
    pythonEnv
    libstdcxx5
    cudaPackages.cudatoolkit
  ];
  shellHook = ''
    export CUDA_PATH=${pkgs.cudaPackages.cudatoolkit}
    export CUDA_HOME=${pkgs.cudatoolkit}
    export EXTRA_CCFLAGS="-I/usr/include"
    export NUMBAPRO_NVVM=${pkgs.cudatoolkit}/nvvm/lib64/libnvvm.so
    export NUMBAPRO_LIBDEVICE=${pkgs.cudatoolkit}/nvvm/libdevice
    export LD_LIBRARY_PATH=${pkgs.cudatoolkit}/nvvm/lib64/libnvvm.so:$CUDA_HOME
    # source venv/bin/activate
    python test2.py
  '';
}
If I go into the Python REPL I can import numpy and numba, but I still get the same error when running an actual CUDA script. This was the Python code:
from numba import cuda
import numpy as np

@cuda.jit
def cudakernel0(array):
    for i in range(array.size):
        array[i] += 0.5

array = np.array([0, 1], np.float32)
print('Initial array:', array)
print('Kernel launch: cudakernel0[1, 1](array)')
cudakernel0[1, 1](array)
print('Updated array:', array)
I can run:
from numba import cuda
print(cuda.detect())
Also I doubt the linking of nvvm.so helps, since numba now relies on the CUDA_HOME env var (https://numba.pydata.org/numba-doc/latest/cuda/overview.html). What do you mean by using ${getBin...}? Where could I use that?
| 10:05:31 |
SomeoneSerge (utc+3) |
export LD_LIBRARY_PATH=${pkgs.cudatoolkit}/nvvm/lib64/libnvvm.so:$CUDA_HOME
I'm pretty sure glibc treats LD_LIBRARY_PATH as a list of directories only (so you'd want LD_LIBRARY_PATH=${pkgs.cudatoolkit}/nvvm/lib64/ instead) | 10:42:21 |
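To illustrate the point above with a sketch (the store paths here are hypothetical): the loader splits LD_LIBRARY_PATH on ':' and treats each entry as a directory to join with the requested soname, so an entry that is itself a .so file can never produce a valid candidate.

```python
import os

# Hypothetical LD_LIBRARY_PATH with a file entry (wrong) and a directory
# entry (right), mirroring the shellHook line being discussed.
ld_library_path = (
    "/nix/store/example-cudatoolkit/nvvm/lib64/libnvvm.so"
    ":/nix/store/example-cudatoolkit/nvvm/lib64"
)

# The loader's search, roughly: join each entry with the soname.
for entry in ld_library_path.split(":"):
    print(os.path.join(entry, "libnvvm.so"))
# The first entry yields ".../libnvvm.so/libnvvm.so", a path that cannot
# exist; only the directory entry produces a plausible candidate.
```

Hence the fix: put the directory `.../nvvm/lib64` on LD_LIBRARY_PATH, not the library file itself.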
SomeoneSerge (utc+3) |
Also I doubt the linking of nvvm.so helps, since numba is now relying on the CUDA_HOME env var
The error you posted earlier suggests that they just dlopen("libnvvm.so", ...) | 10:42:57 |
jeroenvb3 | In reply to @ss:someonex.net
export LD_LIBRARY_PATH=${pkgs.cudatoolkit}/nvvm/lib64/libnvvm.so:$CUDA_HOME
I'm pretty sure glibc treats LD_LIBRARY_PATH as a list of directories only (so you'd want LD_LIBRARY_PATH=${pkgs.cudatoolkit}/nvvm/lib64/ instead) That was true, yes; it seems it does find it now. However it claims:
No supported GPU compute capabilities found. Please check your cudatoolkit version matches your CUDA version
nvcc --version gives 12.2, and nvidia-smi gives 12.4. This should be compatible as far as I'm aware. I'll see if I can get more info on this, unless you know that this is not supposed to be compatible and I need to adjust one of the versions.
| 12:26:13 |
SomeoneSerge (utc+3) | In reply to @jeroenvb3:matrix.org
That was true yes, it seems it does find it now. However it claims:
No supported GPU compute capabilities found. Please check your cudatoolkit version matches your CUDA version
nvcc --version gives 12.2, and nvidia-smi gives 12.4. This should be compatible as far as I'm aware. I'll see if I can get more info on this, unless you know that this is not supposed to be compatible and I need to adjust one of the versions.
Could you gist a reproducing code and a full log? | 12:30:02 |
10 Jul 2024 |
jeroenvb3 | In reply to @ss:someonex.net Could you gist a reproducing code and a full log? https://gitlab.com/jeroenvb3/cuda-setup
I added the nix config, the python code, the output, and my nvidia driver info. If you need more system info please tell but I'd think this is all that will be unique. Thanks!
| 10:23:54 |
1 Aug 2024 |
Bruno Rodrigues | Hello, we could use some testers on NixOS that have an Nvidia GPU.
Would really appreciate it if you could review this PR: https://github.com/NixOS/nixpkgs/pull/328980
especially following the instructions in the documentation: https://github.com/NixOS/nixpkgs/blob/7c2dcd41c35833809f05108c75ca0a6e3df242ce/doc/languages-frameworks/r.section.md#torch-with-gpu-acceleration-torch-gpu-accel
The question is: do we need nixglhost even on NixOS? Some first testing seems to indicate that yes. | 19:16:35 |
Bruno Rodrigues | or if you have a Mac you could help get and test the right binary for Darwin | 19:20:41 |
3 Aug 2024 |
SomeoneSerge (utc+3) | In reply to @brodriguesco:matrix.org
hello , we could use some testers on NixOS that have an nvidia GPU
would really appreciate if you could review this PR: https://github.com/NixOS/nixpkgs/pull/328980
especially following the instructions in the documentation: https://github.com/NixOS/nixpkgs/blob/7c2dcd41c35833809f05108c75ca0a6e3df242ce/doc/languages-frameworks/r.section.md#torch-with-gpu-acceleration-torch-gpu-accel
question is, do we need nixglhost even on NixOS? some first testing seems to indicate that yes
I'll try to run some tests on Sunday | 09:53:37 |