!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

309 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
60 Servers



3 Oct 2024
@justbrowsing:matrix.org Kevin Mittman (UTC-7)
  • TRT 8.x depends on cuDNN 8.x (last release was 8.9.7)
  • TRT 10.x has optional support for cuDNN (not updated for 9.x)
  • The DL frameworks container image is more generic
17:09:43
4 Oct 2024
@connorbaker:matrix.org connor (burnt/out) (UTC-8) * I know that when packaging TRT (any version) for Nixpkgs, autopatchelf flags a dependency on cuDNN, so we need to link against it.
Does TRT 10.x not work with cuDNN 9.x at all, or is it not an officially supported combination?
onnxruntime (not an NVIDIA product but a large use case for TRT), for example, says for CUDA 11.8 to use TRT 10.x with cuDNN 8.9.x, and with CUDA 12.x to use TRT 10.x with cuDNN 9.x. The latter combination wasn’t in the support matrix, so I was surprised.
For the DL frameworks container, does that mean TRT comes without support for cuDNN since it’s not an 8.9.x release, that it’s not officially supported (per the TRT support matrix), or something else?
15:22:49
@connorbaker:matrix.org connor (burnt/out) (UTC-8) I don’t have an easy way to test all the code paths different libraries could use to call into TRT, or to know which parts are accelerated with cuDNN, but I am trying to make sure the latest working version of cuDNN is supplied to TRT in Nixpkgs :/ 15:24:20
@connorbaker:matrix.org connor (burnt/out) (UTC-8) Jetson, for example, doesn’t have a cuDNN 8.9.6 release anywhere I can find (tarball or Debian installer), but that’s the version TRT has in the support matrix for the platform, so I’ve been using 8.9.5 (which does have a tarball for Jetson). 15:25:44
5 Oct 2024
@glepage:matrix.org Gaétan Lepage Hey guys!
I noticed that the CUDA features of tinygrad were not working on non-NixOS Linux systems.
10:22:50
@glepage:matrix.org Gaétan Lepage More precisely, it can't find the libnvrtc.so lib. 10:23:41
@glepage:matrix.org Gaétan Lepage Do I need to run it using nixGL? 10:23:49
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @glepage:matrix.org
Do I need to run it using nixGL ?
No, just add ${getLib cuda_nvrtc}/lib to the search path
11:01:19
@ss:someonex.net SomeoneSerge (matrix works sometimes) I'm en route, will reply to stuff on Tuesday 11:02:56
@glepage:matrix.org Gaétan Lepage Ok, I will try that, thanks! 11:31:16
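Note on the exchange above: a minimal sketch of why the search path matters for a Python library that opens libnvrtc.so at runtime. It assumes a ctypes-style lookup; `find_nvrtc` and its `extra_dirs` parameter are hypothetical names used only for illustration, not tinygrad's or Nixpkgs' actual code.

```python
import ctypes.util
import os


def find_nvrtc(extra_dirs=()):
    """Return a path (or soname) for libnvrtc, or None if nothing is found.

    extra_dirs is a hypothetical parameter standing in for directories such
    as the lib/ directory of the cuda_nvrtc package.
    """
    # Directories from LD_LIBRARY_PATH, skipping empty entries.
    env_dirs = [
        d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d
    ]
    # Check explicit directories first, then the environment's search path.
    for directory in list(extra_dirs) + env_dirs:
        candidate = os.path.join(directory, "libnvrtc.so")
        if os.path.isfile(candidate):
            return candidate
    # Fall back to the platform's default lookup; returns None when not found.
    return ctypes.util.find_library("nvrtc")
```

If the directory that `${getLib cuda_nvrtc}/lib` expands to is passed in `extra_dirs` or listed in `LD_LIBRARY_PATH`, the lookup can succeed without nixGL; the result could then be handed to `ctypes.CDLL`.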
7 Oct 2024
@ironbound:hackerspace.pl left the room. 13:26:32
@glepage:matrix.org Gaétan Lepage

Hi,
I think that tinygrad is missing some libraries because I can get it to crash at runtime with:

Error processing prompt: Nvrtc Error 6, NVRTC_ERROR_COMPILATION
<null>(3): catastrophic error: cannot open source file "cuda_fp16.h"
  #include <cuda_fp16.h>
20:24:15
@glepage:matrix.org Gaétan Lepage Currently, we already patch the path to libnvrtc.so and libcuda.so, but maybe we should make the headers available too. 20:26:24
@aidalgol:matrix.org aidalgol What is it doing such that a missing header is a runtime error? 20:44:05
@glepage:matrix.org Gaétan Lepage I think that tinygrad is compiling CUDA kernels at runtime. 20:46:22
@glepage:matrix.org Gaétan Lepage That's why this missing header causes a crash when using the library.
tinygrad is entirely written in Python and is thus itself not compiled at all.
20:46:55
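Note on the error above: NVRTC compiles kernel source handed to it while the program runs, so `#include <cuda_fp16.h>` is resolved at that point, against include directories passed as compile options. The sketch below builds such options with NVRTC's `-I` flag. The helper name `nvrtc_include_options` and the idea of probing candidate directories are assumptions for illustration, not tinygrad's actual code.

```python
import os


def nvrtc_include_options(candidate_dirs, header="cuda_fp16.h"):
    """Build NVRTC '-I<dir>' options for each directory that contains `header`.

    candidate_dirs is a hypothetical list of places the CUDA headers might
    live, for example the include/ directory of a CUDA package.
    """
    options = []
    for directory in candidate_dirs:
        # Only keep directories where the header is actually present.
        if os.path.isfile(os.path.join(directory, header)):
            options.append(f"-I{directory}")
    return options
```

An empty result means no candidate directory provides the header, which is the situation where a runtime compile would fail with "cannot open source file".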



Room Version: 9