NixOS CUDA - Public Room Timeline

	NixOS CUDA	290 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	57 Servers


Sender	Message	Time
26 Sep 2024
connor (he/him)	In reply to @glepage:matrix.org ("https://github.com/triton-lang/triton/issues/3535"): Well, that’s an infuriating read	16:33:18
Gaétan Lepage	It's OK, OpenAI is just a small startup with only a few people. And deep learning is not even their main activity	17:07:38
connor (he/him)	Yeah and they're ~~definitely not a for-profit organization~~	17:20:14
@adam:robins.wtf	"open" is in their name	17:24:26
nim65s	it's such a joke that I find it sad it was not opened one day earlier	17:28:20
Gaétan Lepage	"I propose a 200€ bounty for this PR. Please `git tag` the freaking commit."	21:09:07
Gaétan Lepage	The ease of spinning up a release is a decreasing function of the project/company resources.	21:09:40
nim65s	same issue on a one-man project abandoned for the last year or so: https://github.com/bab2min/EigenRand/issues/56 : <48h	21:49:56
28 Sep 2024
	shekhinah changed their profile picture.	07:04:58
	kaya 𖤐 changed their profile picture.	16:55:46
1 Oct 2024
	-_o joined the room.	21:00:15
2 Oct 2024
hexa	Gaétan Lepage: please take care of tensordict	00:25:19
hexa	[image: image.png]	00:25:22
Gaétan Lepage	Sure, I will have a look right now. I have not faced any failure on my end, weird...	06:21:33
Gaétan Lepage	Is this on staging?	06:23:26
Gaétan Lepage	All failures that I was able to find on hydra are timeouts or upstream dependency failures. I was able to build `tensordict` on all architectures...	07:05:50
hexa	this is on trunk	11:03:39
hexa	then you probably need to increase meta.timeout	11:04:00
Gaétan Lepage	Now that you say it, I remember this package getting stuck (indefinitely) during mass rebuilds. I don't know if increasing the timeout will help. When everything works fine, it builds in ~1min... Also, nothing has changed in the derivation for the past few months.	11:47:12
Kevin Mittman (UTC-8)	Back from vacation	18:23:19
Kevin Mittman (UTC-8)	In reply to @ss:someonex.net ("Kevin Mittman Hi! Do you know how dcgm uses cuda and why it has to link several versions? See `libdcgm_cublas_proxy${cudaMajor}.so`")	18:34:06
Kevin Mittman (UTC-8)	In reply to @connorbaker:matrix.org ("Kevin Mittman: does NVIDIA happen to have JSON (or otherwise structured) versions of their dependency constraints for packages somewhere, or are the tables on the docs for each respective package the only source? I'm working on update scripts and I'd like to avoid the manual stage of 'go look on the website, find the table (it may have moved), and encode the contents as a Nix expression'"): Not really ... wishlist for future. Which product / component is this?	18:46:04
Kevin Mittman (UTC-8)	SomeoneSerge (utc+3): seems like reply got stuck in a thread	18:46:54
3 Oct 2024
connor (he/him)	In reply to @justbrowsing:matrix.org ("Not really ... wishlist for future. Which product / component is this?"): That particular request was born out of frustration with TensorRT. Any idea why the support matrix for TensorRT says only cuDNN 8.9.7 is supported (https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html) but the 24.09 container is shipping it with cuDNN 9.4 (https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)?	05:21:35
Kevin Mittman (UTC-8)	TRT 8.x depends on cuDNN 8.x (last release was 8.9.7). TRT 10.x has optional support for cuDNN (not updated for 9.x). The DL frameworks container image is more generic.	17:09:43
4 Oct 2024
connor (he/him)	I know that when packaging TRT (any version) for Nixpkgs, autopatchelf flags a dependency on cuDNN, so we need to link against it. Does TRT 10.x not work with cuDNN 9.x at all, or is it not an officially supported combination? onnxruntime (not an NVIDIA product but a large use case for TRT), for example, says for CUDA 11.8 to use TRT 10.x with cuDNN 8.9.x, and with CUDA 12.x to use TRT 10.x with cuDNN 9.x. The latter combination wasn’t in the support matrix, so I was surprised. For the DL frameworks container, does that mean TRT comes without support for cuDNN since it’s not an 8.9.x release, that it’s not officially supported (per the TRT support matrix), or something else?	15:22:49
