| 15 Dec 2025 |
hexa (UTC+1) | ❯ curl https://cache.nixos-cuda.org/mhf691zwwjrqi8b6an14pblyqbzwn1v2.narinfo
missed hash⏎
| 02:55:27 |
pdealbera | Thanks! Not the same thing, I can't reach the host:
❯ curl https://cache.nixos-cuda.org/mhf691zwwjrqi8b6an14pblyqbzwn1v2.narinfo
curl: (7) Failed to connect to cache.nixos-cuda.org port 443 after 675 ms: Could not connect to server
| 02:59:52 |
pdealbera | But that means it's probably a thing on my end. | 03:00:06 |
hexa (UTC+1) | the server is hosted in helsinki at hetzner fwiw | 03:01:23 |
connor (burnt/out) (UTC-8) | Slightly off topic but for those of you who use Hydra or nix-eval-jobs with lots of eval time fetchers or substitution, you may be interested in some WIP I’ve been doing to improve that use case https://gist.github.com/ConnorBaker/9e31d3b08ff6d4ac841928412131fe15 | 09:42:32 |
connor (burnt/out) (UTC-8) | Numbers from doing a shallow eval (not forcing recursion) of Haskell.nix’s hydraJobs which has a number of flake inputs (and I think also does IFD?) | 09:46:39 |
connor (burnt/out) (UTC-8) | I’m also trying to look into using Intel VTune to get a better idea of Nix bottlenecks/areas for improvement
VTune is currently packaged in Nixpkgs through the Intel-oneapi stuff but I couldn’t get it working without using the latest version. I’ll probably try upstreaming the changes at some point unless someone beats me to it. | 09:48:44 |
yorik.sar | Did you by any chance run a comparison for a more common use case, e.g. evaluating a sizeable NixOS config? Just to see what those locks do to a less parallel workload. | 10:53:30 |
yorik.sar | I’m surprised to see the parser there - how much code were you evaluating? | 10:54:09 |
yorik.sar | I think I already saw some lock implementation in Nix code; probably better to reuse that one. Also, Nix code seems to prefer RAII (something like { auto _thelock = lock.get(); … }) rather than passing a continuation to a function (withLock(…)). | 10:56:46 |
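[Aside: a minimal C++ sketch of the two locking styles contrasted above. The names withLock and raiiStyle and the lock type are illustrative placeholders, not Nix's actual API; Nix's own helpers differ in detail.]

```cpp
// Minimal sketch, assuming nothing about Nix internals: the same critical
// section written in continuation-passing style vs. RAII style.
#include <mutex>

// Continuation-passing style: the critical section is a callable handed to a helper.
template <typename F>
auto withLock(std::mutex & m, F && f) {
    std::scoped_lock lock(m);
    return f();
}

// RAII style: the guard lives for the enclosing scope and releases the mutex
// on every exit path, including exceptions and early returns.
void raiiStyle(std::mutex & m, int & counter) {
    std::scoped_lock lock(m);
    ++counter;
}

int main() {
    std::mutex m;
    int counter = 0;
    withLock(m, [&] { ++counter; });
    raiiStyle(m, counter);
    return counter == 2 ? 0 : 1;
}
```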
yorik.sar |
> I'd like to do further work to deduplicate queries for .narinfo and the like, since Nix already generates quite the network storm by firing them off in serial.
I wonder if Nix uses HTTP/2 there. I think with stream multiplexing, all requests could essentially fit in one pack of packets.
| 10:59:07 |
yorik.sar | *
I'd like to do further work to deduplicate queries for .narinfo and the like, since Nix already generates quite the network storm by firing them off in serial.
I wonder if Nix uses HTTP/2 there. I think with stream multiplexing, all requests could essentially fit in one pack of packets.
| 10:59:14 |
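[Aside: Nix's downloader is built on libcurl's multi interface, which can multiplex several transfers over a single HTTP/2 connection when the server supports it. Below is a self-contained sketch of that idea for a batch of .narinfo requests; the host, hashes, and the lack of error handling are placeholders, not Nix's actual FileTransfer code.]

```cpp
// Sketch only: parallel .narinfo fetches multiplexed over HTTP/2 with libcurl.
// Build (assumed): g++ -std=c++17 narinfo_fetch.cc -lcurl
#include <curl/curl.h>
#include <string>
#include <vector>

// Discard response bodies; this sketch only cares about issuing the requests.
static size_t discard(char *, size_t size, size_t nmemb, void *) {
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURLM * multi = curl_multi_init();
    // Let concurrent transfers share one HTTP/2 connection instead of opening N sockets.
    curl_multi_setopt(multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);

    // Placeholder host and hashes; real ones come from the store paths being substituted.
    std::vector<std::string> urls = {
        "https://cache.example.org/mhf691zwwjrqi8b6an14pblyqbzwn1v2.narinfo",
        "https://cache.example.org/0000000000000000000000000000000z.narinfo",
    };

    std::vector<CURL *> handles;
    for (auto & url : urls) {
        CURL * h = curl_easy_init();
        curl_easy_setopt(h, CURLOPT_URL, url.c_str());
        curl_easy_setopt(h, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2TLS); // prefer HTTP/2 over TLS
        curl_easy_setopt(h, CURLOPT_WRITEFUNCTION, discard);
        curl_multi_add_handle(multi, h);
        handles.push_back(h);
    }

    int running = 0;
    do {
        curl_multi_perform(multi, &running);
        curl_multi_poll(multi, nullptr, 0, 1000, nullptr); // wait for activity on any transfer
    } while (running > 0);

    for (auto * h : handles) {
        curl_multi_remove_handle(multi, h);
        curl_easy_cleanup(h);
    }
    curl_multi_cleanup(multi);
    curl_global_cleanup();
}
```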
Ari Lotter | okay - i think i just figured out why cudnn/torch/nvrtc is broken.. cudnn does seem to require NVRTC at runtime - see https://docs.nvidia.com/deeplearning/cudnn/backend/latest/api/cudnn-graph-library.html for CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING -
> A runtime library required by cuDNN cannot be found in the predefined search paths. These libraries are libcuda.so (nvcuda.dll) and libnvrtc.so
but it looks like nvrtc is not provided to the cudnn package in nixpkgs!!! https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/cuda-modules/packages/cudnn.nix
so, unless i'm missing something, don't we just need to include nvrtc in buildInputs for cudnn, and this will fix the weird auto-runpath thing..?
| 17:32:01 |
Ari Lotter | (didn't wanna dump this in the gh issue in case im totally mistaken heh) | 17:36:42 |
Ari Lotter | er or maybe propagatedBuildInputs | 17:37:54 |
Ari Lotter | oh no it's way worse lol | 20:06:25 |
Ari Lotter | because dlopen doesn't check the runpath of libtorch_cuda.so does it | 20:06:36 |
Ari Lotter | since dlopen ignores the caller's runpath | 20:06:48 |
Ari Lotter | it only checks the runpath of the main executable.. | 20:06:55 |
Ari Lotter | ok i have no idea what im doing <3 giving up | 20:11:43 |
Ari Lotter | have managed to make it work by setting LD_LIBRARY_PATH but uhh seems pretty.. unusable | 20:13:36 |
Ari Lotter | i think we'd have to patch the dlopen call | 20:13:40 |
Ari Lotter | inside cudnn..? | 20:13:45 |
Ari Lotter | https://github.com/NixOS/nixpkgs/issues/461334#issuecomment-3657416223 | 20:19:51 |
Ari Lotter | i think it's pretty hopeless. lol | 20:19:58 |
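[Aside: a small standalone probe of the behaviour being debugged above. With a bare soname, dlopen resolves the library through the loader's search (LD_LIBRARY_PATH, RUNPATH entries of the object making the call, the ld.so cache); a name containing a slash is loaded directly with no search. That is consistent with the LD_LIBRARY_PATH workaround working, and with patching the dlopen'd name inside cudnn to an absolute store path being the other option on the table. The store path below is a placeholder.]

```cpp
// Sketch only: check whether the dynamic loader can find a library by soname
// vs. by absolute path. Build (assumed): g++ -std=c++17 dlopen_probe.cc -ldl
#include <dlfcn.h>
#include <cstdio>

// Attempt to load a library and report whether the loader could find it.
static bool probe(const char * name) {
    void * handle = dlopen(name, RTLD_NOW);
    if (!handle) {
        std::printf("FAIL %s: %s\n", name, dlerror());
        return false;
    }
    std::printf("OK   %s\n", name);
    dlclose(handle);
    return true;
}

int main() {
    // Bare soname (the one named in the cuDNN docs quoted above): found only if it is
    // on LD_LIBRARY_PATH, on an applicable RUNPATH, or in the ld.so cache.
    probe("libnvrtc.so");
    // Absolute path (placeholder): loaded directly, no search involved.
    probe("/nix/store/...-cuda_nvrtc/lib/libnvrtc.so");
    return 0;
}
```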
connor (burnt/out) (UTC-8) | I’ve not; I suspect zero speedup since evaluating a single config doesn’t have opportunities for parallelism and all the locking is done through kernel structures so it should be super low overhead
(Also Nixpkgs basically forbids eval time fetchers so I don’t think a config using just Nixpkgs would show a speedup) | 20:39:18 |
connor (burnt/out) (UTC-8) | That was evaluating all of the closures attribute in nixos/release.nix with nix eval --json | 20:40:10 |
connor (burnt/out) (UTC-8) | There is some, but it blocks forever, doesn’t do cleanup, and lacks a few other features needed for different types of builtin fetchers and substitutions | 20:41:22 |
connor (burnt/out) (UTC-8) | Robbie Buxton have you run into this (or seen anyone run into it)? You’re one of the like three people I know who use PyTorch from Nixpkgs other than myself | 20:42:49 |
Robbie Buxton | I haven’t run into that but I have run into libcuda failing on trying to dlopen things it claims to not depend on | 21:07:52 |