!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

282 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers

15 Dec 2025
@hexa:lossy.networkhexa (UTC+1)
❯ curl https://cache.nixos-cuda.org/mhf691zwwjrqi8b6an14pblyqbzwn1v2.narinfo
missed hash⏎
02:55:27
@pdealbera:matrix.orgpdealbera

Thanks! Not the same thing, I can't reach the host:

❯ curl https://cache.nixos-cuda.org/mhf691zwwjrqi8b6an14pblyqbzwn1v2.narinfo
curl: (7) Failed to connect to cache.nixos-cuda.org port 443 after 675 ms: Could not connect to server
02:59:52
@pdealbera:matrix.orgpdealbera
But that means its probably a thing on my end.
03:00:06
@hexa:lossy.networkhexa (UTC+1)
the server is hosted in helsinki at hetzner fwiw
03:01:23
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
Slightly off topic but for those of you who use Hydra or nix-eval-jobs with lots of eval time fetchers or substitution, you may be interested in some WIP I’ve been doing to improve that use case https://gist.github.com/ConnorBaker/9e31d3b08ff6d4ac841928412131fe15
09:42:32
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
Numbers from doing a shallow eval (not forcing recursion) of Haskell.nix’s hydraJobs which has a number of flake inputs (and I think also does IFD?)
09:46:39
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
I’m also trying to look into using Intel VTune to get a better idea of Nix bottlenecks/areas for improvement
VTune is currently packaged in Nixpkgs through the Intel-oneapi stuff but I couldn’t get it working without using the latest version. I’ll probably try upstreaming the changes at some point unless someone beats me to it.
09:48:44
@yorik.sar:matrix.orgyorik.sar
Did you by any chance run a comparison for more common use-case of evaluating a sizeable NixOS config, for example? Just to see what those locks do to less parallel workload.
10:53:30
@yorik.sar:matrix.orgyorik.sar
I’m surprised to see parser there - how much code were you evaluating?
10:54:09
@yorik.sar:matrix.orgyorik.sar
I think I already saw some lock implementation in Nix code, probably better to reuse that one. Also, Nix code seems to prefer RAII (smth like { auto _thelock = lock.get(); … }) rather than passing continuation to a function (withLock(…)).
10:56:46
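[Editor's note] The two locking styles contrasted in the message above can be sketched in standard C++ roughly as follows. This is a minimal illustration and is not taken from the Nix sources or from the linked gist: `Counter`, `incrementRaii`, `incrementWithLock` and this particular `withLock` signature are all hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical shared state guarded by a mutex (not a Nix type).
struct Counter {
    std::mutex mutex;
    int value = 0;
};

// RAII style: the guard object owns the lock and releases it when the
// enclosing scope ends, including on early return or exception.
inline int incrementRaii(Counter & c) {
    std::lock_guard<std::mutex> guard(c.mutex);
    return ++c.value;
}

// Continuation style: the critical section is passed in as a callable
// and runs while the helper holds the lock.
template <typename F>
auto withLock(std::mutex & m, F && f) -> decltype(std::forward<F>(f)()) {
    std::lock_guard<std::mutex> guard(m);
    return std::forward<F>(f)();
}

inline int incrementWithLock(Counter & c) {
    return withLock(c.mutex, [&] { return ++c.value; });
}
```

Both functions do the same work. The difference is only where the critical section is written: inline under a guard, or inside a lambda handed to a helper.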
@yorik.sar:matrix.orgyorik.sar *

I'd like to do further work to deduplicate queries for .narinfo and the like, since Nix already generates quite the network storm by firing them off in serial.

I wonder if Nix uses HTTP/2 there. I think with stream multiplexing, all requests could essentially fit in one pack of packets.

10:59:14
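[Editor's note] As a rough illustration of what deduplicating .narinfo queries could mean, here is a single-threaded memoizing sketch. It is an assumption-laden toy, not how Nix or the linked WIP does it: `NarinfoCache`, `narinfoUrl` and the injected `Fetcher` callback are hypothetical, and a real version would also have to handle concurrent in-flight requests, errors and expiry. The URL shape follows the curl commands earlier in the log.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Builds "<cache>/<hash>.narinfo", the URL shape used in the curl
// commands earlier in this log.
inline std::string narinfoUrl(const std::string & cacheUrl, const std::string & hash) {
    return cacheUrl + "/" + hash + ".narinfo";
}

// Hypothetical memoizing wrapper: each distinct URL is fetched at most
// once, and repeated lookups are answered from memory.
class NarinfoCache {
public:
    using Fetcher = std::function<std::string(const std::string &)>;

    explicit NarinfoCache(Fetcher fetch) : fetch_(std::move(fetch)) {}

    const std::string & get(const std::string & url) {
        auto it = seen_.find(url);
        if (it == seen_.end()) {
            ++fetchCount_;  // only counts real fetches
            it = seen_.emplace(url, fetch_(url)).first;
        }
        return it->second;
    }

    int fetchCount() const { return fetchCount_; }

private:
    Fetcher fetch_;
    std::unordered_map<std::string, std::string> seen_;
    int fetchCount_ = 0;
};
```

Passing the fetcher in as a callback keeps the sketch independent of any HTTP library, so whether the transport is HTTP/1.1 or HTTP/2 is a separate question from the deduplication itself.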
@arilotter:matrix.orgAri Lotter

okay - i think i just figured out why cudnn/torch/nvrtc is broken..
cudnn does seem to require NVRTC at runtime - see https://docs.nvidia.com/deeplearning/cudnn/backend/latest/api/cudnn-graph-library.html for CUDNN_STATUS_NOT_SUPPORTED_RUNTIME_PREREQUISITE_MISSING -

A runtime library required by cuDNN cannot be found in the predefined search paths. These libraries are libcuda.so (nvcuda.dll) and libnvrtc.so

but it looks like nvrtc is not provided to the cudnn package in nixpkgs!!!
https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/cuda-modules/packages/cudnn.nix

so, unless i'm missing something, don't we just need to include nvrtc in buildInputs for cudnn, and this will fix the weird auto-runpath thing..?

17:32:01
@arilotter:matrix.orgAri Lotter
(didn't wanna dump this in the gh issue in case im totally mistaken heh)
17:36:42
@arilotter:matrix.orgAri Lotter
er or maybe propagatedBuildInputs
17:37:54
@arilotter:matrix.orgAri Lotter
oh no it's way worse lol
20:06:25
@arilotter:matrix.orgAri Lotter
because dlopen doesn't check the runpath of libtorch_cuda.so does it
20:06:36
@arilotter:matrix.orgAri Lotter
since dlopen ignores the caller's runpath
20:06:48
@arilotter:matrix.orgAri Lotter
it only checks the runpath of the main executable..
20:06:55
@arilotter:matrix.orgAri Lotter
ok i have no idea what im doing <3 giving up
20:11:43
@arilotter:matrix.orgAri Lotter
have managed to make it work by setting LD_LIBRARY_PATH but uhh seems pretty.. unusable
20:13:36
@arilotter:matrix.orgAri Lotter
i think we'd have to patch the dlopen call
20:13:40
@arilotter:matrix.orgAri Lotter
inside cudnn..?
20:13:45
@arilotter:matrix.orgAri Lotter
https://github.com/NixOS/nixpkgs/issues/461334#issuecomment-3657416223
20:19:51
@arilotter:matrix.orgAri Lotter
i think it's pretty hopeless. lol
20:19:58
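[Editor's note] One general property of dlopen is relevant to the "patch the dlopen call" idea above: when the name passed to dlopen contains a slash, it is treated as a pathname and no library search is performed. The sketch below shows only the resolution step, turning a bare soname into an absolute path from a colon-separated directory list in the LD_LIBRARY_PATH format. `splitSearchPath` and `resolveLibrary` are hypothetical helpers, not part of cuDNN, PyTorch or nixpkgs, and the sketch simplifies by skipping empty path entries.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <unistd.h>

// Splits a colon-separated directory list (the LD_LIBRARY_PATH format).
// Simplification: empty entries are skipped rather than treated specially.
inline std::vector<std::string> splitSearchPath(const std::string & path) {
    std::vector<std::string> dirs;
    std::string::size_type start = 0;
    while (start <= path.size()) {
        auto end = path.find(':', start);
        if (end == std::string::npos) end = path.size();
        if (end > start) dirs.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return dirs;
}

// Returns the first "<dir>/<soname>" that exists on disk, or an empty
// string if no directory in the list contains it.
inline std::string resolveLibrary(const std::string & soname,
                                  const std::string & searchPath) {
    for (const auto & dir : splitSearchPath(searchPath)) {
        std::string candidate = dir + "/" + soname;
        if (access(candidate.c_str(), F_OK) == 0) return candidate;
    }
    return "";
}
```

A patched call site could then hand the resolved absolute path to dlopen instead of the bare soname, which removes the dependence on whose runpath gets searched. Whether that is practical for a closed-source library like cuDNN is exactly the open question in the thread.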
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
I’ve not; I suspect zero speedup since evaluating a single config doesn’t have opportunities for parallelism and all the locking is done through kernel structures so it should be super low overhead (Also Nixpkgs basically forbids eval time fetchers so I don’t think a config using just Nixpkgs would show a speedup)
20:39:18
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
That was evaluating all of the closures attribute in nixos/release.nix with nix eval --json
20:40:10
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
There is some, but it blocks forever, doesn’t do cleanup, and lacks a few other features needed for different types of builtin fetchers and substitutions
20:41:22
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
Robbie Buxton have you run into this (or seen anyone run into it)? You’re one of the like three people I know who use PyTorch from Nixpkgs other than myself
20:42:49
@sporeray:matrix.orgRobbie Buxton
I haven’t run into that but I have run into libcuda failing on trying to dlopen things it claims to not depend on
21:07:52
