
NixOS CUDA

287 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda



Sender | Message | Time
28 Feb 2025
@box1:matrix.org set a profile picture. 06:23:11
@box1:matrix.org changed their display name from sepiabrown to Suwon Park. 06:23:32
@box1:matrix.org removed their profile picture. 06:29:08
@box1:matrix.org removed their display name Suwon Park. 06:30:58
@sepiabrown:matrix.org Suwon Park set a profile picture. 06:32:14
@hexa:lossy.network hexa (UTC+1) *
, cudaSupport
, cudaPackages
, stdenv ? if cudaSupport then cudaPackages.backendStdenv else stdenv
06:50:42
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @hexa:lossy.network
couldn't the effectiveStdenv pattern be reduced to
Recursion aside, the default value is going to be discarded because there is a stdenv in the parent scope :(
08:44:51
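
The pattern under discussion can be written out as follows. This is a minimal sketch following the convention used for CUDA packages in nixpkgs (the derivation body is hypothetical); the key point from the messages above is that a `stdenv ? ... else stdenv` default would both recurse and, in practice, be discarded, because `callPackage` finds a `stdenv` in the parent scope and always passes it explicitly. Binding a separate name avoids both problems:

```nix
{ lib
, stdenv
, config
, cudaSupport ? config.cudaSupport
, cudaPackages
}:

let
  # Pick the CUDA-compatible stdenv only when CUDA support is requested.
  # A separate binding is needed: a default value on the stdenv argument
  # itself would be ignored, since callPackage supplies stdenv from the
  # surrounding package set.
  effectiveStdenv = if cudaSupport then cudaPackages.backendStdenv else stdenv;
in
effectiveStdenv.mkDerivation {
  # Hypothetical package, for illustration only.
  pname = "example";
  version = "0.1";
}
```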
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @hexa:lossy.network
SomeoneSerge (UTC+U[-12,12]): in opencv, why cxxdev and not dev as the output name?
It contains the propagated build inputs referenced by the cmake module. It's not in dev, because withPackages sources dev...
08:46:15
@hexa:lossy.network hexa (UTC+1) hm, changing to that yields the same derivations as before, but maybe that is because nothing currently relies on the cuda backendStdenv 🤔 17:35:50
2 Mar 2025
@snektron:matrix.org Snektron In case anybody is interested: I got nsight compute working: https://github.com/Snektron/nixos-config/blob/main/packages/nsight-compute.nix. Last time I checked there were still issues running the one in nixpkgs, and I don't think the related PR has changed in the meantime. 11:46:23
3 Mar 2025
@little_dude:matrix.org little_dude joined the room. 12:41:18
@little_dude:matrix.org little_dude

Hello, I'm having trouble getting CUDA working on NixOS. I made a post on Discourse, but I thought I'd ask here since it's specific to CUDA: https://discourse.nixos.org/t/ollama-cuda-driver-library-init-failure-3/61068/2

In short, I installed ollama, but ollama reports:

Unable to load cudart library /nix/store/lgmvgx3r1pbpd40crz2nnliakfxh19f8-nvidia-x11-570.124.04-6.12.17/lib/libcuda.so.570.124.04: cuda driver library init failure: 3

I guess error 3 corresponds to the cudaErrorInitializationError described here: https://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/html/group__CUDART__TYPES_g3f51e3575c2178246db0a94a430e0038.html but that doesn't help me much.

I think my issue is that I have CUDA 12.4 installed, when my GPU supports CUDA 12.8? Although I'm not certain what the CUDA version reported by nvidia-smi really means (whether it is the only version supported by my GPU, or whether it's the maximum version). Would you have an idea about where to go from here?

12:50:21
@little_dude:matrix.org little_dude Also I hope it's ok to cross-post like this. Apologies if that seems pushy. 12:51:36
@little_dude:matrix.org little_dude I also thought the error might be because I'm using the GPU to run Wayland, but I assume that, just like CPUs, GPUs can run multiple workloads in parallel? Or does CUDA need to have exclusive access to the GPU? (I know these are very naive questions, I just never dealt with GPUs before) 12:54:37
@ruroruro:matrix.org ruro connor (he/him) (UTC-8): I just noticed that pkgs/development/libraries/science/math/tensorrt/extension.nix is a thing. At first glance, this code seems dead to me (or at least I wasn't able to find a place where it is called from)? It seems that nowadays all of the TensorRT-related code lives in pkgs/development/cuda-modules. The last commit (excluding automated reformatting) that touched pkgs/development/libraries/science/math/tensorrt seems to be 8e800cedaf24f5ad9717463b809b0beef7677000, authored by you in 2023. That commit also removed pkgs/development/libraries/science/math/tensorrt/generic.nix. So I am guessing that you forgot to also delete the extension.nix? 13:43:50
@connorbaker:matrix.org connor (burnt/out) (UTC-8) ruro: yes, seems likely :l 16:39:14
@connorbaker:matrix.org connor (burnt/out) (UTC-8)

little_dude: it's fine to cross-post!

Sorry it's not working, my only suggestion would be to try running it with whatever flags Ollama needs to enable debugging and/or LD_DEBUG=libs to make sure it's finding and loading the correct libraries.

The version difference between the CUDA driver version and the CUDA library version is fine -- it just means you can run CUDA libraries up to and including 12.8.

The GPU definitely supports multiple workloads, so that shouldn't be a problem either.

I'm strapped for time so I probably won't be able to help debug or troubleshoot, but I think some other people in here use ollama, so they might be able to chime in.

16:46:44
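
The LD_DEBUG approach suggested above can be tried independently of Ollama first: on glibc systems the dynamic linker prints its library resolution to stderr for any dynamically linked binary, so the same invocation prefixed onto the ollama command would show whether libcuda.so is found and from where. A minimal illustration, using /bin/true as a stand-in:

```shell
# Show which shared libraries the dynamic linker searches for and loads.
# Substitute the actual ollama invocation for /bin/true to check whether
# libcuda.so is located and which store path it is loaded from.
LD_DEBUG=libs /bin/true 2>&1 | grep 'find library' | head -n 5

# LD_DEBUG=help lists all supported debug categories (libs, files, symbols, ...).
LD_DEBUG=help /bin/true 2>&1 | head -n 15
```

Note that LD_DEBUG is honored by the glibc dynamic linker; it has no effect on statically linked binaries.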
4 Mar 2025
@stick:matrix.org stick I have prepared a cudaPackages_12 update from 12.4 to 12.8 here: https://github.com/NixOS/nixpkgs/pull/386983. Can you have a look? I also included a nixpkgs-review result (229 marked as broken / 219 failed to build / 2455 packages built), but I am having a hard time figuring out which build failures are new and which were happening even before. Can you advise what the best way to proceed is? Please comment on GitHub, I am not always following the discussion here. 10:48:13
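
One way to separate pre-existing build failures from ones introduced by the PR is a plain set difference between two failure lists: one from a nixpkgs-review run against the PR and one against its base commit. A sketch with hypothetical file names and package names:

```shell
# Hypothetical failure lists, one attribute per line, extracted from two
# nixpkgs-review runs: one against the PR, one against its merge base.
printf 'pkgA\npkgB\npkgC\n' | sort > failures-pr.txt
printf 'pkgA\npkgC\n' | sort > failures-base.txt

# comm needs sorted input; -13 suppresses lines unique to the base list and
# lines common to both, leaving only failures introduced by the update.
comm -13 failures-base.txt failures-pr.txt   # prints: pkgB
```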
@stick:matrix.org stick An ideal thing for me would be if someone indicated the list of packages that really need to have their builds fixed before the merge happens, and I would (try to) work on fixing these. 10:53:23
@ss:someonex.net SomeoneSerge (back on matrix) In addition to Connor's suggestions, can you check what the output is when you run cudaPackages.saxpy? 11:26:55
@stick:matrix.org stick Maybe the merge of this PR should happen shortly after the merge of the ROCm update in #367695, to not do massive rebuilds two times? https://github.com/NixOS/nixpkgs/pull/367695 12:12:17
5 Mar 2025
@angleangleside:matrix.org asa set a profile picture. 08:07:57
@angleangleside:matrix.org asa changed their display name from Asa to asa. 08:08:10
7 Mar 2025
@mdietrich:matrix.org mdietrich joined the room. 13:03:38
@mdietrich:matrix.org mdietrich Hey all, first of all thank you for your work; the last time I tried to use any CUDA-related programs and services I had to give up, because this joint effort had not been set up yet.
I am just wondering if I am doing something wrong when trying to set up llama-cpp and open-webui on my NixOS machine. I've set up the nix-community cache (and ollama with CUDA support installs fine in any case), but either enabling nixpkgs.config.cudaSupport or overwriting e.g. llama-cpp's package with `services.llama-cpp.package = pkgs.overwrite { config.cudaSupport = true; config.rocmSupport = false; }`
13:12:26
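
For reference, per-package CUDA enablement in nixpkgs is usually done with `.override` on the package itself rather than `pkgs.overwrite`. A sketch of the two usual options; whether llama-cpp accepts these exact argument names should be checked against its expression in nixpkgs:

```nix
{ pkgs, ... }:
{
  # Option 1: enable CUDA for all of nixpkgs (triggers large rebuilds):
  #   nixpkgs.config.cudaSupport = true;

  # Option 2 (sketch): override only the one package. The llama-cpp
  # expression takes cudaSupport/rocmSupport as function arguments,
  # so they can be flipped via .override:
  services.llama-cpp.package = pkgs.llama-cpp.override {
    cudaSupport = true;
    rocmSupport = false;
  };
}
```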


