!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

7 Mar 2025
@mdietrich:matrix.orgmdietrich * Hey all, first of all thank you for your work, last time I tried to use any cuda-related programs and services I had to give up because this joint effort had not been set up.
I am just wondering if I am doing something wrong when trying to set up llama-cpp and open-webui on my NixOS machine. I've set up the nix-community cache (and ollama with CUDA support installs fine in any case), but neither enabling nixpkgs.config.cudaSupport nor overriding e.g. llama-cpp's package with `services.llama-cpp.package = pkgs.overwrite { config.cudaSupport = true; config.rocmSupport = false; }` just downloads and installs the appropriate packages; both lead to extremely long build times. Are these packages (llama-cpp and open-webui, of which I think onnxruntime takes the longest) just not built in the community cache?
13:13:54
@ss:someonex.netSomeoneSerge (back on matrix) Let's see 13:15:53
@ss:someonex.netSomeoneSerge (back on matrix) https://hydra.nix-community.org/job/nixpkgs/cuda/llama-cpp.x86_64-linux 13:15:55
@ss:someonex.netSomeoneSerge (back on matrix) https://hydra.nix-community.org/job/nixpkgs/cuda/onnxruntime.x86_64-linux 13:16:26
@ss:someonex.netSomeoneSerge (back on matrix) open-webui apparently wasn't added to the release-cuda.nix file yet: https://hydra.nix-community.org/job/nixpkgs/cuda/open-webui.x86_64-linux 13:17:10
@ss:someonex.netSomeoneSerge (back on matrix) As for onnxruntime and llama-cpp, let's compare the hashes in your llama-cpp and the one reported by hydra 13:18:20
@mdietrich:matrix.orgmdietrich

I am on x86_64, nixos-unstable with flakes, with an RTX 3060 Ti and the following substituters:

        substituters = [
          "https://nix-community.cachix.org"
        ];
        trusted-public-keys = [
          "nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs="
        ];
13:19:29
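
For context, the two lists above are Nix settings. A minimal sketch of how they might sit in a NixOS configuration next to the global CUDA switch mentioned earlier follows. The module layout and the `allowUnfree` line are assumptions for illustration, not taken from mdietrich's actual configuration:

```nix
# Hypothetical NixOS module fragment; only the cache URL and key come from the chat.
{
  nix.settings = {
    # Binary cache that is expected to receive the nix-community Hydra CUDA builds.
    substituters = [
      "https://nix-community.cachix.org"
    ];
    trusted-public-keys = [
      "nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs="
    ];
  };

  nixpkgs.config = {
    # Assumption: the CUDA toolkit is unfree, so unfree packages must be allowed.
    allowUnfree = true;
    # Global switch: every package that honours config.cudaSupport is built with CUDA.
    cudaSupport = true;
  };
}
```

A cache hit still requires that the local derivation hashes match the ones Hydra built, which depends on the nixpkgs revision in use.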
@mdietrich:matrix.orgmdietrich Thank you for your quick answer 13:20:01
@mdietrich:matrix.orgmdietrich I think onnxruntime is a dependency of open-webui and not llama-cpp; open-webui itself probably (?) does not need cuda support 13:20:45
@mdietrich:matrix.orgmdietrich services.llama-cpp.package has the value "«derivation /nix/store/dhqdwqp6akr6h6f1k3rz190m3syrv6iy-llama-cpp-4731.drv»" 13:23:06
@ss:someonex.netSomeoneSerge (back on matrix) Let's try nix path-info --override-input nixpkgs github:NixOS/nixpkgs/1d2fe0135f360c970aee1d57a53f816f3c9bddae --derivation .#nixosConfigurations.$(hostname).config.services.llama-cpp.package to make it comparable with https://hydra.nix-community.org/build/3552955#tabs-buildinputs 13:24:11
@ss:someonex.netSomeoneSerge (back on matrix) I'd maybe not focus on these concerns, the expert hours are arguably more expensive than rebuild costs 13:25:27
@ss:someonex.netSomeoneSerge (back on matrix) (still pending =) 13:25:51
@mdietrich:matrix.orgmdietrich Wait a minute, I am slightly confused as llama-cpp seems to actually have cuda support now that I rebuilt a couple of minutes ago. It just does not use my GPU when running inference even though it reports it as visible and usable. Maybe a configuration mistake on my side (although I am using the default NixOS service). I'll look into open-webui and onnxruntime now... 13:27:31
@mdietrich:matrix.orgmdietrich Yes, onnxruntime does recompile, as well as python3.12-torch-2.5.1. I'm checking the hashes now... 13:33:36
@mdietrich:matrix.orgmdietrich

I am definitely building onnxruntime myself even though I get:

> nix path-info --override-input nixpkgs github:NixOS/nixpkgs/9f41a78ead0fbe2197cd4c09b5628060456cd6e3 --derivation .\#nixosConfigurations.$(hostname).pkgs.onnxruntime
• Updated input 'nixpkgs':
    'github:nixos/nixpkgs/32fb99ba93fea2798be0e997ea331dd78167f814?narHash=sha256-ozoOtE2hGsqh4XkTJFsrTkNxkRgShxpQxDynaPZUGxk%3D' (2025-02-21)
  → 'github:NixOS/nixpkgs/9f41a78ead0fbe2197cd4c09b5628060456cd6e3?narHash=sha256-WWXRCTOWcKvtzqzVgBMON0/TWcFMyWq831HQUITE4rs%3D' (2025-02-21)
/nix/store/a22vqi9d0ndhlcy1yxw4m3ir4z7ckfrg-onnxruntime-1.20.1.drv

Which is the same hash as the hydra build store path

13:48:48
@mdietrich:matrix.orgmdietrich I get the same hash for pytorch locally and in hydra as well! 13:56:11
@ss:someonex.netSomeoneSerge (back on matrix) And if you nix build --override-input nixpkgs github:NixOS/nixpkgs/9f41a78ead0fbe2197cd4c09b5628060456cd6e3 .#nixosConfigurations.$(hostname).pkgs.onnxruntime? 13:59:42
@mdietrich:matrix.orgmdietrich Then I'm building nccl and cudnn-frontend for some reason? 14:15:15
@ss:someonex.netSomeoneSerge (back on matrix) Well this certainly shouldn't be happening if the hashes indeed match 14:21:40
@ss:someonex.netSomeoneSerge (back on matrix) Which hydra eval did you refer to? 14:22:00
@ss:someonex.netSomeoneSerge (back on matrix) * Which hydra eval did you refer to, can you link it? 14:22:06
@mdietrich:matrix.orgmdietrich

Sorry, I am back now. It seems that my setup had complicated things: I was trying stuff on a laptop while the actual setup was on another host (with the actual GPU), but I did use the correct hostname for the workstation, which should (I mean, that is the whole point of Nix?) lead to the same build. (Both systems are x86_64.) However, I was also trying to enable cudaSupport globally as well as through a package override, which might have led to me making a mistake, I don't know. All I can say is that

nix build --override-input nixpkgs github:NixOS/nixpkgs/9f41a78ead0fbe2197cd4c09b5628060456cd6e3 .#nixosConfigurations.$(hostname).pkgs.onnxruntime

now just downloads onnxruntime from the cache, which is the expected behaviour. I'm going to check without overridden input and pytorch again and then with the whole system.
Another question though: How would I override cudaSupport for a single package and its dependencies? For llama-cpp it is easy (llama-cpp.package = pkgs.override { cudaSupport = true; }), but open-webui itself does not have a cudaSupport option, while onnxruntime and pytorch do.

15:27:49
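
The question above is not answered in the log. One common pattern, sketched here under stated assumptions, is to evaluate a second nixpkgs instance with `cudaSupport` enabled and take only selected packages from it, so that their dependencies (onnxruntime, pytorch) are CUDA-enabled too. The name `pkgsCuda` is hypothetical, and the sketch assumes that `services.open-webui.package` exists in the nixpkgs revision in use and that `llama-cpp` accepts a `cudaSupport` argument:

```nix
# Hypothetical NixOS module; pkgsCuda is an illustrative name, not from the chat.
{ pkgs, ... }:
let
  # Second evaluation of the same nixpkgs source, with CUDA enabled globally.
  pkgsCuda = import pkgs.path {
    inherit (pkgs.stdenv.hostPlatform) system;
    config = {
      allowUnfree = true; # assumption: needed because the CUDA toolkit is unfree
      cudaSupport = true;
    };
  };
in
{
  # llama-cpp exposes its own flag, so a plain override on the package is enough.
  services.llama-cpp.package = pkgs.llama-cpp.override { cudaSupport = true; };

  # open-webui has no cudaSupport flag of its own; taking it from pkgsCuda
  # makes its whole dependency closure come from the CUDA-enabled package set.
  services.open-webui.package = pkgsCuda.open-webui;
}
```

The rest of the system keeps using the default package set. Importing nixpkgs twice adds evaluation time, and whether the resulting store paths are available in the community cache still depends on matching a revision that Hydra has built.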
@mdietrich:matrix.orgmdietrich Building without the overridden nixpkgs input forces rebuild (I used just nix build .#nixosConfigurations.$(hostname).pkgs.onnxruntime) 15:29:00
@mdietrich:matrix.orgmdietrich For earlier, I meant the derivation store path of https://hydra.nix-community.org/build/3297277#tabs-details 15:39:13
@mdietrich:matrix.orgmdietrich Ok, python3.12-torch-2.5.1 is fetched from the community cache again iff I override the nixpkgs input again to the same hash as in https://hydra.nix-community.org/build/3534138#tabs-buildinputs 15:43:38
@mdietrich:matrix.orgmdietrich As in nix build --override-input nixpkgs github:NixOS/nixpkgs/e9b0ff70ddc61c42548501b0fafb86bb49cca858 .#nixosConfigurations.$(hostname).pkgs.python3Packages.pytorch 15:43:55
@mdietrich:matrix.orgmdietrich If I don't, then it rebuilds 15:44:23
@mdietrich:matrix.orgmdietrich Does that mean I have to find the right commit in nixpkgs that is somewhere between the onnxruntime hydra build, pytorch hydra build and my system that lets me fetch all of them? 15:45:26
@mdietrich:matrix.orgmdietrich Oooorrr I just need to update my nixpkgs again. Weird, it was not that old, only a week or so. I guess that fixed it (apart from llama-cpp not actually using the GPU, but that is another issue). Thanks for the debugging help, I definitely learned new ways of debugging such problems here! 15:52:49
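
The `--override-input` flag used in the commands above only affects a single invocation. A sketch of the persistent equivalent is to pin the flake's nixpkgs input to a revision that Hydra has already evaluated. The revision below is the one from the onnxruntime check in the chat; the host name `workstation` and the `./configuration.nix` path are hypothetical placeholders:

```nix
# Hypothetical flake.nix; only the nixpkgs revision is taken from the chat.
{
  # Pin nixpkgs to a commit for which the community Hydra produced CUDA builds.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/9f41a78ead0fbe2197cd4c09b5628060456cd6e3";

  outputs = { self, nixpkgs }: {
    nixosConfigurations.workstation = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./configuration.nix ];
    };
  };
}
```

Pinning trades freshness for cache hits. As the last message shows, simply updating the nixpkgs input to a recent revision can have the same effect when Hydra has caught up with that revision.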
