NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

Sender | Message | Time
17 Jun 2025
@hexa:lossy.networkhexa (UTC+1)given that nix fails hard when a substituter is unavailable22:00:21
18 Jun 2025
@ser:sergevictor.euser(ial)
In reply to @connorbaker:matrix.org
I've not used such a setup so I can't say -- am I understanding correctly that you'd have a NixOS VM running docker containers, and you want to pass the GPU through the VM and then through to the container?
https://linuxcontainers.org/incus/docs/main/reference/devices_gpu/
03:41:17
@ser:sergevictor.euser(ial)As I want to have a few containers, I suppose SR-IOV or MIG devices would be the best, do you have any experience with them?03:43:02
@connorbaker:matrix.orgconnor (he/him) (UTC-7)
In reply to @hexa:lossy.network
given that nix fails hard when a substituter is unavailable
It does? I gotta double check my config to see if I set something to work around that :/
13:51:13
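
For context on the exchange above, here is a minimal sketch of the kind of NixOS setting that could soften the failure mode hexa describes. It assumes the standard `fallback` and `connect-timeout` options from nix.conf, exposed through `nix.settings`. Nobody in the room confirmed that this is the workaround connor has in mind, and the timeout value is purely illustrative.

```nix
# Sketch only: a NixOS module fragment, not taken from anyone's actual config.
{
  nix.settings = {
    # Fall back to building from source when a binary substitute fails.
    fallback = true;
    # Give up on connecting to a substituter after this many seconds
    # (illustrative value, chosen for the example).
    connect-timeout = 5;
  };
}
```

Whether this covers every way a substituter can be unavailable would need to be checked against the Nix version in use.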
@connorbaker:matrix.orgconnor (he/him) (UTC-7)
In reply to @ser:sergevictor.eu
As I want to have a few containers, I suppose SR-IOV or MIG devices would be the best, do you have any experience with them?
I do not — please let me know what you find if you pursue it further!
13:52:09
19 Jun 2025
@glepage:matrix.orgGaétan Lepage SomeoneSerge (UTC+U[-12,12]), connor (he/him) (UTC-7)
FYI: someone is adding NVIDIA's warp library: https://github.com/NixOS/nixpkgs/pull/412838
11:48:21
@luke-skywalker:matrix.orgluke-skywalker when using hardware.nvidia-container-toolkit can I use the mounts option to not map glibc into the containers?! 19:12:48
20 Jun 2025
@connorbaker:matrix.orgconnor (he/him) (UTC-7)
In reply to @glepage:matrix.org
SomeoneSerge (UTC+U[-12,12]), connor (he/him) (UTC-7)
FYI: someone is adding NVIDIA's warp library: https://github.com/NixOS/nixpkgs/pull/412838
I’ll try to take a look at it before this weekend, but I trust you to merge if it works or feels ready :)
13:35:37
@connorbaker:matrix.orgconnor (he/him) (UTC-7)Looks like eval time for `release-cuda.nix` shot up from 7s to 25s on the PR branch I have which fixes package set leakage18:45:00
@glepage:matrix.orgGaétan LepageOk thanks! And good luck with the eval ;)19:11:37
22 Jun 2025
@ss:someonex.netSomeoneSerge (Ever OOMed by Element) changed their display name from SomeoneSerge (UTC+U[-12,12]) to SomeoneSerge (Ever OOMed by Element).12:12:55
@niten:fudo.imNiten joined the room.16:55:22
23 Jun 2025
@longregen:matrix.orglon joined the room.08:55:01
@longregen:matrix.orglonHi! I have a question, would anybody be interested in a services.vllm module? I was working on running it as a systemd service and hardening it and I'm happy with the result...08:57:13
@longregen:matrix.orglonDownload vllm.nix08:58:43
@longregen:matrix.orglon(I've never contributed to nixpkgs, so I'm not sure how high quality this is)08:59:15
@longregen:matrix.orglon

The interesting part is

      MemoryDenyWriteExecute = false; # Needed for CUDA/PyTorch JIT
      PrivateDevices = false; # Needed for GPU access
      RestrictAddressFamilies = ["AF_UNIX" "AF_INET" "AF_INET6" "AF_NETLINK"];
      DevicePolicy = "closed"; # Only allow the following devices, based on strace usage:
      DeviceAllow = lib.flatten [
        # Basic devices
        "/dev/null rw"
        "/dev/urandom r"
        "/dev/tty rw"

        # NVIDIA control devices
        "/dev/nvidiactl rw"
        "/dev/nvidia-modeset rw"
        "/dev/nvidia-uvm rw"
        "/dev/nvidia-uvm-tools rw"

        (builtins.map (i: "/dev/nvidia${builtins.toString i} rw") (lib.splitString " " cfg.cudaDevices))

        # NVIDIA capability devices
        "/dev/nvidia-caps/nvidia-cap1 r"
        "/dev/nvidia-caps/nvidia-cap2 r"
      ];
      ProtectKernelTunables = true;
      ProtectKernelModules = true;
      ProtectControlGroups = true;
      RestrictNamespaces = true;
      LockPersonality = true;
      RestrictRealtime = true;
      RestrictSUIDSGID = true;
      RemoveIPC = true;
      PrivateMounts = true;
      PrivateUsers = true;
      ProtectHostname = true;
      ProtectKernelLogs = true;
      ProtectClock = true;
      ProtectProc = "invisible";
      UMask = "0077";
      CapabilityBoundingSet = ["CAP_SYS_NICE"];
      AmbientCapabilities = ["CAP_SYS_NICE"];
09:06:10
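
To make the `DeviceAllow` expression in the snippet above concrete, here is a small standalone sketch of how the per-GPU entries expand. It assumes `cfg.cudaDevices` is a space-separated string of device indices, which is inferred from the `lib.splitString " "` call and not confirmed by the module itself. The value `"0 1"` and the `<nixpkgs>` import are hypothetical, chosen only for illustration.

```nix
# Sketch only: evaluate with `nix-instantiate --eval --strict` on a file containing this.
let
  lib = (import <nixpkgs> { }).lib;
  # Hypothetical stand-in for cfg.cudaDevices from the module.
  cudaDevices = "0 1";
in
lib.flatten [
  # A fixed control device, as in the original list.
  "/dev/nvidiactl rw"
  # One entry per GPU index; the nested list is flattened away by lib.flatten.
  (builtins.map (i: "/dev/nvidia${builtins.toString i} rw") (lib.splitString " " cudaDevices))
]
# evaluates to [ "/dev/nvidiactl rw" "/dev/nvidia0 rw" "/dev/nvidia1 rw" ]
```

Since `lib.splitString` already returns strings, the `builtins.toString` call is a no-op here. It would only matter if the option were later changed to a list of integers.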
@connorbaker:matrix.orgconnor (he/him) (UTC-7)

Two things I've promised to look at today:

  1. Bumping the version of protobuf used by OpenCV, which hasn't been updated in a while (need to backport to 25.05 as well).
  2. Figuring out how to revert https://github.com/NixOS/nixpkgs/pull/414647 in a way that doesn't break consumers of OpenCV -- really don't want cudatoolkit propagated to all consumers of OpenCV.
17:30:42
24 Jun 2025
@connorbaker:matrix.orgconnor (he/him) (UTC-7):L23:45:47
@connorbaker:matrix.orgconnor (he/him) (UTC-7) https://github.com/NixOS/nixpkgs/blob/5d0aa4675f7a35ec9661325d1dc22dfcbba5d040/pkgs/development/python-modules/warp-lang/default.nix#L100 is wrong; there's no bsd license 23:45:58
@connorbaker:matrix.orgconnor (he/him) (UTC-7)https://github.com/NixOS/nixpkgs/pull/41972223:56:43
@hexa:lossy.networkhexa (UTC+1)proper meta-checks when 🙂 23:59:26
25 Jun 2025
@connorbaker:matrix.orgconnor (he/him) (UTC-7)WIP other PR to fix the CUDA builds: https://github.com/NixOS/nixpkgs/pull/41975001:30:11
@glepage:matrix.orgGaétan LepageThanks for catching this, guys06:37:59
@connorbaker:matrix.orgconnor (he/him) (UTC-7) Okay that was a gigantic pain in the ass but I think that PR is all good to go now; added passthru.tests as well. 22:37:42
@indoor_squirrel:matrix.orgindoor_squirrel You go, Connor! We're all rooting for you! Thanks @ss:someonex.net and Gaétan Lepage!22:42:01
27 Jun 2025
@ss:someonex.netSomeoneSerge (Ever OOMed by Element) connor (he/him) (UTC-7): can you test if https://gist.github.com/SomeoneSerge/75a8ec66917bc2dd8242e638a2c809f3 is sufficient to make nix run ...saxpy work without extra fuss on an ubuntu/jetpack? 13:18:16
@ss:someonex.netSomeoneSerge (Ever OOMed by Element)

I'm smh getting an error despite the cuda_compat driver:

Runtime version: 12080
Driver version: 12080
Host memory initialized, copying to the device
CUDA error at cudaMalloc(&xDevice, N * sizeof(float)): system has unsupported display driver / cuda driver combination
13:20:34
