Room: !eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

282 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda



Sender / Message / Time
11 Jun 2025
@connorbaker:matrix.org connor (he/him): https://github.com/NixOS/nixpkgs/pull/411445 should be ready for review/merge (16:37:09)
@ss:someonex.net SomeoneSerge (back on matrix): Oof, thanks for the cherry-picks, I dropped the ball again (16:52:52)
@connorbaker:matrix.org connor (he/him): No worries! (16:53:01)
@etndiaz:matrix.org etndiaz joined the room. (17:53:56)
@connorbaker:matrix.org connor (he/him): https://github.com/NixOS/nixpkgs/pull/415902 is ready for review (I'll cherry-pick as soon as it's merged) (18:11:58)
12 Jun 2025
@connorbaker:matrix.org connor (he/him): Merged https://github.com/NixOS/nixpkgs/pull/415902 and cherry-picked into https://github.com/NixOS/nixpkgs/pull/411445 (17:40:56)
15 Jun 2025
@kaya:catnip.ee kaya 𖤐 changed their profile picture. (12:26:18)
16 Jun 2025
@ser:sergevictor.eu ser(ial) joined the room. (16:09:51)
17 Jun 2025
@ser:sergevictor.eu ser(ial): I have a Debian host with incus installed and an Nvidia card. I want a NixOS guest that hosts Docker, which will run OCI containers that require GPU access. What packages do I need to install on that NixOS guest? (07:57:22)
@connorbaker:matrix.org connor (he/him): I've not used such a setup so I can't say -- am I understanding correctly that you'd have a NixOS VM running Docker containers, and you want to pass the GPU through the VM and then through to the container? (14:58:41)
@hexa:lossy.network hexa: the frequency with which I see the nix-community cache (not calling it out...) with a backend timeout is annoying (22:00:08)
@hexa:lossy.network hexa: given that nix fails hard when a substituter is unavailable (22:00:21)
18 Jun 2025
@ser:sergevictor.eu ser(ial):
In reply to @connorbaker:matrix.org
I've not used such a setup so I can't say -- am I understanding correctly that you'd have a NixOS VM running docker containers, and you want to pass the GPU through the VM and then through to the container?
https://linuxcontainers.org/incus/docs/main/reference/devices_gpu/
03:41:17
@ser:sergevictor.eu ser(ial): As I want to have a few containers, I suppose SR-IOV or MIG devices would be best; do you have any experience with them? (03:43:02)
@connorbaker:matrix.org connor (he/him):
In reply to @hexa:lossy.network
given that nix fails hard when a substituter is unavailable
It does? I gotta double check my config to see if I set something to work around that :/
13:51:13
@connorbaker:matrix.org connor (he/him):
In reply to @ser:sergevictor.eu
As I want to have a few containers, I suppose SR-IOV or MIG devices would be the best, do you have any experience with them?
I do not — please let me know what you find if you pursue it further!
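For reference, a minimal sketch of the guest side of such a setup, assuming incus has already passed the GPU device through to the NixOS VM. The option names come from current NixOS modules (Docker plus the NVIDIA Container Toolkit's CDI integration), but treat this as a starting point rather than a verified configuration:

```nix
# Hypothetical configuration.nix fragment for the NixOS guest.
# Assumes the GPU is already visible inside the VM (incus passthrough).
{ config, pkgs, ... }:
{
  # Load the NVIDIA kernel driver and userspace stack in the guest.
  services.xserver.videoDrivers = [ "nvidia" ];

  # Docker plus the NVIDIA Container Toolkit (CDI), so containers
  # started with e.g. `--device nvidia.com/gpu=all` can see the GPU.
  virtualisation.docker.enable = true;
  hardware.nvidia-container-toolkit.enable = true;
}
```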
13:52:09
19 Jun 2025
@glepage:matrix.org Gaétan Lepage: SomeoneSerge (UTC+U[-12,12]), connor (he/him) (UTC-7)
FYI: someone is adding NVIDIA's warp library: https://github.com/NixOS/nixpkgs/pull/412838
11:48:21
@luke-skywalker:matrix.org luke-skywalker: When using hardware.nvidia-container-toolkit, can I use the mounts option to avoid mapping glibc into the containers? (19:12:48)
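For context, a sketch of the `mounts` option being asked about, following the hostPath/containerPath shape the NixOS module documents. Note the hedge in the question: as far as I can tell this option *adds* extra mounts on top of the toolkit's defaults; whether the default glibc mapping can be suppressed through it is exactly what's being asked:

```nix
# Illustrative fragment only -- `mounts` appends host paths into GPU
# containers; suppressing the default mounts may need a different knob.
{
  hardware.nvidia-container-toolkit = {
    enable = true;
    mounts = [
      {
        hostPath = "/run/opengl-driver";
        containerPath = "/run/opengl-driver";
      }
    ];
  };
}
```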
20 Jun 2025
@connorbaker:matrix.org connor (he/him):
In reply to @glepage:matrix.org
SomeoneSerge (UTC+U[-12,12]), connor (he/him) (UTC-7)
FYI: someone is adding NVIDIA's warp library: https://github.com/NixOS/nixpkgs/pull/412838
I’ll try to take a look at it before this weekend, but I trust you to merge if it works or feels ready :)
13:35:37
@connorbaker:matrix.org connor (he/him): Looks like eval time for `release-cuda.nix` shot up from 7s to 25s on the PR branch I have which fixes package-set leakage (18:45:00)
@glepage:matrix.org Gaétan Lepage: Ok, thanks! And good luck with the eval ;) (19:11:37)
22 Jun 2025
@ss:someonex.net SomeoneSerge (back on matrix) changed their display name from SomeoneSerge (UTC+U[-12,12]) to SomeoneSerge (Ever OOMed by Element). (12:12:55)
@niten:fudo.im Niten joined the room. (16:55:22)
23 Jun 2025
@longregen:matrix.org lon joined the room. (08:55:01)
@longregen:matrix.org lon: Hi! I have a question: would anybody be interested in a services.vllm module? I was working on running it as a systemd service and hardening it, and I'm happy with the result... (08:57:13)
@longregen:matrix.org lon: (attachment: vllm.nix) (08:58:43)
@longregen:matrix.org lon: (I've never contributed to nixpkgs, so I'm not sure how high-quality this is) (08:59:15)
@longregen:matrix.org lon:

The interesting part is

      MemoryDenyWriteExecute = false; # Needed for CUDA/PyTorch JIT
      PrivateDevices = false; # Needed for GPU access
      RestrictAddressFamilies = ["AF_UNIX" "AF_INET" "AF_INET6" "AF_NETLINK"];
      DevicePolicy = "closed"; # Only allow the following devices, based on strace usage:
      DeviceAllow = lib.flatten [
        # Basic devices
        "/dev/null rw"
        "/dev/urandom r"
        "/dev/tty rw"

        # NVIDIA control devices
        "/dev/nvidiactl rw"
        "/dev/nvidia-modeset rw"
        "/dev/nvidia-uvm rw"
        "/dev/nvidia-uvm-tools rw"

        # One entry per GPU index in cfg.cudaDevices (a space-separated string);
        # lib.flatten above splices this sublist into DeviceAllow.
        (map (i: "/dev/nvidia${i} rw") (lib.splitString " " cfg.cudaDevices))

        # NVIDIA capability devices
        "/dev/nvidia-caps/nvidia-cap1 r"
        "/dev/nvidia-caps/nvidia-cap2 r"
      ];
      ProtectKernelTunables = true;
      ProtectKernelModules = true;
      ProtectControlGroups = true;
      RestrictNamespaces = true;
      LockPersonality = true;
      RestrictRealtime = true;
      RestrictSUIDSGID = true;
      RemoveIPC = true;
      PrivateMounts = true;
      PrivateUsers = true;
      ProtectHostname = true;
      ProtectKernelLogs = true;
      ProtectClock = true;
      ProtectProc = "invisible";
      UMask = "0077";
      CapabilityBoundingSet = ["CAP_SYS_NICE"];
      AmbientCapabilities = ["CAP_SYS_NICE"];
09:06:10
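For readers without the attachment, the serviceConfig settings above would sit inside a NixOS module roughly like this. The module path, option names, and defaults here are illustrative guesses, not taken from the attached vllm.nix:

```nix
# Hypothetical skeleton around the hardening snippet above;
# the real vllm.nix attachment may be structured differently.
{ config, lib, pkgs, ... }:
let
  cfg = config.services.vllm;
in
{
  options.services.vllm = {
    enable = lib.mkEnableOption "vLLM inference server";
    cudaDevices = lib.mkOption {
      type = lib.types.str;
      default = "0";
      description = "Space-separated GPU indices to expose to the service.";
    };
  };

  config = lib.mkIf cfg.enable {
    systemd.services.vllm = {
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        DynamicUser = true;
        # ...the DeviceAllow/sandboxing settings from the snippet go here...
      };
    };
  };
}
```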
@connorbaker:matrix.org connor (he/him):

Two things I've promised to look at today:

  1. Bumping the version of protobuf used by OpenCV, which hasn't been updated in a while (need to backport to 25.05 as well).
  2. Figuring out how to revert https://github.com/NixOS/nixpkgs/pull/414647 in a way that doesn't break consumers of OpenCV -- really don't want cudatoolkit propagated to all consumers of OpenCV.
17:30:42
24 Jun 2025
@connorbaker:matrix.org connor (he/him): :L (23:45:47)


