!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

211 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
42 Servers



Sender | Message | Time
16 Jul 2024
@hexa:lossy.networkhexa (UTC+1)that happened to me … never?11:32:43
@connorbaker:matrix.orgconnor (he/him) (UTC-7) What’s the mechanism which allows us to build the tests in passthru.tests? 13:27:48
@connorbaker:matrix.orgconnor (he/him) (UTC-7)

SomeoneSerge (UTC+3): I'm having a bit of trouble with GPU-access in the sandbox. In particular, I've enabled it in my NixOS config with

{
  programs.nix-required-mounts = {
    enable = true;
    presets.nvidia-gpu.enable = true;
    # TODO: Fix merging of presets
    # error: The option `programs.nix-required-mounts.allowedPatterns.nvidia-gpu.unsafeFollowSymlinks' has conflicting definition values:
    # - In `/nix/store/dk2rpyb6ndvfbf19bkb2plcz5y3k8i5v-source/nixos/modules/programs/nix-required-mounts.nix': false
    # - In `/nix/store/2ja2h1nd0z2bw56cl4bn37cb9d18hnzr-source/devices/nixos-desktop/hardware.nix': true
    # Use `lib.mkForce value` or `lib.mkDefault value` to change the priority on any of these definitions.
    # TODO: After enabling running into this error when trying to build a derivation which has requiredSystemFeatures = [ "cuda" ];
    # error:
    #   … while setting up the build environment
    #
    #   error: getting attributes of path '/nix/store/0p1qsszik7hwjddzmyhikq9ywr2ki69l-systemd-minimal-255.6/sbin/bin': No such file or directory
    # Perhaps this is due to the systemd directory in /run/opengl-driver/lib?
    allowedPatterns.nvidia-gpu.unsafeFollowSymlinks = lib.mkForce true;
  };
}

I'm trying to enable the GPU portion of CMake's CUDA test suite (https://github.com/ConnorBaker/nixpkgs/commit/543cf7d2ec330286ba566e6e1187e531d155c5d0), but failing. I thought following the symlinks when mounting would help because before that I was unable to access the ${addDriverRunpath.driverLink}/lib directory (multiple symlinks), but it now fails with

$ nix build --impure -L .#cudaPackages.cmake-cuda-tests.tests.withGpu
warning: Nix search path entry '/nix/var/nix/profiles/per-user/root/channels' does not exist, ignoring
warning: killing stray builder process 5264 ()...
error:
       … while setting up the build environment

       error: getting attributes of path '/nix/store/0p1qsszik7hwjddzmyhikq9ywr2ki69l-systemd-minimal-255.6/sbin/bin': No such file or directory

Any ideas?

17:48:11
@connorbaker:matrix.orgconnor (he/him) (UTC-7)

Ah, okay.
The thing addDriverRunpath.driverLink links to is /run/opengl-driver. That is in turn a symlink, created by this: https://github.com/NixOS/nixpkgs/blob/c82d9d313d5107c6ad3a92fc7d20343f45fa5ace/nixos/modules/hardware/graphics.nix#L5-L8
That derivation isn't exposed except as a path, used here:
https://github.com/NixOS/nixpkgs/blob/c82d9d313d5107c6ad3a92fc7d20343f45fa5ace/nixos/modules/hardware/graphics.nix#L112-L121
I updated my nixos config as follows, and it seems to work.

{
  programs.nix-required-mounts = {
    enable = true;
    presets.nvidia-gpu.enable = true;
    allowedPatterns.nvidia-gpu = {
      onFeatures = [
        "gpu"
        "nvidia-gpu"
        "opengl"
        "cuda"
      ];
      # It exposes these paths in the sandbox:
      paths =
        let
          inherit (pkgs.addOpenGLRunpath) driverLink;
          thingDriverLinkLinksTo =
            config.systemd.tmpfiles.settings.graphics-driver."/run/opengl-driver"."L+".argument;
        in
        [
          driverLink
          thingDriverLinkLinksTo
          "/dev/dri"
          "/dev/nvidia*"
        ];
    };
  };
}

Of course, that same process would need to be repeated for anything in there which is in turn a symlink (which is the purpose of unsafeFollowSymlinks, I suppose), but I'm not getting that odd systemd bin error any more.

18:11:02
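
A throwaway derivation that requests the `cuda` feature is one way to check a configuration like the one above. This is a minimal sketch and not something posted in the room: the name `gpu-sandbox-check` and the `ls` calls are illustrative, and it assumes the builder advertises `cuda` among its system features and that the driver link is /run/opengl-driver.

```nix
# gpu-sandbox-check.nix (hypothetical file name)
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "gpu-sandbox-check"
  {
    # Should match one of the onFeatures entries of the allowed pattern.
    requiredSystemFeatures = [ "cuda" ];
  }
  ''
    # Both listings should succeed if the extra paths were mounted into the sandbox.
    ls -l /run/opengl-driver/lib
    ls -l /dev/nvidia*
    touch $out
  ''
```

Building it with `nix-build gpu-sandbox-check.nix` and reading the log should show whether the driver libraries and device nodes are visible from inside the build.
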
@mkiefel:matrix.orgmkiefel* Hi! I'm trying to get an application to work with libGL on a Jetson Orin AGX (with Ubuntu as the host Linux). For context, I am trying to get a camera image from a device with libargus (which requires GL). I'm not on the latest from unstable; maybe that is the issue. I've already tried pre-loading various GL libs from the base image of Jetpack, but to no avail. Does anybody have some pointers for me, please?19:44:24
@mkiefel:matrix.orgmkiefel
In reply to @mkiefel:matrix.org
Man, I got it. Somehow the wrong libEGL_nvidia.so got picked up. With the right one it works. This kept me busy this afternoon. :) In any case, thanks so much for the great work on the cuda packages! I really appreciate all the work that you folks put into this.
20:04:26
@ss:someonex.netSomeoneSerge (utc+3)
In reply to @connorbaker:matrix.org

Answering from a phone, so this is curt. That's the reason the module mounts the closure of hardware.opengl.package by default. If you used mkForce somewhere, you could've overridden that accidentally. The symlink branch is for non-NixOS, but I don't trust it. I was thinking maybe a runtime closure computation (nix-store --query --requisites) might be a reasonable future alternative. We'll have to come up with something stable anyway, for CDI.
20:17:07
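
The runtime closure computation mentioned above could look roughly like the following. This is only a sketch of the idea, not how the module works: the script name `list-driver-closure` is made up, it assumes `nix-store` and `readlink` are on PATH, and it assumes /run/opengl-driver resolves to a store path.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.writeShellScriptBin "list-driver-closure" ''
  # Resolve the driver link to the store path it currently points at.
  driver="$(readlink -f /run/opengl-driver)"
  # Print that path together with everything it references at runtime.
  nix-store --query --requisites "$driver"
''
```

Each printed store path would be a candidate for mounting into the sandbox, which avoids following symlinks one by one.
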
@ss:someonex.netSomeoneSerge (utc+3)The datacenter driver is also merged into hardware.opengl.package isn't it?20:18:10
@ss:someonex.netSomeoneSerge (utc+3)* To be clear: the intention is that on nixos the user should never have to manually list all packages in the driver's closure. If you find that you need to that's either a bug or an edge case I failed to handle20:20:27
@ss:someonex.netSomeoneSerge (utc+3)
In reply to @mkiefel:matrix.org
Thanks. Could you still tell us which libegl was the wrong one and which one is the right?
20:21:58
@mkiefel:matrix.orgmkiefel
In reply to @ss:someonex.net
Sure. It went for /nix/store/cg66ia01r8226nr478rv2b7fffvrl4gg-xgcc-12.3.0-libgcc/lib/libEGL_nvidia.so.0 but should have picked the one in /usr/lib/aarch64-linux-gnu/tegra-egl/libEGL_nvidia.so.0. I think I need to do something like nixGL and set these libraries up when calling the executable. I am still a bit confused why setting export __EGL_VENDOR_LIBRARY_FILENAMES=/usr/lib/aarch64-linux-gnu/tegra-egl/nvidia.json didn't do the trick.
20:31:42
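
A nixGL-like wrapper of the kind described above might be sketched as follows. This is untested on a Jetson and not taken from the discussion: `app` is a placeholder for the real application, and prefixing LD_LIBRARY_PATH with the host's tegra-egl directory is an assumption about what makes the loader pick the right libEGL_nvidia.so.0.

```nix
{ pkgs ? import <nixpkgs> { }
, app ? pkgs.hello # placeholder for the actual application
}:

pkgs.symlinkJoin {
  name = "${app.name}-tegra-egl";
  paths = [ app ];
  nativeBuildInputs = [ pkgs.makeWrapper ];
  postBuild = ''
    # Wrap every executable so the host's Tegra EGL directory is searched first.
    for exe in "$out"/bin/*; do
      wrapProgram "$exe" \
        --prefix LD_LIBRARY_PATH : /usr/lib/aarch64-linux-gnu/tegra-egl
    done
  '';
}
```

Mixing host libraries into a Nix-built program this way can pull in incompatible dependencies, so it is best treated as a debugging aid rather than a fix.
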



Room Version: 9