NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

9 Dec 2024
@hexa:lossy.network hexa: ok, apparently not 15:44:08
@hexa:lossy.network hexa: nvidia_uvm doesn't get loaded at boot anymore 15:44:25
@hexa:lossy.network hexa: * it seems like nvidia_uvm doesn't get loaded at boot anymore 15:44:31
@ss:someonex.net SomeoneSerge (back on matrix): https://github.com/NixOS/nixpkgs/issues/334180 15:52:57
@hexa:lossy.network hexa:
fbdcdde Kiskae             2024-05-22 13:46 +0200 308│             # Don't add `nvidia-uvm` to `kernelModules`, because we want
fbdcdde Kiskae             2024-05-22 13:46 +0200 309│             # `nvidia-uvm` be loaded only after `udev` rules for `nvidia` kernel
fbdcdde Kiskae             2024-05-22 13:46 +0200 310│             # module are applied.
fbdcdde Kiskae             2024-05-22 13:46 +0200 311│             #
fbdcdde Kiskae             2024-05-22 13:46 +0200 312│             # Instead, we use `softdep` to lazily load `nvidia-uvm` kernel module
fbdcdde Kiskae             2024-05-22 13:46 +0200 313│             # after `nvidia` kernel module is loaded and `udev` rules are applied.
fbdcdde Kiskae             2024-05-22 13:46 +0200 314│             extraModprobeConfig = ''
fbdcdde Kiskae             2024-05-22 13:46 +0200 315│               softdep nvidia post: nvidia-uvm
fbdcdde Kiskae             2024-05-22 13:46 +0200 316│             '';
16:03:03
@ss:someonex.net SomeoneSerge (back on matrix): Yeah, somehow softdep breaks with the open driver? 16:03:59
@hexa:lossy.network hexa: nope, reformat 16:03:59
@hexa:lossy.network hexa: ok, has been there since 2023.11 16:04:16
@hexa:lossy.network hexa: and yeah, I'm on the open driver 16:04:19
@ss:someonex.net SomeoneSerge (back on matrix): https://github.com/NixOS/nixpkgs/issues/334180#issuecomment-2284518816 16:04:30
@ss:someonex.net SomeoneSerge (back on matrix): atry 16:04:31
@ss:someonex.net SomeoneSerge (back on matrix): * Atry 16:04:33
@hexa:lossy.network hexa: we should probably deduplicate all issues to this one 16:05:04
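
Note: the symptom discussed above (`nvidia_uvm` missing after boot although `nvidia` is loaded) can be checked by reading `/proc/modules`. The following is only a minimal diagnostic sketch, not something posted in the room; the function names `loaded_modules` and `is_module_loaded` are hypothetical helpers introduced for illustration. It assumes the usual `/proc/modules` layout, where each line begins with the module name, and it normalizes hyphens to underscores because the kernel reports `nvidia_uvm` while the modprobe configuration writes `nvidia-uvm`.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in /proc/modules-formatted text."""
    # Each non-empty line starts with the module name, followed by size, refcount, etc.
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """Check for a module, treating 'nvidia-uvm' and 'nvidia_uvm' as the same name."""
    return name.replace("-", "_") in loaded_modules(proc_modules_text)


def report(path: str = "/proc/modules") -> dict[str, bool]:
    """Read the module list from `path` and report on the two modules of interest."""
    text = Path(path).read_text()
    return {mod: is_module_loaded(mod, text) for mod in ("nvidia", "nvidia-uvm")}
```

On a Linux machine, calling `report()` right after boot returns a dictionary such as `{"nvidia": True, "nvidia-uvm": False}` when the softdep did not fire; it says nothing about why the module was not loaded.
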
11 Dec 2024
@magic_rb:matrix.redalder.org joined the room. 00:50:41
@magic_rb:matrix.redalder.org:

cross post from #dev:nixos.org

anyone touch the nvidia driver code? packaging i mean. The long standing bug of "use xrandr twice and you get a segfault in X11" doesn't happen to me anymore apparently

00:51:08
@hexa:lossy.network hexa:

copying path '/nix/store/62vk99s9kdcjj4x64wcw22a7rwbfnm36-python3.12-onnxruntime-1.20.1' from 'ssh://hexa@build2.darmstadt.ccc.de'

20:11:21
@hexa:lossy.network hexa: 🥳 20:11:23
@hexa:lossy.network hexa: now if only the cuda build was working 20:11:35
@hexa:lossy.network hexa: https://github.com/microsoft/onnxruntime/issues/22855#issue-2662882047 20:58:10
@hexa:lossy.network hexa: ah, very coool 🫠 20:58:14
12 Dec 2024
@connorbaker:matrix.org connor (he/him): Ah right that thing
I was told CMake is supposed to understand it’s a header-only library and not try to actually link against a shared object file, so not sure why it’s doing exactly that and causing the build to fail
01:48:43
@connorbaker:matrix.org connor (he/him): Ah I need to actually respond to the issue after I try their suggestion 01:50:04
@connorbaker:matrix.org connor (he/him): How is there so little time in a day 😭 01:50:57
@connorbaker:matrix.org connor (he/him): ugh, my main desktop is gone from my tailscale network and unreachable from the other hosts. Damn Intel chip probably did a voltage whoopsy and needs to be hard reset. 04:05:37
@connorbaker:matrix.org connor (he/him): Cool that shit is gonna be broken until I can fly home at the end of the month lmao 04:42:54
@msanft:matrix.org Moritz Sanft: If I'm not mistaken, this brings in a dependency of nvidia-x11 upon all of the packages listed here, right?
https://github.com/NixOS/nixpkgs/blob/eac1633a086e8e109e00ce58c0b47721da1dbdfd/pkgs/os-specific/linux/nvidia-x11/generic.nix#L112

I wondered why perl is included in my closure, and thus discovered the mesa dependency of the driver, then got confused. If this is really the case, this is something you wouldn't want on a space-constrained system, right?
15:20:27
13 Dec 2024
@sielicki:matrix.org sielicki: hey connor -- I can't request a github review on https://github.com/aws/aws-ofi-nccl/pull/745 for some reason, but I would like a review from you (or anyone else idling here) if you can find time. It's using the external cuda flake. 04:07:57
@sielicki:matrix.org sielicki: I have a very thorough libfabric derivation on my personal laptop that I intend to finish over the holidays and propose to nixpkgs proper 04:08:36
@sielicki:matrix.org sielicki: * hey connor -- I can't request a github review on https://github.com/aws/aws-ofi-nccl/pull/745 for some reason, but I would like a review from you (or anyone else idling here) if you can find time. Specifically asking you because it's using your external cuda flake. 04:09:30
@sielicki:matrix.org sielicki: want to call it out here (mostly as a reminder to myself): the openmpi drv in nixpkgs should be capable of completely detaching from ucx/ucc, but it's not simple to make that happen today. On aws you shouldn't need ucx/ucc at all. 04:13:37
