!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

280 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
58 Servers



Sender | Message | Time
22 Sep 2025
@glepage:matrix.orgGaétan Lepage To work on the Azure stuff? 09:16:35
@glepage:matrix.orgGaétan Lepage I have gained access to a beefy builder since then, so I'm less constrained than before on the x86_64-linux side of things. 09:17:08
@matthewcroughan:defenestrate.itmatthewcroughan SomeoneSerge (back on matrix): can you bump or have you bumped privately https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/opensfm/default.nix#L144 ? 20:19:03
@matthewcroughan:defenestrate.itmatthewcroughan I need it for https://github.com/NixOS/nixpkgs/pull/442003 20:19:15
@matthewcroughan:defenestrate.itmatthewcroughan because none of the SfM tools work properly in nixpkgs anymore 20:19:27
@matthewcroughan:defenestrate.itmatthewcroughan SfM tools are tools that take a directory of images and tag them with inferred data like rotation, coordinates etc., to pipe into gaussian splat utils like brush 20:19:52
@matthewcroughan:defenestrate.itmatthewcroughan https://github.com/NixOS/nixpkgs/pull/438672 20:43:43
@matthewcroughan:defenestrate.itmatthewcroughan hmm, this gets close to fixing colmap 20:43:50
23 Sep 2025
@ss:someonex.netSomeoneSerge (back on matrix) matthewcroughan: hard pressed right now, maybe end of week... 00:30:01
@hexa:lossy.networkhexa did anyone here make substitutions from the flox cache work yet? 01:51:21
@glepage:matrix.orgGaétan Lepage I had some cache hits, if this is your question 08:23:23
@glepage:matrix.orgGaétan Lepage FYI Zowoq has restored the cuda jobset on nix-community:
https://matrix.to/#/%21PbtOpdWBSRFbEZRLIf%3Anumtide.com/%240hueN5_QPZEhj5g4nqSa-gFgmCYe3CMKlG79bd2E-nM?via=blad.is&via=matrix.org&via=envs.net
08:24:47
@hugo:okeso.euHugo

@ss:someonex.net Thanks for your feedback on my PR adding CUDA tests.

I implemented most of it, but cannot test because I cannot rebuild xformers with CUDA enabled 🫤. I tried many times on multiple machines but no luck. It builds fine without CUDA though.

10:19:57
@hugo:okeso.euHugo Screenshot_20250923_122127.png
10:22:03
@hugo:okeso.euHugo Screenshot_20250923_122117.png
10:22:24
@hugo:okeso.euHugo I also find it really weird that CPU stays at 100% while the build is stuck. 10:22:25
@hugo:okeso.euHugo Is that a known bug in Nix or in the build tools? 10:22:55
@a-kenji:matrix.orgkenji changed their display name from a-kenji to kenji. 10:42:31
@albertlarsan68:albertlarsan.frAlbert Larsan nvcc spawns multiple compiler instances per invocation, and ninja spawns as many nvcc instances as there are cores/threads, which overcommits the CPU (e.g. you have 16 threads, ninja spawns 16 nvcc instances, each nvcc instance spawns 6 cicc instances, and each cicc instance consumes one full CPU thread, so you end up with 16*6=96 processes trying to run at the same time). The build is not stuck, it just takes a very long time (because it tries to do more at the same time than your computer can handle) 11:32:01
@hugo:okeso.euHugo Thanks for this explanation. My impression is more that the system goes OOM and then something gets stuck and never resumes. 11:44:59
@hugo:okeso.euHugo Especially since a working build (previous release) finished in 14 minutes. 12:08:09
@gregorburger:matrix.orgGregor Burger Hi Guys, quick question: is there an equivalent of cudaPackages.backendStdenv for clang? 12:09:25
@gregorburger:matrix.orgGregor Burger * Hi, quick question: is there an equivalent of cudaPackages.backendStdenv for clang? 12:11:37
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) It looks like you’re out of memory and swapping hard; try lowering the number of cores given to the job or the number of parallel instances NVCC runs with (per what Albert said above). I’ve had to enable ZRAM (which has been highly effective) for some builds even on my desktops with 96GB of RAM. 12:53:57
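The "lower the number of cores given to the job" suggestion could be sketched as the NixOS fragment below. The values 4 and 1 are assumptions for illustration, not numbers anyone in the room recommended, and whether a given package's build actually honours the core limit depends on its build system. If the ratio from Albert's example holds, 4 parallel nvcc invocations would mean roughly 4*6=24 cicc processes instead of 96.

```nix
{
  # Illustrative values only (assumed); tune them to the machine's RAM.
  nix.settings = {
    # Exported to builders as NIX_BUILD_CORES; build systems that respect it
    # start at most this many parallel jobs per derivation.
    cores = 4;
    # Number of derivations the daemon builds concurrently.
    max-jobs = 1;
  };
}
```

For a one-off build the same limits can be passed on the command line, e.g. `nix build --cores 4 --max-jobs 1`.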
@hugo:okeso.euHugo

I have this policy on my server, and a similar one on my desktop. Should that not prevent the OS from swapping?

  systemd.services.nix-daemon.serviceConfig = {
    CPUAccounting = true;
    AllowedCPUs = "2-15";
    MemoryAccounting = true;
    MemoryHigh = "48G";
    MemoryMax = "56G";
  };
12:55:49
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) Currently no; I’m working on making the setup hooks and everything generic for Clang in https://github.com/NixOS/nixpkgs/pull/437723 but I ran into issues doing that and it’s not a high priority for me in the scope of that PR. Any particular use case? 12:56:18
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) I don’t know enough about systemd to answer that, but I know some of the flash attention kernel builds consume at least a hundred GB of RAM, and if you’re seeing the build stall, that reminds me of the swapping-to-disk behavior I’d seen previously. (I may also have misinterpreted the BTOP screenshot.) 12:58:25
@hugo:okeso.euHugo Apparently there is an extra setting MemorySwapMax = "0"; that can disallow swap for a systemd unit. 13:37:42
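Combined with the policy Hugo posted earlier, the extra setting might look like the sketch below. This assumes the builds run inside the nix-daemon unit's cgroup (the default setup without Nix's experimental cgroups feature) and has not been tested against the xformers build discussed here.

```nix
{
  systemd.services.nix-daemon.serviceConfig = {
    CPUAccounting = true;
    AllowedCPUs = "2-15";
    MemoryAccounting = true;
    # Above this threshold the unit's processes are throttled and memory is reclaimed aggressively.
    MemoryHigh = "48G";
    # Hard limit; exceeding it invokes the OOM killer inside the unit.
    MemoryMax = "56G";
    # Disallow swap for this unit, so memory pressure fails fast instead of thrashing.
    MemorySwapMax = "0";
  };
}
```

With swap disallowed, a build that outgrows MemoryMax should be killed rather than stall, which at least makes the failure mode visible.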
@gregorburger:matrix.orgGregor Burger We would like to compile our codebase with both gcc and clang to get a broader coverage of warnings and errors. 14:33:43



Room Version: 9