!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

251 Members, 46 Servers
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

24 May 2025
@little_dude:matrix.org little_dude *

Hello, this was a long time ago, but I'm finally back to trying to run ollama :D

saxpy doesn't work. I used this flake:

{
  description = "CUDA saxpy test";
  inputs.nixpkgs.url = "nixpkgs";
  outputs =
    { self, nixpkgs }:
    {
      devShell.x86_64-linux =
        let
          pkgs = import nixpkgs {
            system = "x86_64-linux";
            config.allowUnfree = true; # Required for CUDA
          };
        in
        pkgs.mkShell {
          name = "cuda-saxpy-shell";
          buildInputs = [
            pkgs.cudaPackages.saxpy
            pkgs.cudaPackages.cudatoolkit
          ];
          shellHook = ''
            export CUDA_PATH=${pkgs.cudatoolkit}
            export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
            export EXTRA_CCFLAGS="-I/usr/include"
            # Should I set this?
            # export LD_LIBRARY_PATH=${pkgs.cudaPackages.cudatoolkit.lib}/lib:$LD_LIBRARY_PATH
          '';
        };
    };
}

I'm running into the same(?) initialization error I think (see the log file attached) for LD_DEBUG=libs saxpy.

The output of nvidia-smi:

[little-dude@system76-laptop:~/cuda-tests]$ nvidia-smi 
Sat May 24 11:08:06 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   46C    P0            590W /  115W |      12MiB /   8188MiB |     13%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            3706      G   ...me-shell-48.1/bin/gnome-shell          2MiB |
+-----------------------------------------------------------------------------------------+
09:09:31
26 May 2025
@connorbaker:matrix.org connor (he/him) (UTC-7)
Was basically bedridden with exhaustion this weekend, starting to come back to life
Should be able to review changes to the CUDA lib PR today
17:52:27
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
Anyone feel like bridging to irc?..
22:30:01
27 May 2025
@connorbaker:matrix.org connor (he/him) (UTC-7)
Aaaaand I’m not gonna get a chance
02:21:35
@connorbaker:matrix.org connor (he/him) (UTC-7)
SomeoneSerge (UTC+U[-12,12]): you good for our usual weekly call in ~12h?
02:22:06
@connorbaker:matrix.org connor (he/him) (UTC-7)
Also, just got a ticket to NixCon so let me know if you’re going and want to catch up
02:22:30
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
In reply to @connorbaker:matrix.org
SomeoneSerge (UTC+U[-12,12]): you good for our usual weekly call in ~12h?
Yes please!
09:11:44
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
In reply to @connorbaker:matrix.org
Also, just got a ticket to NixCon so let me know if you’re going and want to catch up
Finally! You chose the most expensive nixcon in the eu I must say xD
09:12:45
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
But yes I plan to be there
09:12:58
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)

Hmm the driver is loaded correctly from the impure location, but the error is rather unspecific 🤔

     78774: calling init: /run/opengl-driver/lib/libcuda.so.1
...
CUDA error at cudaMalloc(&xDevice, N * sizeof(float)): initialization error
09:21:21
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)

# Should I set this?

No there's no need

09:21:57
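
For illustration, the flake from the earlier message could be trimmed along these lines once the `LD_LIBRARY_PATH` question is settled. This is only a sketch, assuming a NixOS host where the NVIDIA driver is enabled so that `libcuda.so.1` resolves from `/run/opengl-driver/lib` (as the `LD_DEBUG` excerpt above shows). The `devShells.<system>.default` output name and the `packages` argument are substitutions not present in the original flake, and dropping the `shellHook` is an assumption about what running the prebuilt `saxpy` binary needs.

```nix
{
  description = "CUDA saxpy test";
  inputs.nixpkgs.url = "nixpkgs";
  outputs =
    { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true; # Required for CUDA
      };
    in
    {
      # Assumption: use the newer `devShells.<system>.default` output
      # instead of the older `devShell.<system>` from the original flake.
      devShells.${system}.default = pkgs.mkShell {
        name = "cuda-saxpy-shell";
        packages = [ pkgs.cudaPackages.saxpy ];
        # No LD_LIBRARY_PATH export: in the log above, the driver library
        # is already picked up from /run/opengl-driver/lib.
      };
    };
}
```

With such a flake, `nix develop` followed by `LD_DEBUG=libs saxpy` should reproduce the same check as in the original report.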
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
Ah sorry, I saw at least one of the PRs (the runtime wrapper one) and was meaning to merge but then got confused by how it relates to the conversation in the issue
09:25:55
@ereslibre:ereslibre.social ereslibre
thanks a lot! 🙏
11:46:46
@little_dude:matrix.org little_dude
Yes :( Any suggestion to debug further?
12:43:52
@hexa:lossy.network hexa (UTC+1)
https://hydra.nix-community.org/eval/448541
20:34:35
@hexa:lossy.network hexa (UTC+1)
lots of jobs lost, can anyone look into this?
20:35:56
@hexa:lossy.network hexa (UTC+1)
       (stack trace truncated; use '--show-trace' to show the full, detailed trace)

       error: attribute 'cudaLib' missing
       at /nix/store/gzqv127zcha1gh0a3ib4k71mlw46nkyh-source/pkgs/top-level/release-cuda.nix:17:54:
           16|   lib = import ../../lib;
           17|   inherit (import ../development/cuda-modules/_cuda) cudaLib;
             |                                                      ^
           18| in
20:36:36
@hexa:lossy.network hexa (UTC+1)
cc connor (he/him) (UTC-7)
20:37:04
@connorbaker:matrix.org connor (he/him) (UTC-7)
Will take a look shortly, looks like a fixup missed that one
23:05:38
28 May 2025
@connorbaker:matrix.org connor (he/him) (UTC-7)
I think https://github.com/NixOS/nixpkgs/pull/411574 should fix everything
00:05:08
@connorbaker:matrix.org connor (he/him) (UTC-7)
(I mean, everything broken by the rename, not like, everything related to CUDA)
00:05:27
29 May 2025
@connorbaker:matrix.org connor (he/him) (UTC-7)
Okay work was very busy so I didn’t get a chance to review your changes since last week to the db PR Serge, apologies
03:53:28
@connorbaker:matrix.org connor (he/him) (UTC-7)
Made, for some reason, the bizarre choice of staying up late to work on a different approach than the one nix-eval-jobs takes: https://github.com/ConnorBaker/nix/tree/feat/eval-drvs
Essentially abusing CoW fork to do parallel eval of derivations in an incremental fashion
09:14:29
@little_dude:matrix.org little_dude
So fwiw setting hardware.nvidia.open = false; fixed the issue.
14:32:37
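
For readers landing here with the same initialization error, the reported fix would sit in a NixOS configuration roughly as follows. This is a hedged sketch, not little_dude's actual config: only `hardware.nvidia.open = false;` comes from the message, while the `services.xserver.videoDrivers` and `hardware.graphics.enable` lines are assumptions about a typical NVIDIA setup on a recent NixOS release.

```nix
{ ... }:
{
  # Assumption: a typical NVIDIA setup; not part of the original message.
  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.graphics.enable = true;

  # From the message: select the proprietary kernel module
  # rather than the open one.
  hardware.nvidia.open = false;
}
```

The change takes effect after a rebuild and a reboot, since it swaps the loaded kernel module.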
30 May 2025
@connorbaker:matrix.org connor (he/him) (UTC-7)
Got halfway through reviewing your PR today Serge, hopefully I can knock the rest out tomorrow if work's not too busy
07:58:13
@priyanshu_:matrix.org Priyanshu Pansari joined the room.
12:08:22
@sporeray:matrix.org Robbie Buxton joined the room.
21:02:30
31 May 2025
@trofi:matrix.org left the room.
13:47:01
@assert-inequality:matrix.org left the room.
19:33:16
2 Jun 2025
@deeok:matrix.org matrixrooms.info mod bot (does NOT read/send messages and/or invites; used for checking reported rooms) joined the room.
18:39:44

Room Version: 9