!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

181 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
36 Servers



29 Apr 2024
@vid:matrix.org vid: ok, I'm sorry to ask this, but do you see anything obviously wrong in my config? nvidia-smi does find the card, but none of the libraries/docker seem to work. /run/cdi/nvidia-container-toolkit.json exists but isn't populated. 13:30:06
@vid:matrix.org vid: there seems to be a fundamental problem when nvidia-container-toolkit is installed: every docker command yields "no help topic for" <cmd>. 13:38:56
@connorbaker:matrix.org connor (he/him) (UTC-5): @vid do you have an example container you're trying to run? Looks close to my setup, so I could give it a try. 14:30:37
@mjolnir:nixos.org NixOS Moderation Bot changed room power levels. 15:29:37
@vid:matrix.org vid: it was just the stock llama.cpp repo, following the instructions for the docker light setup. it was probably something I was doing wrong, but after spending a weekend on this, I got it running on ubuntu without pulling out a single hair. I'm going to have to stick to that camp, but I will keep an eye on nixos 'cause I really like the ideas. 15:47:43
@trexd:matrix.org trexd

vid: I found that doing docker run --gpus=all results in

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

whereas docker run --device nvidia.com/gpu=all will detect my GPU.

My minimal settings are documented in this issue. https://github.com/NixOS/nixpkgs/issues/305312

16:05:20
30 Apr 2024
@simon88812:matrix.org Piqué joined the room. 14:20:23
1 May 2024
@hacker1024:matrix.org hacker1024

Has anyone managed to get TensorFlow and PyTorch in the same Python environment on a recent nixos-unstable? This seems to have broken at some point in the last few months.

$ nix-shell -I nixpkgs=channel:nixos-unstable -p 'python3.withPackages (ps: with ps; [ torch tensorflow ])'

this derivation will be built:
  /nix/store/p7hnwqgxp8hm52qkw787r9i9akb1y9fd-python3-3.11.9-env.drv
building '/nix/store/p7hnwqgxp8hm52qkw787r9i9akb1y9fd-python3-3.11.9-env.drv'...
error: collision between `/nix/store/k0hpynrpwp0ihh86r1walxv0dcvij9ba-python3.11-grpcio-1.62.1/lib/python3.11/site-packages/grpc/__pycache__/__init__.cpython-311.opt-1.pyc' and `/nix/store/n67kryc5dcblnvb17h04fx7ivbbjjlk6-python3.11-grpcio-1.62.1/lib/python3.11/site-packages/grpc/__pycache__/__init__.cpython-311.opt-1.pyc'
error: builder for '/nix/store/p7hnwqgxp8hm52qkw787r9i9akb1y9fd-python3-3.11.9-env.drv' failed with exit code 255;
       last 1 log lines:
       > error: collision between `/nix/store/k0hpynrpwp0ihh86r1walxv0dcvij9ba-python3.11-grpcio-1.62.1/lib/python3.11/site-packages/grpc/__pycache__/__init__.cpython-311.opt-1.pyc' and `/nix/store/n67kryc5dcblnvb17h04fx7ivbbjjlk6-python3.11-grpcio-1.62.1/lib/python3.11/site-packages/grpc/__pycache__/__init__.cpython-311.opt-1.pyc'
       For full logs, run 'nix log /nix/store/p7hnwqgxp8hm52qkw787r9i9akb1y9fd-python3-3.11.9-env.drv'.
02:41:09
@connorbaker:matrix.org connor (he/him) (UTC-5): I haven't, but that seems about right -- they either don't pin or pin different versions of dependencies :l 04:44:10
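The collision above comes from two different grpcio store paths being merged into one environment. One way to sidestep it, sketched here as an assumption rather than a fix confirmed in this room, is to build a separate environment per framework. The file name envs.nix and the attribute names torchEnv and tensorflowEnv are hypothetical, and whether each environment builds on a given nixpkgs revision is not verified here.

```nix
# envs.nix (hypothetical file name)
# Sketch: one Python environment per framework, so that the two grpcio
# builds never have to be merged into a single site-packages tree.
{ pkgs ? import <nixpkgs> { } }:
{
  # Environment with PyTorch only.
  torchEnv = pkgs.python3.withPackages (ps: [ ps.torch ]);

  # Environment with TensorFlow only.
  tensorflowEnv = pkgs.python3.withPackages (ps: [ ps.tensorflow ]);
}
```

Each environment could then be built on its own, for example with nix-build envs.nix -A torchEnv. This does not put both libraries into one interpreter; it only avoids the merge step that fails.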
@connorbaker:matrix.org connor (he/him) (UTC-5)

I think my ISP hates me running nixpkgs-review almost as much as I do

[82/70/5750 built, 135/283/6926 copied (2415.7/87849.0 MiB), 2042.0/25876.6 MiB DL] connecting to 'ssh-ng://nix@nixos-build01'
05:24:16
@mjolnir:nixos.org NixOS Moderation Bot changed room power levels. 15:06:29
2 May 2024
@brandon:matrix.radiation.io brandon joined the room. 18:18:28
3 May 2024
@kaya:catnip.ee kaya joined the room. 14:01:31
@ironbound:hackerspace.pl ironbound changed their profile picture. 17:47:48
4 May 2024
@ss:someonex.net SomeoneSerge (Way down Hadestown) changed their display name from SomeoneSerge (is taking time off and doesn't want to hear about it) to SomeoneSerge (Way down Hadestown). 21:03:42
7 May 2024
@yklcs:matrix.org yklcs joined the room. 02:25:58
@yklcs:matrix.org yklcs: Hello, I was wondering whether cudaPackages.cudatoolkit with Nix would allow me to use multiple versions of CUDA on my machine, either with NixOS or by just using the Nix package manager. 02:31:16
@brandon:matrix.radiation.io brandon: yklcs, CUDA can be tricky but I've had good luck using nix shell and specific versions of cudatoolkit. 04:58:33
@yklcs:matrix.org yklcs
In reply to @brandon:matrix.radiation.io
yklcs: Cuda can be tricky but I've had good luck using nix shell and specific versions of cudatoolkit.
Thanks. Do you have any .nix files to share?
05:38:34
@ironbound:hackerspace.pl ironbound: yklcs: https://nixos.wiki/wiki/CUDA 10:06:58
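A minimal shell.nix sketch of the approach brandon describes, picking a CUDA toolkit version per shell. It assumes the nixpkgs revision in use exposes the versioned package sets cudaPackages_11 and cudaPackages_12 (attribute names vary between revisions) and that unfree packages are allowed. The cudaPackages argument is a hypothetical parameter introduced for illustration.

```nix
# shell.nix
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } # CUDA is unfree
# Assumption: this versioned attribute exists in your nixpkgs revision.
, cudaPackages ? pkgs.cudaPackages_12
}:
pkgs.mkShell {
  # Put the selected toolkit (nvcc and friends) on PATH inside the shell.
  packages = [ cudaPackages.cudatoolkit ];

  # Expose the toolkit location to build systems that look for CUDA_PATH.
  shellHook = ''
    export CUDA_PATH=${cudaPackages.cudatoolkit}
  '';
}
```

A shell with a different toolkit could then be entered with nix-shell --arg cudaPackages '(import <nixpkgs> { config.allowUnfree = true; }).cudaPackages_11'. The GPU driver still comes from the host system; only the toolkit is selected per shell.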
@msanft:matrix.org Moritz Sanft: Hey, I tried switching from virtualisation.docker.enableNvidia = true; to the more recent virtualisation.containers.cdi.dynamic.nvidia.enable = true;, hardware.nvidia-container-toolkit.enable = true; and features.cdi = true;. I'm using Docker daemon and client at v25, and since switching to the new configuration options, I see the following when trying to start containers with GPUs:

May 07 14:50:04 nixos docker[2350]: docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Searching online for a little, most of the people running into that issue didn't install the CTK properly. However, that shouldn't be the case with the options mentioned above, or am I wrong? Does anyone of you have another idea?
14:57:12
@trexd:matrix.org trexd
In reply to @msanft:matrix.org
Hey, I tried switching from virtualisation.docker.enableNvidia = true; to the more recent virtualisation.containers.cdi.dynamic.nvidia.enable = true;, hardware.nvidia-container-toolkit.enable = true; and features.cdi = true;. I'm using Docker daemon and client at v25, and since switching to the new configuration options, I see the following when trying to start containers with GPUs:

May 07 14:50:04 nixos docker[2350]: docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Searching online for a little, most of the people running into that issue didn't install the CTK properly. However, that shouldn't be the case with the options mentioned above, or am I wrong? Does anyone of you have another idea?
Can you try my suggestion above?
15:24:31
@trexd:matrix.org trexd
In reply to @trexd:matrix.org

I found that doing docker run --gpus=all results in

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Whereas docker run --device nvidia.com/gpu=all will detect my GPU. vid

My minimal settings are documented in this issue. https://github.com/NixOS/nixpkgs/issues/305312

This one, Moritz Sanft
15:24:43
@msanft:matrix.org Moritz Sanft: Ohh, that seems helpful! Will try! 15:25:25
@msanft:matrix.org Moritz Sanft: That works. Thank you! 15:34:30
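The setup that ended up working in this thread can be sketched as a NixOS module. This is a sketch under assumptions, not the configuration from the linked issue: placing features.cdi under virtualisation.docker.daemon.settings is an assumption about where the features.cdi = true; setting lives, and depending on the nixpkgs revision only one of the two NVIDIA options named in the thread may exist.

```nix
{ ... }:
{
  # One of the options named in the thread; on some revisions the equivalent
  # is virtualisation.containers.cdi.dynamic.nvidia.enable instead.
  hardware.nvidia-container-toolkit.enable = true;

  virtualisation.docker = {
    enable = true;
    # Assumption: the CDI feature flag goes into the Docker daemon settings.
    daemon.settings.features.cdi = true;
  };
}
```

With CDI, the GPU is requested as a device (docker run --device nvidia.com/gpu=all ...) rather than with --gpus=all, which is what produced the "could not select device driver" error in this thread.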
8 May 2024
@nrs-status:matrix.org thirdofmay18081814goya changed their display name from nrs-status to thirdofmay18081814goya. 00:55:57
@nrs-status:matrix.org thirdofmay18081814goya set a profile picture. 00:56:09
@connorbaker:matrix.org connor (he/him) (UTC-5): I am re-emerging from the exhaustion surrounding travel and interviews; will be hammering the PR I have open into shape tomorrow; hopefully ready for review and merge soon so we can get new releases of CUDA, cuDNN, etc. 03:02:17
