NixOS CUDA - Public Room Timeline

	NixOS CUDA	290 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	58 Servers

Sender	Message	Time
10 Sep 2024
@adam:robins.wtf	* yes, NixOS. 3060	23:52:13
@adam:robins.wtf	06:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)	23:52:25
11 Sep 2024
@adam:robins.wtf	results of my ollama testing are: 0.3.5 works with cudaPackages 12_3 and 12_4; 0.3.9 works on 12_3, broken on 12_4; 0.3.10 works on 12_3, broken on 12_4	01:13:46
connor (burnt/out) (UTC-8)	It is surprising to me that 0.3.5 works with CUDA 12 at all; I guess there were no breaking API changes on stuff they relied on?	18:05:26
12 Sep 2024
connor (burnt/out) (UTC-8)	[In reply to @connorbaker:matrix.org: So... I really don't want to have to figure out testing and stuff for OpenCV for https://github.com/NixOS/nixpkgs/pull/339619. OpenCV 4.10 (we have 4.9) supports CUDA 12.4+. Maybe just updating it to punt the issue down the road is fine? (Our latest CUDA version right now is 12.4.)] I started writing a `pkgs.testers` implementation for what Serge suggested here: https://matrix.to/#/!eWOErHSaiddIbsUNsJ:nixos.org/$phSCjT-mxTap-ccF98Z7hZakHk3_-jjkPw2fIvzBhjA?via=nixos.org&via=matrix.org&via=nixos.dev	00:32:04
connor (burnt/out) (UTC-8)	SomeoneSerge (nix.camp): as a short-term thing, are you okay with me patching out OpenCV's requirement that CUDA version match so we can merge the CUDA fix?	23:30:01
connor (burnt/out) (UTC-8)	I'm in the process of implementing a tester (https://github.com/NixOS/nixpkgs/pull/341471), but it's taking a bit and I'd like OpenCV fixed (or at least buildable) with CUDA, without breaking a bunch of downstream consumers of OpenCV (like FFmpeg).	23:35:35
13 Sep 2024
SomeoneSerge (back on matrix)	* Sorry my availability has been limited this week	10:19:55
SomeoneSerge (back on matrix)	[In reply to @connorbaker:matrix.org: wouldn't things like API changes between versions cause breakage? EDIT: I guess they would cause build failures... my primary concern was that it would cause failures at runtime, but I suppose that's not really a problem for compiled targets. Relative to libc, NVIDIA's libraries change way, way more between releases (even minor versions!).] Yeah, it occurred to me right after posting that for the issue you're actually describing we need very different tests. What I proposed was basically ensuring that the expected versions of dependencies are loaded when running in isolation. What you actually wanted to ensure is that when a different version has already been loaded (which is guaranteed to happen with Python), the runtime still works.	10:22:00
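One way to look at "which versions are actually loaded" is to inspect the shared objects mapped into a running process. The following is a minimal sketch, assuming Linux and its `/proc/self/maps` file; the helper names and the list of library prefixes are illustrative assumptions, not part of nixpkgs or of the proposed tester.

```python
import re
from pathlib import Path


def mapped_libraries(maps_text: str, pattern: str) -> list[str]:
    """Return unique file paths from a /proc/<pid>/maps dump whose basename matches pattern."""
    regex = re.compile(pattern)
    found = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(maxsplit=5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5]
        name = path.rsplit("/", 1)[-1]
        if regex.search(name) and path not in found:
            found.append(path)
    return found


def cuda_libraries_in_this_process() -> list[str]:
    """List CUDA-looking libraries mapped into the current process (empty off Linux)."""
    maps = Path("/proc/self/maps")
    if not maps.exists():
        return []
    # The prefix list is an assumption; extend it for other CUDA libraries.
    return mapped_libraries(maps.read_text(), r"^lib(cudart|cublas|cudnn|nvrtc)")
```

Calling `cuda_libraries_in_this_process()` once after `import torch` and again after `import cv2` would show whether two different copies of the same library ended up in one process, which is the situation described above.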
SomeoneSerge (back on matrix)	[In reply to @connorbaker:matrix.org: SomeoneSerge (nix.camp): as a short-term thing, are you okay with me patching out OpenCV's requirement that CUDA version match so we can merge the CUDA fix?] Sure, let's try. I'd still check something trivial like: test1: `import torch; torch.randn(10, 10, device="cuda").sum().item(); import cv2` (then do something with cv2 and cuda); test2: `import cv2` (do something with cv2 and cuda), then `import torch; torch.randn(10, 10, device="cuda").sum().item()`	10:24:51
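The two import orders suggested above can be sketched as a small script that runs each ordering in a fresh interpreter, so that one ordering cannot influence the other. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and `cv2.cuda.getCudaEnabledDeviceCount()` merely stands in for "do something with cv2 and cuda".

```python
import importlib.util
import subprocess
import sys

# Smoke snippets per module. The cv2 snippet is a stand-in chosen for
# illustration; a real test would exercise an actual CUDA code path.
SNIPPETS = {
    "torch": 'import torch\nprint(torch.randn(10, 10, device="cuda").sum().item())\n',
    "cv2": "import cv2\nprint(cv2.cuda.getCudaEnabledDeviceCount())\n",
}


def build_script(order):
    """Concatenate the snippets in the given import order."""
    return "".join(SNIPPETS[name] for name in order)


def run_order(order):
    """Run one ordering in a fresh interpreter; return 'skipped', 'ok', or 'failed'."""
    if any(importlib.util.find_spec(name) is None for name in order):
        return "skipped"  # a module is not installed in this environment
    result = subprocess.run(
        [sys.executable, "-c", build_script(order)],
        capture_output=True,
        text=True,
    )
    return "ok" if result.returncode == 0 else "failed"
```

Running `run_order(("torch", "cv2"))` and `run_order(("cv2", "torch"))` on a machine with a GPU covers test1 and test2 respectively; a "failed" result in only one ordering would point at a library-loading conflict rather than a general breakage.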
connor (burnt/out) (UTC-8)	[In reply to @ss:someonex.net: Sorry my availability has been limited this week] No need for apology; all volunteer time :)	16:50:50
connor (burnt/out) (UTC-8)	[In reply to @ss:someonex.net: Sure, let's try. I'd still check something trivial like: test1: `import torch; torch.randn(10, 10, device="cuda").sum().item(); import cv2` (then do something with cv2 and cuda); test2: `import cv2` (do something with cv2 and cuda), then `import torch; torch.randn(10, 10, device="cuda").sum().item()`] Ooh that’s a good minimal test (hopefully), mind if I use that?	16:52:03
connor (burnt/out) (UTC-8)	To clarify SomeoneSerge (nix.camp), do you want a test like that in the OpenCV PR, or is it okay if that's tracked (via https://github.com/NixOS/nixpkgs/issues/341650) and added later?	23:07:24
14 Sep 2024
	SomeoneSerge (back on matrix) changed their display name from SomeoneSerge (nix.camp) to SomeoneSerge (utc+3).	11:37:51
15 Sep 2024
@adam:robins.wtf	[In reply to @connorbaker:matrix.org: It is surprising to me that 0.3.5 works with CUDA 12 at all; I guess there were no breaking API changes on stuff they relied on?] Ok, so I think all the version stuff was a red herring. I believe I've found the culprit, which is that this derivation isn't ending up in the final NixOS system: https://github.com/NixOS/nixpkgs/blob/345c263f2f53a3710abe117f28a5cb86d0ba4059/pkgs/by-name/ol/ollama/package.nix#L122	17:33:36
@adam:robins.wtf	I run ollama in an Incus (LXC) container with the GPU passed in, but I don't build the system configuration on that host	17:34:12
@adam:robins.wtf	* manually copying it over from the build host allows ollama to successfully work	17:34:37
SomeoneSerge (back on matrix)	[In reply to @connorbaker:matrix.org: To clarify SomeoneSerge (nix.camp), do you want a test like that in the OpenCV PR, or is it okay if that's tracked (via https://github.com/NixOS/nixpkgs/issues/341650) and added later?] Ouch, I thought I had replied. A manual test is sufficient, but it is also needed, because I suppose we do want to make sure OpenCV actually works after merging?	19:00:51
SomeoneSerge (back on matrix)	[In reply to @adam:robins.wtf: manually copying it over from the build host allows ollama to successfully work] Could you elaborate for the others: what is it that needs to be manually copied?	19:01:34
@adam:robins.wtf	https://github.com/NixOS/nixpkgs/pull/342127 should fix it	19:08:33
@adam:robins.wtf	it's that cudaToolkit/cuda-merged env that wasn't being included	19:08:56
SomeoneSerge (back on matrix)	[In reply to @adam:robins.wtf: it's that cudaToolkit/cuda-merged env that wasn't being included] Ehh it shouldn't be included	19:09:27
@adam:robins.wtf	well, i'm open to other fixes, but without that env it fails to find `libcudart.so.12`	19:10:08
@adam:robins.wtf	that's the same env being used to build ollama against cuda, so i assume it's expecting the files to be there at runtime too	19:10:32
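A quick way to reproduce a "library not found" symptom like this inside the container is to ask the dynamic loader directly. The following is a minimal sketch with a hypothetical helper name; it assumes the CUDA 12 runtime's usual soname, `libcudart.so.12`.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can resolve and load soname."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True
```

For example, `can_load("libcudart.so.12")` could be run on both the build host and the container. Note that this uses the Python process's own search path (for instance `LD_LIBRARY_PATH`), not the RPATH baked into the ollama binary, so it only approximates what ollama itself sees.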