NixOS CUDA - Public Room Timeline

	NixOS CUDA	311 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	61 Servers

Sender	Message	Time
19 May 2024
connor (he/him)	Eh I don't know about hardware :(	14:43:19
connor (he/him)	I will say though -- I thought AMD's x3D chips would provide a performance boost to compilation workloads, but that was not the case. So if you go for HEDT instead of professional-grade stuff, I think the 7950x would perform better than the 7950x3D.	14:44:12
Gaétan Lepage	That's really interesting!	14:45:46
Gaétan Lepage	`cores = 0` means "automatic"?	14:45:55
Gaétan Lepage	Right now, I use one remote machine that I ssh into to code (it has the nixpkgs clone). It is also where I run `nixpkgs-review` from, so it is in charge of the eval. Then, it uses another builder to perform the actual builds.	14:48:01
Gaétan Lepage	I don't develop directly from my laptop, because evaluations can themselves be quite heavy.	14:48:23
connor (he/him)	Yes, `cores = 0` is automatic. Weird that they didn't use `cores = auto` like they did with `max-jobs`.	14:50:57
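For reference, the two settings discussed here would look roughly like this in `nix.conf`. This is a minimal sketch; the file location and the choice to set both values are assumptions for illustration, not something stated in the conversation.

```conf
# /etc/nix/nix.conf (illustrative fragment)
# 0 lets each build use all cores available on the machine
cores = 0
# auto runs as many builds in parallel as there are CPUs
max-jobs = auto
```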
Gaétan Lepage	Ok	14:52:00
connor (he/him)	Oh yeah tell me about it -- part of the reason I switched to 96GB of RAM was because `nixpkgs-review` kept filling up my ZRAM just during evaluation. Although, I did learn that I get a compression ratio of about 5:1 when I set ZRAM to use ZSTD!	14:52:04
Gaétan Lepage	Oh wow	14:52:52
Gaétan Lepage	The price difference between 7950x and 7960x is quite massive...	14:56:49
connor (he/him)	The 7950x is a consumer-grade desktop part, the 7960x is part of AMD's HEDT offerings IIRC, so they charge a premium for it	15:08:30
Gaétan Lepage	Yes, quite a premium	15:09:01
SomeoneSerge (matrix works sometimes)	Well it was meant as an epsilon=10 approximation xDD Point being, it's weeks of running the CI, rather than, say, years?	15:11:37
connor (he/him)	aidalgol: running `nix-cuda-test` I see it on my nvidia-smi
	$ nvidia-smi
	Sun May 19 15:11:11 2024
	+-----------------------------------------------------------------------------------------+
	| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
	|-----------------------------------------+------------------------+----------------------+
	| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
	| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
	|                                         |                        |               MIG M. |
	|=========================================+========================+======================|
	|   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
	| 45%   56C    P2            347W /  500W |    8187MiB /  24564MiB |     96%      Default |
	|                                         |                        |                  N/A |
	+-----------------------------------------+------------------------+----------------------+
	+-----------------------------------------------------------------------------------------+
	| Processes:                                                                              |
	|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
	|        ID   ID                                                               Usage      |
	|=========================================================================================|
	|    0   N/A  N/A   3656630      C   ...y88kh-python3-3.11.9/bin/python3.11       8180MiB |
	+-----------------------------------------------------------------------------------------+	15:11:45
SomeoneSerge (matrix works sometimes)	Yessss absolutely outrageous	15:12:33
connor (he/him)	The hbv3 absolutely chugs through the first part of the `magma-cuda-static` build, which involves building all the C++ objects (the first 2745 of 3430 objects). However, it seems there aren't as many CUDA objects (or their dependencies prevent as many from being built in parallel as the C++ objects), and they take a long time to build, so instructions per cycle wins over number of cores. Look at all my cores! So few are being used :(	15:51:41
connor (he/him)	[image: Screenshot 2024-05-19 at 11.46.48 AM.png]	15:51:50
connor (he/him)	Oh my god	15:57:36
connor (he/him)	`real 17m29.002s user 0m2.368s sys 0m2.890s`	15:57:42
connor (he/him)	Okay so I guess the higher clock speed combined with the limited parallelism when building the CUDA objects results in it being only 2m faster than my i9-13900k	15:58:43
connor (he/him)	Also: `error: derivation '/nix/store/krfxsgln7gispk9lnfpiav36wja2sg9x-magma-2.7.2.drv' may not be deterministic: output '/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2' differs`	15:59:02
SomeoneSerge (matrix works sometimes)	Oh nice. Can you save a diffoscope before it's GCed?	15:59:30
connor (he/him)	Sure! How do I do that?	16:03:16
SomeoneSerge (matrix works sometimes)	I'm not sure if Nix does it without the `--rebuild`/`--check` option, but there should be another path beside `/nix/store/gmwh...-magma-2.7.2`. Something with a suffix (maybe `.check`)	16:08:42
connor (he/him)	$ ls -1 /nix/store/*-magma-*
	/nix/store/3qk2k6g7wpidmy0rs8gilqkmy14821ns-magma-2.7.2.tar.gz.drv
	/nix/store/6482b0xigkghwkx5fl97y85xqclcga96-magma-2.7.2-test.lock
	/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2.lock
	/nix/store/icmm2apcmxxl4zvx5k75ya8aj3n72ifm-magma-2.7.2.tar.gz
	/nix/store/krfxsgln7gispk9lnfpiav36wja2sg9x-magma-2.7.2.drv
	/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2:
	include
	lib	16:09:42
SomeoneSerge (matrix works sometimes)	You `nix run nixpkgs#diffoscope -- /nix/store/gm...-magma-2.7.2 /nix/store/...-magma-2.7.2.check`. There's a flag to export e.g. an HTML report	16:09:47
SomeoneSerge (matrix works sometimes)	Damn. I guess it threw it away then:)	16:10:24
SomeoneSerge (matrix works sometimes)	A perfectly sensible behaviour after spending 19 minutes of compute	16:10:54
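The workflow discussed above (rebuild with `--check`, keep the differing output, then run diffoscope on the pair) could be sketched as below. The helper names `check_path` and `rebuild_and_diff` are hypothetical, and the sketch assumes it is run from a nixpkgs checkout so that `nix-build -A` can resolve the attribute. It relies on `nix-build --check` combined with `--keep-failed` leaving a differing output at the original path plus a `.check` suffix, and on diffoscope's `--html` option for the report.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative and not part of Nix or diffoscope.

# Print the path where Nix keeps a differing rebuild when --check and --keep-failed are combined.
check_path() {
  printf '%s.check\n' "$1"
}

# Rebuild an attribute and, if a .check output was kept, diff it against the original.
# $1: attribute to rebuild, $2: store path of the existing output, $3: HTML report file.
rebuild_and_diff() {
  local attr="$1" out="$2" report="$3"
  # --check rebuilds an already-built derivation; --keep-failed preserves a differing output.
  nix-build --check --keep-failed -A "$attr" || true
  if [ -e "$(check_path "$out")" ]; then
    # diffoscope exits non-zero when it finds differences; the report is still written.
    nix run nixpkgs#diffoscope -- --html "$report" "$out" "$(check_path "$out")"
  else
    echo "no $(check_path "$out") found; outputs matched or were not kept" >&2
    return 1
  fi
}
```

Nothing runs until `rebuild_and_diff` is called with an attribute name, the existing output path, and a report filename, so the definitions can be sourced into a shell first and invoked once the build is worth repeating.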
connor (he/him)	I mean I have three desktops I can run three builds of it in parallel lol	16:11:15
