| 19 May 2024 |
connor (he/him) | Eh I don't know about hardware :( | 14:43:19 |
connor (he/him) | I will say though -- I thought AMD's x3D chips would provide a performance boost to compilation workloads, but that was not the case. So if you go for HEDT instead of professional-grade stuff, I think the 7950x would perform better than the 7950x3D. | 14:44:12 |
Gaétan Lepage | That's really interesting! | 14:45:46 |
Gaétan Lepage | cores = 0 means "automatic"? | 14:45:55 |
Gaétan Lepage | Right now, I use one remote machine that I SSH into to code (it has the nixpkgs clone).
It is also where I run nixpkgs-review from, so it is in charge of the eval.
Then, it uses another builder to perform the actual builds. | 14:48:01 |
Gaétan Lepage | I don't develop directly from my laptop, because evaluations can themselves be quite heavy. | 14:48:23 |
connor (he/him) | Yes, cores = 0 is automatic. Weird that they didn't use cores = auto like they did with max-jobs. | 14:50:57 |
Gaétan Lepage | Ok | 14:52:00 |
connor (he/him) | Oh yeah tell me about it -- part of the reason I switched to 96GB of RAM was because nixpkgs-review kept filling up my ZRAM just during evaluation. Although, I did learn that I get a compression ratio of about 5:1 when I set ZRAM to use ZSTD! | 14:52:04 |
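For anyone wanting to check their own ratio, here is a minimal sketch. It assumes the zram device exposes the kernel's `mm_stat` file (typically `/sys/block/zram0/mm_stat`), whose first two fields are the original and compressed data sizes in bytes. The function name and the sample line are made up for illustration and do not come from the conversation.

```python
def zram_compression_ratio(mm_stat_line: str) -> float:
    """Return original/compressed size from one line of a zram mm_stat file."""
    fields = mm_stat_line.split()
    orig_data_size = int(fields[0])   # uncompressed bytes stored in the device
    compr_data_size = int(fields[1])  # bytes those pages occupy once compressed
    if compr_data_size == 0:
        raise ValueError("zram device holds no compressed data yet")
    return orig_data_size / compr_data_size


# Hypothetical sample line: 5 GiB of pages compressed down to 1 GiB.
sample = "5368709120 1073741824 1100000000 0 1100000000 0 0 0 0"
print(f"{zram_compression_ratio(sample):.1f}:1")
```

On a live system the line would be read from the `mm_stat` file itself; the device name `zram0` is an assumption.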
Gaétan Lepage | Oh wow | 14:52:52 |
Gaétan Lepage | The price difference between 7950x and 7960x is quite massive... | 14:56:49 |
connor (he/him) | The 7950x is a consumer-grade desktop part, the 7960x is part of AMD's HEDT offerings IIRC, so they charge a premium for it | 15:08:30 |
Gaétan Lepage | Yes, quite a premium | 15:09:01 |
SomeoneSerge (matrix works sometimes) | Well it was meant as an epsilon=10 approximation xDD Point being, it's weeks of running the CI, rather than, say, years? | 15:11:37 |
connor (he/him) | aidalgol: running nix-cuda-test I see it on my nvidia-smi
$ nvidia-smi
Sun May 19 15:11:11 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
| 45%   56C    P2            347W /  500W |    8187MiB /  24564MiB |     96%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   3656630      C   ...y88kh-python3-3.11.9/bin/python3.11       8180MiB |
+-----------------------------------------------------------------------------------------+
| 15:11:45 |
SomeoneSerge (matrix works sometimes) | Yessss absolutely outrageous | 15:12:33 |
connor (he/him) | The hbv3 absolutely chugs through the first part of the magma-cuda-static build, which involves building all the C++ objects (the first 2745 of 3430 objects). However, it seems there aren't as many CUDA objects (or their dependencies prevent as many from being built in parallel as the C++ objects), and they take a long time to build, so instructions per cycle wins over number of cores. Look at all my cores! So few are being used :( | 15:51:41 |
connor (he/him) | [Image: Screenshot 2024-05-19 at 11.46.48 AM.png] | 15:51:50 |
connor (he/him) | Oh my god | 15:57:36 |
connor (he/him) | real 17m29.002s
user 0m2.368s
sys 0m2.890s
| 15:57:42 |
connor (he/him) | Okay so I guess the higher clockspeed combined with the limited parallelism when building the CUDA objects results in it being only 2m faster than my i9-13900k | 15:58:43 |
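The effect described here can be illustrated with a toy model: a fully parallel C++ phase followed by a CUDA phase that can only keep a few jobs busy. Only the object counts (2745 of 3430) come from the messages above. The per-object times, the CUDA parallelism limit of 8, and the core counts of 24 and 120 are hypothetical values picked for illustration, and the model ignores clock speed differences entirely.

```python
CPP_OBJECTS = 2745                # C++ objects, per the message above
CUDA_OBJECTS = 3430 - 2745        # remaining objects, assumed here to all be CUDA


def phase_seconds(n_objects, secs_per_object, cores, max_parallel):
    """Idealised wall time for a phase that keeps at most max_parallel jobs busy."""
    workers = min(cores, max_parallel)
    return n_objects * secs_per_object / workers


def build_seconds(cores, cuda_parallel=8, cpp_secs=1.0, cuda_secs=10.0):
    """Toy model: a fully parallel C++ phase, then a poorly parallel CUDA phase."""
    cpp = phase_seconds(CPP_OBJECTS, cpp_secs, cores, max_parallel=cores)
    cuda = phase_seconds(CUDA_OBJECTS, cuda_secs, cores, max_parallel=cuda_parallel)
    return cpp + cuda


# Hypothetical core counts; same per-core speed on both machines.
few, many = build_seconds(cores=24), build_seconds(cores=120)
print(f"24 cores: {few:.0f}s, 120 cores: {many:.0f}s, speedup {few / many:.2f}x")
```

With these made-up numbers, five times the cores buys only about a 1.1x speedup, because the CUDA phase dominates and cannot use the extra cores. A faster core shortens both phases, which is why per-core speed matters so much for this build.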
connor (he/him) | Also:
error: derivation '/nix/store/krfxsgln7gispk9lnfpiav36wja2sg9x-magma-2.7.2.drv' may not be deterministic: output '/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2' differs
| 15:59:02 |
SomeoneSerge (matrix works sometimes) | Oh nice. Can you save a diffoscope before it's GCed? | 15:59:30 |
connor (he/him) | Sure! How do I do that? | 16:03:16 |
SomeoneSerge (matrix works sometimes) | I'm not sure if Nix does it without the --rebuild/--check option, but there should be another path beside /nix/store/gmwh...-magma-2.7.2. Something with a suffix (maybe .check) | 16:08:42 |
connor (he/him) | $ ls -1 /nix/store/*-magma-*
/nix/store/3qk2k6g7wpidmy0rs8gilqkmy14821ns-magma-2.7.2.tar.gz.drv
/nix/store/6482b0xigkghwkx5fl97y85xqclcga96-magma-2.7.2-test.lock
/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2.lock
/nix/store/icmm2apcmxxl4zvx5k75ya8aj3n72ifm-magma-2.7.2.tar.gz
/nix/store/krfxsgln7gispk9lnfpiav36wja2sg9x-magma-2.7.2.drv
/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2:
include
lib
| 16:09:42 |
SomeoneSerge (matrix works sometimes) | You can run nix run nixpkgs#diffoscope -- /nix/store/gm...-magma-2.7.2 /nix/store/...-magma-2.7.2.check. There's a flag to export the result, e.g. as HTML (--html) | 16:09:47 |
SomeoneSerge (matrix works sometimes) | Damn. I guess it threw it away then:) | 16:10:24 |
SomeoneSerge (matrix works sometimes) | A perfectly sensible behaviour after spending 19 minutes of compute | 16:10:54 |
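For a future run, the steps discussed above could be scripted roughly as follows. This is a sketch under assumptions: it relies on `nix-store --realise --check --keep-failed` leaving a differing output next to the original with a `.check` suffix, and on diffoscope's `--html` option for the report. The helper names and the report filename are hypothetical; the store paths are the ones from the error message above.

```python
import shlex


def recheck_command(drv_path: str) -> list[str]:
    """Rebuild an already-built derivation, keeping a differing output as <output>.check."""
    return ["nix-store", "--realise", "--check", "--keep-failed", drv_path]


def diffoscope_command(output_path: str, html_report: str) -> list[str]:
    """Compare the original output with its .check sibling and write an HTML report."""
    return ["nix", "run", "nixpkgs#diffoscope", "--",
            "--html", html_report, output_path, output_path + ".check"]


drv = "/nix/store/krfxsgln7gispk9lnfpiav36wja2sg9x-magma-2.7.2.drv"
out = "/nix/store/gmwhmzv4ppjmrwzicdww0r1nfzzhnm34-magma-2.7.2"
print(shlex.join(recheck_command(drv)))
print(shlex.join(diffoscope_command(out, "magma-diff.html")))
```

The script only prints the two command lines so they can be reviewed before running. Note that diffoscope exits with a non-zero status when it finds differences, so that is not an error in this workflow.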
connor (he/him) | I mean I have three desktops I can run three builds of it in parallel lol | 16:11:15 |