| 13 Oct 2025 |
teutat3s | * Thanks for restarting the build. It seems to fail reliably 😄 I see this again:
fatal error: error in backend: IO failure on output stream: No space left on device
| 19:43:54 |
vcunat | * What do you mean? The build succeeded on Hydra now. | 19:50:00 |
vcunat | https://hydra.nixos.org/build/310015857 | 19:50:07 |
vcunat | "No space left on device" can be reliable if you still have a similarly low amount of free space. | 19:50:46 |
teutat3s | Nice, thanks. I must have looked at the wrong job, sorry for the noise.
How do you get to that build? | 19:51:55 |
vcunat | * Just deleted a suffix of your URL 😉 | 19:52:56 |
| 14 Oct 2025 |
Sami Liedes | Now that I've been running world rebuilds on my home workstation... does it ever happen on Hydra that the builds get essentially serialized by something like binutils ./configure? Or are there always enough derivations to build that this doesn't happen? (I think it happens somewhat necessarily when rebuilding a single NixOS, but if I were, for example, building NixOS for 8 architectures on the same machine, I'm pretty sure I could saturate all cores.) | 17:10:41 |
K900 | Yes | 17:11:34 |
Sami Liedes | I looked into nix cgroups. What a mess. I have this hypothesis that this would happen much less if it used cgroups so that each builder (not each process of each builder) gets equal cpu. | 17:11:33 |
K900 | stdenv builds are very linear | 17:11:39 |
K900 | cgroups don't help this | 17:11:58 |
K900 | A lot of things just don't scale out | 17:12:04 |
emily | Hydra builds stdenv on staging, so by the time we do staging-next that part is frequently already done. after stdenv, it fans out pretty quickly | 17:12:29 |
emily | once you have gotten through e.g. Rust, LLVM, CMake, Meson you are off to the races and will never want for jobs. | 17:12:45 |
vcunat | Most x86_64-linux jobs are currently built with only -j2 on hydra.nixos.org. | 17:13:00 |
vcunat | We primarily scale by running many jobs concurrently. | 17:13:16 |
Sami Liedes | I think in my case they would, but that happens probably because I have erred on the higher side of average load. The point is that the serialized configure processes also get seriously throttled (don't get a full core), whereas it would be ideal if serialized stages of derivations got the CPU they want, which I believe would be achievable by balancing the builders/derivations instead of processes. | 17:13:48 |
K900 | If you're saturating the CPU, it doesn't matter what you're saturating it with | 17:14:22 |
K900 | You'll have the same total build time anyway | 17:14:33 |
vcunat | It could matter. | 17:14:49 |
Sami Liedes | Sure it does. You finish fastest if you allocate max cpu to the critical path. | 17:14:56 |
vcunat | * It could matter. I mean, it could affect CPU saturation in future. | 17:15:09 |
Sami Liedes | And I see that happen in practice with my builds; some serialized configure gets throttled because lots of derivations build, then it runs out of derivations and waits until that is finished, and then explodes again. | 17:15:45 |
vcunat | Hydra has so much work that critical path matters little in practice. | 17:16:03 |
vcunat | (Hundreds of thousands of jobs.) | 17:16:28 |
K900 | At Hydra scale the critical path finishes long before we're out of work in pretty much all cases | 17:17:07 |
Sami Liedes | Right. That I can imagine (and it's what I asked about :). So building more than just a single architecture of a single channel presumably helps a lot. | 17:17:06 |
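
The two positions in the exchange above (a throttled serial step delays the fan-out, versus "same total build time if the CPU is saturated") can both be checked with a rough back-of-the-envelope model. The sketch below is only an illustration under assumed numbers: a 16-core machine, one single-threaded configure step of 600 core-seconds, 65 runnable processes spread over 5 builders, and a fan-out of 16000 core-seconds that becomes buildable once the serial step finishes. All function names and figures are hypothetical and are not taken from Nix or Hydra; real schedulers are considerably messier.

```python
import math

def serial_share_per_process(cores, runnable_processes):
    """Cores a single-threaded process gets when every process is weighted equally."""
    return min(1.0, cores / runnable_processes)

def serial_share_per_builder(cores, builders):
    """Cores a single-threaded builder gets when every builder is weighted equally."""
    return min(1.0, cores / builders)

def makespan(cores, serial_work, serial_rate, other_work, fanout_work):
    """Total wall time (seconds) for a simplified three-part workload.

    serial_work : core-seconds of a single-threaded step (the critical path)
    serial_rate : cores that step receives while other work competes with it
    other_work  : core-seconds of perfectly parallel work available from the start
    fanout_work : core-seconds of parallel work unlocked when the serial step ends
    """
    other_rate = cores - serial_rate
    t_other_done = other_work / other_rate if other_rate > 0 else math.inf
    t_serial_throttled = serial_work / serial_rate
    if t_serial_throttled <= t_other_done:
        # The serial step finishes while other work still keeps the CPU busy.
        t_serial = t_serial_throttled
        other_left = other_work - other_rate * t_serial
    else:
        # The other work runs out first; the serial step then gets a full core
        # while the remaining cores sit idle.
        done = serial_rate * t_other_done
        t_serial = t_other_done + (serial_work - done)
        other_left = 0.0
    # Everything left over is assumed to parallelize perfectly.
    return t_serial + (other_left + fanout_work) / cores

CORES = 16
throttled = serial_share_per_process(CORES, runnable_processes=65)
full_core = serial_share_per_builder(CORES, builders=5)

# Workstation-like case: little independent work, so cores go idle.
small_per_process = makespan(CORES, 600, throttled, 3200, 16000)
small_per_builder = makespan(CORES, 600, full_core, 3200, 16000)

# Hydra-like case: far more independent work than the critical path needs.
big_per_process = makespan(CORES, 600, throttled, 10_000_000, 16000)
big_per_builder = makespan(CORES, 600, full_core, 10_000_000, 16000)

print(small_per_process, small_per_builder)
print(big_per_process, big_per_builder)
```

In this toy model the per-builder policy only wins in the small case, where the other work runs out and cores idle while the serial step is still running. With abundant independent work the CPU stays saturated either way and both policies give the same total, which matches the "Hydra scale" observation.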