!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

422 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
132 Servers


Sender | Message | Time
9 May 2026
@hexa:lossy.network hexa | but I also don't want an extended backlog of uploads ideally | 20:19:55
@emilazy:matrix.org emily | right, but they'll happen anyway right? | 20:20:11
@hexa:lossy.network hexa | we can increase the retry amounts | 20:20:12
@emilazy:matrix.org emily | they're ultimately part of the jobset | 20:20:20
@hexa:lossy.network hexa | except when they don't | 20:20:22
@emilazy:matrix.org emily | I guess the difference is it can give up on leafs | 20:20:26
@hexa:lossy.network hexa | huh | 20:20:28
@hexa:lossy.network hexa | they? | 20:20:36
@emilazy:matrix.org emily | the things being uploaded | 20:20:48
@hexa:lossy.network hexa | right | 20:20:52
@emilazy:matrix.org emily | I think a nicer solution is ^ where you just never push out a .narinfo for any output until all the outputs are up | 20:21:13
@emilazy:matrix.org emily | but looking at the C++ code it doesn't seem like that would be trivial to arrange if S3 can even do it | 20:21:29
@emilazy:matrix.org emily | and obviously I don't know how the new queue runner will handle uploads (maybe John Ericson does) | 20:21:44
@hexa:lossy.network hexa | Simon Hauser would know | 20:22:08
@emilazy:matrix.org emily | it seems like just increasing the number of retries in the Nix config specifically used by the queue runner would likely mitigate this problem in practice for now | 20:22:13
@hexa:lossy.network hexa | fair enough | 20:22:29
@hexa:lossy.network hexa | let's say 32 instead of 1024 though | 20:22:37
@Ericson2314:matrix.org John Ericson | I do agree with that sort of thing | 20:23:49
@Ericson2314:matrix.org John Ericson | (well in the CA case, it would be the build trace entry, but I digress :)) | 20:24:04
@emilazy:matrix.org emily | in the CA case the Darwin builds will be broken 100% of the time so there'll be much less debugging required to find the root cause :P | 20:24:37
@Ericson2314:matrix.org John Ericson | no more automatic rewriting in CA soon! :) | 20:26:12
@emilazy:matrix.org emily | how is that going to work? | 20:26:42
@Ericson2314:matrix.org John Ericson | https://github.com/NixOS/nix/pull/15793 see what Artemis has been working on | 20:27:51
@Ericson2314:matrix.org John Ericson | (note that the use of the existing protocol vs a new simpler protocol is provisional) | 20:28:07
@hexa:lossy.network hexa | how annoying would it be to move the S3 bucket to Europe btw 😆 | 20:28:54
@emilazy:matrix.org emily | I don't see how this solves every dylib on Darwin encoding its own path? | 20:29:51
@emilazy:matrix.org emily | "The builder can learn the calculated content-addressed path of one of its outputs before creating others" sounds like it can only prevent breaking cross-output references (and since you usually need to know all paths upfront at ./configure time probably not even then) | 20:30:12
@emilazy:matrix.org emily | but offtopic, anyway :) | 20:30:14
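
The ordering emily describes at 20:21:13 (hold back every .narinfo until all outputs are uploaded) can be illustrated with a rough sketch. This is not how Hydra, the queue runner, or Nix's S3 store is implemented; `FakeStore`, `upload_with_retries`, `publish_outputs`, and the key layout are all hypothetical names made up for illustration, and the default of 32 attempts only echoes the number hexa floated above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStore:
    """In-memory stand-in for a binary cache bucket (hypothetical, not S3)."""
    objects: dict = field(default_factory=dict)
    fail_keys: set = field(default_factory=set)  # keys whose upload always fails

    def put(self, key, data):
        if key in self.fail_keys:
            raise IOError(f"upload failed: {key}")
        self.objects[key] = data


def upload_with_retries(store, key, data, attempts):
    """Try the upload up to `attempts` times, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            store.put(key, data)
            return
        except IOError as err:
            last_error = err
    raise last_error


def publish_outputs(store, outputs, attempts=32):
    """outputs: list of (name, nar_bytes, narinfo_text) for one derivation.

    Phase 1 uploads every NAR; if any of them fails for good, we stop
    before a single .narinfo exists, so no output becomes discoverable.
    Phase 2 publishes the .narinfo files only after phase 1 succeeded.
    """
    for name, nar, _ in outputs:
        upload_with_retries(store, f"nar/{name}.nar", nar, attempts)
    for name, _, narinfo in outputs:
        upload_with_retries(store, f"{name}.narinfo", narinfo, attempts)
```

Note that phase 2 is still not atomic: a failure partway through it would leave some .narinfo files published and others missing, so this only narrows the window rather than closing it.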
@squalus0:matrix.org | left the room. | 21:21:39
10 May 2026
@vcunat:matrix.org Vladimír Čunát | 06:27:16
In reply to @emilazy:matrix.org:
> right, okay. I don't know exactly why it happens but I know vcunat has mentioned it happening multiple times before.
To me the behavior looked like insufficient deduplication of the situations where different Hydra jobs built the same derivation, e.g. on different jobsets. It seemed easy to happen. We're usually helped by the ratio of job count to build times, as the duplication only happens concurrently. And for Linux we're helped by the builder pool being small and split along the big-parallel axis, so often the duplicate got scheduled to the same machine and deduplicated locally by Nix in there.


Room Version: 6