| 9 May 2026 |
hexa | but I also don't want an extended backlog of uploads ideally | 20:19:55 |
emily | right, but they'll happen anyway right? | 20:20:11 |
hexa | we can increase the retry amounts | 20:20:12 |
emily | they're ultimately part of the jobset | 20:20:20 |
hexa | except when they don't | 20:20:22 |
emily | I guess the difference is it can give up on leafs | 20:20:26 |
hexa | huh | 20:20:28 |
hexa | they? | 20:20:36 |
emily | the things being uploaded | 20:20:48 |
hexa | right | 20:20:52 |
emily | I think a nicer solution is ^ where you just never push out a .narinfo for any output until all the outputs are up | 20:21:13 |
emily | but looking at the C++ code it doesn't seem like that would be trivial to arrange if S3 can even do it | 20:21:29 |
emily | and obviously I don't know how the new queue runner will handle uploads (maybe John Ericson does) | 20:21:44 |
hexa | Simon Hauser would know | 20:22:08 |
emily | it seems like just increasing the number of retries in the Nix config specifically used by the queue runner would likely mitigate this problem in practice for now | 20:22:13 |
hexa | fair enough | 20:22:29 |
hexa | let's say 32 instead of 1024 though | 20:22:37 |
John Ericson | I do agree with that sort of thing | 20:23:49 |
John Ericson | (well in the CA case, it would be the build trace entry, but I digress :)) | 20:24:04 |
emily | in the CA case the Darwin builds will be broken 100% of the time so there'll be much less debugging required to find the root cause :P | 20:24:37 |
John Ericson | no more automatic rewriting in CA soon! :) | 20:26:12 |
emily | how is that going to work? | 20:26:42 |
John Ericson | https://github.com/NixOS/nix/pull/15793 see what Artemis has been working on | 20:27:51 |
John Ericson | (note that the use of the existing protocol vs a new simpler protocol is provisional) | 20:28:07 |
hexa | how annoying would it be to move the S3 bucket to Europe btw 😆 | 20:28:54 |
emily | I don't see how this solves every dylib on Darwin encoding its own path? | 20:29:51 |
emily | "The builder can learn the calculated content-addressed path of one of its outputs before creating others" sounds like it can only prevent breaking cross-output references (and since you usually need to know all paths upfront at ./configure time probably not even then) | 20:30:12 |
emily | but offtopic, anyway :) | 20:30:14 |
| @squalus0:matrix.org left the room. | 21:21:39 |
| 10 May 2026 |
Vladimír Čunát | In reply to @emilazy:matrix.org right, okay. I don't know exactly why it happens but I know vcunat has mentioned it happening multiple times before. To me the behavior looked like insufficient deduplication of the situations where different hydra jobs built the same derivation, e.g. on different jobsets. It seemed easy to happen. We're helped by the ratio of job count to build times usually, as the duplication only happens concurrently. And for linux we get helped by the builder pool being small and split along the big-parallel axis, so often the duplicate got scheduled to the same machine and deduplicated locally by nix in there. | 06:27:16 |