| 9 May 2026 |
emily | are these logs from the old or new queue runner? having trouble chasing code paths | 20:00:54 |
hexa (signing key rotation when) | old | 20:01:29 |
hexa (signing key rotation when) | the new runner isn't live | 20:01:36 |
emily | ok, so it looks like Nix will retry up to download-attempts (even for uploads) times, unless it gets status 400–500 other than 408, 501, 505, or 511 | 20:18:09 |
hexa (signing key rotation when) | and I assume you only got that from code and not docs | 20:18:42 |
emily | would it be feasible to set download-attempts = 1024 or something like that on the Nix used by Hydra? | 20:18:48 |
hexa (signing key rotation when) | that would be a hack, right? | 20:19:00 |
emily | yes I had to bounce between multiple repositories 🫠 | 20:19:02 |
emily | well it seems reasonable to say that Hydra giving up on an upload just never makes sense | 20:19:18 |
hexa (signing key rotation when) | I kinda disagree | 20:19:27 |
emily | if it gives up on uploading something to the cache, then it's just going to schedule a pointless build for it later, and then try to upload that | 20:19:29 |
hexa (signing key rotation when) | that part is true | 20:19:44 |
emily | which is exactly the same as continuing to try to upload, except that you do a pointless build which happens to also break things on Darwin | 20:19:46 |
hexa (signing key rotation when) | but I also don't want an extended backlog of uploads ideally | 20:19:55 |
emily | right, but they'll happen anyway right? | 20:20:11 |
hexa (signing key rotation when) | we can increase the retry amounts | 20:20:12 |
emily | they're ultimately part of the jobset | 20:20:20 |
hexa (signing key rotation when) | except when they don't | 20:20:22 |
emily | I guess the difference is it can give up on leafs | 20:20:26 |
hexa (signing key rotation when) | huh | 20:20:28 |
hexa (signing key rotation when) | they? | 20:20:36 |
emily | the things being uploaded | 20:20:48 |
hexa (signing key rotation when) | right | 20:20:52 |
emily | I think a nicer solution is ^ where you just never push out a .narinfo for any output until all the outputs are up | 20:21:13 |
emily | but looking at the C++ code it doesn't seem like that would be trivial to arrange if S3 can even do it | 20:21:29 |
emily | and obviously I don't know how the new queue runner will handle uploads (maybe John Ericson does) | 20:21:44 |
hexa (signing key rotation when) | Simon Hauser would know | 20:22:08 |
emily | it seems like just increasing the number of retries in the Nix config specifically used by the queue runner would likely mitigate this problem in practice for now | 20:22:13 |
hexa (signing key rotation when) | fair enough | 20:22:29 |
hexa (signing key rotation when) | let's say 32 instead of 1024 though | 20:22:37 |
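
For reference, the mitigation agreed on above would amount to a one-line settings change. This is a minimal sketch, assuming the Nix used by the queue runner reads a nix.conf-style settings file; where that file lives in the Hydra deployment is not stated in the conversation.

```conf
# Assumption: this goes in the Nix configuration that the Hydra queue runner's Nix reads.
# Per the discussion above, this setting also bounds retries for uploads.
download-attempts = 32
```

As noted in the conversation, responses that Nix treats as non-retryable still fail immediately, so a higher value only helps with transient errors.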