| 9 May 2026 |
emily | I guess the queue runner would observe that the output is missing and schedule another build? | 19:25:42 |
emily | right. so that could be fixed at the queue runner level? "if we have an output waiting to be uploaded, then don't spawn another build; just keep trying to upload that output"? | 19:26:14 |
Sergei Zimmerman (xokdvium) | (in reply to emily) Makes sense yeah | 19:26:57 |
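[A minimal sketch of the queue-runner-level fix being proposed here: before scheduling a build, check whether the output already exists locally and is merely waiting to be (re)uploaded. This is illustrative only; none of these names (`UploadTracker`, `mark_pending`, etc.) come from Hydra's actual code base.]

```python
import threading

class UploadTracker:
    """Tracks outputs that were built successfully but whose upload failed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # store path -> local build result awaiting upload

    def mark_pending(self, store_path, local_result):
        with self._lock:
            self._pending[store_path] = local_result

    def take_pending(self, store_path):
        with self._lock:
            return self._pending.pop(store_path, None)

def schedule(step, tracker, upload, build):
    result = tracker.take_pending(step.output_path)
    if result is not None:
        # The output exists locally and only the upload failed:
        # don't spawn another build, just keep trying to upload it.
        upload(result)
    else:
        build(step)
```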
Sergei Zimmerman (xokdvium) | Things probably get complicated when the builder dies halfway? | 19:27:29 |
emily | as in half-way through sending its successfully-built outputs to the queue runner? | 19:27:57 |
K900 | Then it should probably discard the entire build | 19:28:13 |
emily | yeah, though I think the issue is potentially that stuff happens per-output? | 19:28:33 |
emily | "waiting for all outputs to be ready for upload before uploading any of them" would be good | 19:28:54 |
K900 | I wonder if there's any reasonable way to do two phase commit on this | 19:29:59 |
K900 | Like upload to a temporary directory and then move atomically | 19:30:07 |
K900 | If S3 lets you do that | 19:30:15 |
emily | yeah, if we could have S3 expose all the narinfos atomically that would be great | 19:30:20 |
emily | I mean even just uploading all the actual outputs first and then uploading the narinfos would probably help | 19:30:28 |
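[A sketch of the ordering emily suggests, written against a boto3-style S3 client. S3 has no atomic rename/move, so the literal "upload to a temp directory and move" two-phase commit isn't available; but since a substituter only considers a store path present once its `.narinfo` object exists, uploading all NARs first and all narinfos last approximates all-or-nothing visibility. Bucket and key names are illustrative.]

```python
import boto3

def upload_build_results(bucket, outputs):
    """outputs: list of (nar_key, nar_bytes, narinfo_key, narinfo_bytes)."""
    s3 = boto3.client("s3")

    # Phase 1: the bulky, slow part. If the uploader dies halfway through
    # this loop, no narinfo has been published yet, so clients never see
    # a half-uploaded build.
    for nar_key, nar_bytes, _, _ in outputs:
        s3.put_object(Bucket=bucket, Key=nar_key, Body=nar_bytes)

    # Phase 2: publish. Only once every NAR is in place do the outputs
    # become visible to substituters.
    for _, _, narinfo_key, narinfo_bytes in outputs:
        s3.put_object(Bucket=bucket, Key=narinfo_key, Body=narinfo_bytes)
```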
emily | I don't really understand why Hydra would decide to build something it's trying to upload anyway though | 19:32:03 |
emily | does it ever give up on retrying the upload? | 19:32:10 |
emily | like, anything that wants something in the process of uploading should block on that upload anyway, right? | 19:32:52 |
emily | so perhaps the solution could be as simple as just "never stop retrying uploads"? it wouldn't handle the "builder disappears" case Sergei Zimmerman (xokdvium) mentioned but that's at least an edge case | 19:33:24 |
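[The "never stop retrying uploads" idea, sketched as an unbounded retry loop with capped exponential backoff. Purely illustrative; the delays and function names are made up.]

```python
import time

def upload_forever(do_upload, base_delay=1.0, max_delay=300.0):
    """Retry an upload indefinitely, backing off exponentially up to a cap."""
    delay = base_delay
    while True:
        try:
            do_upload()
            return
        except Exception as e:
            # Log and retry forever: anything that needs this output
            # blocks on the upload rather than triggering a rebuild.
            print(f"upload failed ({e}); retrying in {delay:.0f}s")
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
```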
emily | hexa (signing key rotation when): can we get queue runner logs for musikcube | 19:46:16 |
emily | https://github.com/NixOS/nixpkgs/issues/517508 is recent | 19:46:27 |
emily | though, let me check it's not a library dependency actually | 19:46:39 |
hexa | no entries | 19:47:42 |
hexa | 5 tries iirc | 19:48:24 |
emily | seems like we should just make that infinite? | 19:50:46 |
emily | if you give up on uploading and rebuild instead, you then still have to upload the output, so no benefit | 19:50:58 |
emily | ok, it's actually ffmpeg-headless on unstable, last rebuilt 2026-04-23 | 19:53:53 |
emily | are these logs from the old or new queue runner? having trouble chasing code paths | 20:00:54 |
hexa | old | 20:01:29 |
hexa | the new runner isn't live | 20:01:36 |
emily | ok, so it looks like Nix will retry up to download-attempts times (even for uploads), unless it gets a 4xx status other than 408, or a 501, 505, or 511 | 20:18:09 |
hexa | and I assume you only got that from code and not docs | 20:18:42 |
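[A sketch of the retry classification emily describes, paraphrasing Nix's file-transfer behaviour rather than copying its code: retry up to `download-attempts` times, but treat most 4xx statuses, plus 501/505/511, as permanent failures not worth retrying. The helper names are hypothetical.]

```python
def is_transient(http_status):
    """Roughly: is this HTTP status worth retrying?"""
    if 400 <= http_status < 500:
        return http_status == 408  # the server timed out waiting for us
    if http_status in (501, 505, 511):
        return False  # retrying these 5xx statuses won't help
    return True  # other 5xx errors are treated as transient

def transfer_with_retries(do_transfer, download_attempts=5):
    for _ in range(download_attempts):
        status = do_transfer()
        if status < 400:
            return status
        if not is_transient(status):
            raise RuntimeError(f"HTTP {status}: not retrying")
    raise RuntimeError(f"gave up after {download_attempts} attempts")
```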