| 9 May 2026 |
hexa | races are absolutely a possibility | 17:21:47 |
hexa | note that I tried out the new queue-runner at least three times in the last two weeks | 17:22:00 |
emily | these issues have been present for years | 17:22:11 |
hexa | good | 17:22:14 |
emily | but getting worse in the past, say, couple months? | 17:22:18 |
emily | much worse | 17:22:31 |
Sergei Zimmerman (xokdvium) | Hm is S3 still very bad on the queue runner? | 17:23:09 |
emily | I think it would require
- builder A starts building fish
- builder A finishes building fish
- builder A uploads build log and fish^doc
- builder B substitutes fish^doc
- builder B starts building fish
- builder B uploads fish^out
- builder A doesn't get to upload fish^out
| 17:23:56 |
emily | seems pretty contrived to me, the entire second build has to happen between builder A starting to upload and finishing (and what would be pulling in fish^doc to begin with?) | 17:24:17 |
hexa | nobody changed anything about s3, hydra repo has migrated to the new queue-runner in March | 17:24:41 |
emily | seems more likely for the upload of fish^out to just hard fail the first time and then a second builder later picks it up and mangles it | 17:24:45 |
hexa | the old queue-runner is gone from the repo | 17:24:52 |
hexa | yeah, same reaction | 17:25:07 |
hexa | I suddenly had to pin hydra without any prior communication | 17:25:18 |
emily | do logs exist for an attempt to push out a given store path and whether it succeeded? | 17:26:11 |
emily | or for multiple builds of a derivation that happen? like can we get data about whether fish's outputs/logs on the cache for that one derivation are actually chimerical between two separate builds? | 17:26:40 |
Sergei Zimmerman (xokdvium) | Hm maybe last-modified could be a rough approximation? | 17:27:33 |
emily | we run into this issue ~every staging cycle now it feels like from what I've seen, and it's gone from "once or twice a year" to "constantly hitting users" it seems | 17:27:36 |
emily | it also holds up cycles because it breaks downstream builds etc. | 17:28:03 |
hexa | yeah | 17:34:45 |
hexa | from May 7th 21:56 to right now | 17:35:04 |
hexa | so less than 2 days | 17:35:15 |
hexa | https://termbin.com/69iy | 17:36:19 |
hexa | all fish related things | 17:36:22 |
emily | yeah, not quite the retention we'd need to catch this :( | 17:38:25 |
emily | we could check it next time we observe this cropping up in a staging-next cycle | 17:38:35 |
emily | does S3 retain something like ^? like the date a bucket entry was added/modified? | 17:38:49 |
emily | May 08 04:16:49 mimas hydra-queue-runner[1392977]: warning: unable to upload 'https://nix-cache.s3.us-east-1.amazonaws.com/log/xsvcvrzr8v1p7jpldddr8wkmaz84knpi-config.fish.drv': Timeout was reached (28) Connection timed out after 17368 milliseconds; retrying in 275 ms (attempt 1/5)
| 17:39:21 |
emily | so failures definitely do happen | 17:39:29 |
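
The question above about whether upload attempts are logged can be partly explored from the journal itself. Below is a minimal Python sketch that pulls upload warnings out of queue-runner log lines and counts them per URL. The regular expression is derived only from the single warning line quoted in this conversation, so other messages may be worded differently; the names `parse_upload_warning` and `failed_attempts_by_url` are hypothetical helpers, not part of Hydra or Nix.

```python
import re
from collections import Counter

# Pattern derived from the one warning line quoted in the chat.
# Assumption: other upload warnings follow the same wording.
UPLOAD_WARNING = re.compile(
    r"warning: unable to upload '(?P<url>[^']+)': (?P<reason>.*?); "
    r"retrying in (?P<delay_ms>\d+) ms "
    r"\(attempt (?P<attempt>\d+)/(?P<max_attempts>\d+)\)"
)


def parse_upload_warning(line):
    """Return the fields of an upload warning, or None if the line is not one."""
    m = UPLOAD_WARNING.search(line)
    if m is None:
        return None
    return {
        "url": m["url"],
        "reason": m["reason"],
        "delay_ms": int(m["delay_ms"]),
        "attempt": int(m["attempt"]),
        "max_attempts": int(m["max_attempts"]),
    }


def failed_attempts_by_url(lines):
    """Count how many upload warnings were logged for each URL (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        warning = parse_upload_warning(line)
        if warning is not None:
            counts[warning["url"]] += 1
    return counts


# The warning line exactly as quoted in the conversation.
SAMPLE = (
    "May 08 04:16:49 mimas hydra-queue-runner[1392977]: warning: unable to upload "
    "'https://nix-cache.s3.us-east-1.amazonaws.com/log/"
    "xsvcvrzr8v1p7jpldddr8wkmaz84knpi-config.fish.drv': "
    "Timeout was reached (28) Connection timed out after 17368 milliseconds; "
    "retrying in 275 ms (attempt 1/5)"
)

if __name__ == "__main__":
    print(parse_upload_warning(SAMPLE))
```

A retry warning only shows that one attempt failed. It does not say whether a later attempt succeeded, so a count like this narrows down which paths are worth checking rather than proving that an upload was lost.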