| 9 May 2026 |
emily | since the NARs themselves are content-addressed | 17:43:22 |
Sergei Zimmerman (xokdvium) | Yeah, narinfo is the one to look out for then | 17:43:40 |
emily | basically if it's on the download end then we could boost the substitution timeouts or number of attempts. if it's on the upload end then we could boost the upload timeouts or number of attempts. in both cases it'd be quite nice if we could rule out the partial case altogether of course. but that might be hard (e.g. can we make S3 atomically publish multiple .narinfo files? totally unclear to me) | 17:44:57 |
emily | oh bingo | 17:46:22 |
emily | for FFmpeg: | 17:46:23 |
emily | shion:~
❭ curl -si https://cache.nixos.org/g7gwxb7jinbidq6j0h0fd95rf6zc8937.narinfo | rg last-modified
last-modified: Wed, 25 Mar 2026 22:19:57 GMT
shion:~
❭ for hash in lspxkkbmyvzp36jbvjvy3a3d1j979iqb 6a5nr567sb4a36lisa6gydpp3bfij1vv 60hnmkly9hdsn0ajqmqf2lmd3vnf5w94 gpf5ks0x6x2ih4jjasp53cmx0cmk1bbw hn58l3pvn5iwq87p6ddp9wsw8ai9dl93 j6mqv1jx0pvkz3ww8j3mk65pfg5cc4pi; curl -si https://cache.nixos.org/$hash.narinfo | rg last-modified; end
last-modified: Wed, 25 Mar 2026 22:46:24 GMT
last-modified: Wed, 25 Mar 2026 22:46:24 GMT
last-modified: Wed, 25 Mar 2026 22:46:27 GMT
last-modified: Wed, 25 Mar 2026 22:46:33 GMT
last-modified: Wed, 25 Mar 2026 22:46:17 GMT
last-modified: Wed, 25 Mar 2026 22:46:34 GMT
| 17:46:34 |
emily | that's surely one build for the data output and one build for the rest. as the log attests to of course | 17:46:59 |
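The timestamp comparison above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything the cache or the queue runner does: the function name `cluster_uploads` and the five-minute gap threshold are assumptions made for this example. The input values are the `last-modified` headers from the transcript.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def cluster_uploads(last_modified, gap=timedelta(minutes=5)):
    """Group store hashes whose Last-Modified times lie close together.

    `last_modified` maps a hash to its HTTP Last-Modified header value.
    A new cluster starts whenever the distance to the previous timestamp
    exceeds `gap` (an arbitrary threshold chosen for this sketch).
    """
    items = sorted((parsedate_to_datetime(v), k) for k, v in last_modified.items())
    clusters = []
    previous = None
    for stamp, name in items:
        if previous is None or stamp - previous > gap:
            clusters.append([])
        clusters[-1].append(name)
        previous = stamp
    return clusters


# Header values copied from the curl output above.
headers = {
    "g7gwxb7jinbidq6j0h0fd95rf6zc8937": "Wed, 25 Mar 2026 22:19:57 GMT",
    "lspxkkbmyvzp36jbvjvy3a3d1j979iqb": "Wed, 25 Mar 2026 22:46:24 GMT",
    "6a5nr567sb4a36lisa6gydpp3bfij1vv": "Wed, 25 Mar 2026 22:46:24 GMT",
    "60hnmkly9hdsn0ajqmqf2lmd3vnf5w94": "Wed, 25 Mar 2026 22:46:27 GMT",
    "gpf5ks0x6x2ih4jjasp53cmx0cmk1bbw": "Wed, 25 Mar 2026 22:46:33 GMT",
    "hn58l3pvn5iwq87p6ddp9wsw8ai9dl93": "Wed, 25 Mar 2026 22:46:17 GMT",
    "j6mqv1jx0pvkz3ww8j3mk65pfg5cc4pi": "Wed, 25 Mar 2026 22:46:34 GMT",
}
clusters = cluster_uploads(headers)
```

With these values the first `.narinfo` ends up alone in one cluster and the other six share a second one, which matches the "two separate builds" reading.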
emily | hexa (signing key rotation when): once a given .narinfo is pushed, can it ever be overwritten? i.e. if the same output gets built again | 17:47:19 |
emily | or would the second write get dropped? | 17:47:24 |
Sergei Zimmerman (xokdvium) | In the old queue runner I think it was using Nix’s binary cache store so it wouldn’t get pushed over | 17:47:58 |
emily | for logs too? | 17:48:22 |
Sergei Zimmerman (xokdvium) | What the new non-deployed code does is unclear to me | 17:48:23 |
emily | since the log for fish shows the non-fallback paths, but the log for ffmpeg shows the data output as an odd-one-out fallback path | 17:48:47 |
emily | but they both seem to be broken for path-rewrite-related reasons | 17:48:54 |
Sergei Zimmerman (xokdvium) | Hm logs always get pushed over | 17:49:04 |
emily | hm, if it's just doing it through Nix, how come it's the queue runner talking about uploading them in https://termbin.com/69iy? | 17:49:33 |
emily | I thought all the compression/signing/uploads were done on the central queue runner machine | 17:49:41 |
Sergei Zimmerman (xokdvium) | It used to be linked to nix-store | 17:50:18 |
emily | ah, I see what you mean. (I thought you meant the builders were using that store directly) | 17:50:53 |
emily | so "failure at the time of upload" sounds very plausible to me. especially given that Nix retries substitutions a bunch out of the box, whereas these queue runner logs look like it's not retrying at all | 17:51:27 |
emily | (so download side should be expected to be more robust by default?) | 17:51:47 |
emily | so, uh… does the new queue runner retry uploads? | 17:52:27 |
Sergei Zimmerman (xokdvium) | Tbh it’s not exactly clear to me. I thought that it was supposed to be doing presigned URLs and the builders would be the ones uploading | 17:53:08 |
K900 | That's not actually implemented | 17:53:23 |
K900 | AFAIUI | 17:53:25 |
K900 | And also I don't see how that would even help because you also need to sign the actual NAR | 17:53:36 |
K900 | Which the builders don't have keys for | 17:53:42 |
K900 | So you'd need a custom protocol for the builders to ask the coordinator to sign the NAR and then you need to figure out how to actually authenticate the builder ideally with something like SPIFFE and that's a whole other can of worms | 17:54:14 |
Sergei Zimmerman (xokdvium) | (In reply to emily's "failure at the time of upload" message.) Hm, the queue runner is supposed to retry uploading the same thing - at least the Nix binary cache store does this. Whether we are not doing it well enough is another question | 17:56:58 |
Sergei Zimmerman (xokdvium) | But from the log portion it seems like retries do succeed after a couple of attempts | 17:57:21 |
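For readers unfamiliar with the pattern under discussion, here is a minimal sketch of retrying an upload with exponential backoff. It is not the queue runner's or Nix's actual code: `upload_with_retries`, its parameters, and the choice of `OSError` as the retryable error are all hypothetical and chosen only to illustrate the idea.

```python
import time


def upload_with_retries(upload, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `upload()` until it succeeds or `attempts` is exhausted.

    `upload` is any zero-argument callable that raises OSError on a
    transient failure (an assumption of this sketch). The delay doubles
    after every failed attempt: base_delay, 2*base_delay, 4*base_delay, ...
    """
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a fake upload that fails twice before succeeding.
calls = {"count": 0}


def flaky_upload():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("simulated transient upload failure")
    return "uploaded"


delays = []  # record the delays instead of actually sleeping
result = upload_with_retries(flaky_upload, sleep=delays.append)
```

Injecting `sleep` keeps the sketch testable without real waiting; a real uploader would also need to decide which errors are worth retrying at all.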