10 Jul 2025 |
| Fred Lahde joined the room. | 18:47:41 |
Arian | Yeh so stuff is really off:
https://cache.nixos.org/h35hs85vd5nhrzv3j03ybdfz2s1wsc6l.narinfo
(200) takes between 20ms and 90ms to resolve for me
https://cache.nixos.org/lolno.narinfo
(404) takes consistently between 120 and 230ms for me
so we are not caching 404s and they're really slow | 18:51:27 |
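
A rough way to reproduce the measurement above, as a minimal Python sketch. The helper names `fetch_status` and `time_requests` are made up for illustration, and the timings will vary with your network and the edge location you hit.

```python
import statistics
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Fetch a URL and return its HTTP status code (404 is not an exception here)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            resp.read()
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want.
        return err.code


def time_requests(url, n=5, fetch=fetch_status, clock=time.perf_counter):
    """Request `url` n times; return (last status, min, median, max) in milliseconds."""
    samples = []
    status = None
    for _ in range(n):
        start = clock()
        status = fetch(url)
        samples.append((clock() - start) * 1000.0)
    return status, min(samples), statistics.median(samples), max(samples)


# Example usage (performs real network requests):
#   print(time_requests("https://cache.nixos.org/h35hs85vd5nhrzv3j03ybdfz2s1wsc6l.narinfo"))
#   print(time_requests("https://cache.nixos.org/lolno.narinfo"))
```

If the 404s were served from the edge cache, repeated requests for the missing path would be expected to land in roughly the same range as the 200s.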
Arian | And looking at the generated VCL I indeed think we can fix it by changing cache_condition to response_condition
the cache_condition gets executed before a hit/miss is decided. This means that we return the fixed 404 response before Varnish even makes a decision on whether to cache or not, and exit out of the VCL
| 18:52:29 |
Arian | so we never hit the code-path for caching | 18:52:33 |
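
The control flow described above can be illustrated with a toy model in Python. This is only a sketch of the behaviour as described in the chat, not Fastly's or Varnish's actual state machine; `ToyEdge` and its `synthesize_before_caching` flag are hypothetical names.

```python
class ToyEdge:
    """Toy edge cache. NOT real VCL semantics, just the ordering problem."""

    def __init__(self, origin, synthesize_before_caching):
        self.origin = origin  # callable: path -> (status, body)
        self.synthesize_before_caching = synthesize_before_caching
        self.cache = {}
        self.origin_hits = 0

    def get(self, path):
        if path in self.cache:
            return self.cache[path]  # cache hit, origin not contacted
        self.origin_hits += 1
        status, body = self.origin(path)
        if status == 404:
            body = "404"  # replace the origin's XML error body
            if self.synthesize_before_caching:
                # Early exit: the caching step below is never reached.
                return status, body
        self.cache[path] = (status, body)
        return status, body


def s3_like_origin(path):
    # Stand-in origin that knows no paths and answers with an XML error.
    return 404, "<Error><Code>NoSuchKey</Code></Error>"


current = ToyEdge(s3_like_origin, synthesize_before_caching=True)
proposed = ToyEdge(s3_like_origin, synthesize_before_caching=False)
for _ in range(3):
    current.get("/lolno.narinfo")
    proposed.get("/lolno.narinfo")
print(current.origin_hits, proposed.origin_hits)
```

In this model both variants return the short `404` body, but only the second one stores the response, so repeated lookups for the same missing path stop reaching the origin.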
Arian | "How I made `nixos-rebuild switch` 200% faster for everyone with this one weird trick" | 18:53:14 |
Arian | I just need to know if this was deliberately set up like this. Are we on purpose not caching 404s or by accident? | 18:53:29 |
Jeremy Fleischman (jfly) | It sure looks like the intent was to cache 404s. Here's the initial import into terraform, which has the "cache 404s" code: https://github.com/NixOS/infra/commit/ee995c5f3fee6d645a4a8fb9a93c57f3763b9f07#diff-75e932ae3525435283fff74680b6af8d8c83df93a23b10c7f0a9fcf0a6e4f3e9R179-R184 | 18:56:46 |
Arian | yeh so it actually does the opposite. 404s are cached by default by fastly and this breaks that :D | 18:57:08 |
Jeremy Fleischman (jfly) | i say, go delete some code | 18:57:40 |
Arian | well I think we still maybe want to replace the 404 payload with the string 404 otherwise we get some ugly XML blob from S3 | 18:58:31 |
Arian | but we should do it later in the VCL | 18:58:39 |
Zhaofeng Li | wait, do we have some kind of post-build-hook/s3 hook/etc to bust the cache after paths are built? | 18:59:03 |
emily | are you saying every Nix build in the universe is way slower than it should be because it's hitting S3 | 18:59:32 |
Arian | yes | 18:59:37 |
emily | (…does S3 bill for 404s?) | 18:59:39 |
Arian | Yes S3 bills for 404s | 18:59:46 |
emily | lol | 18:59:49 |
Arian | they even used to bill for authorization errors so you could just rack up anyone's bill by knowing their bucket name | 19:00:02 |
Arian | they changed that now | 19:00:04 |
emily | please run some numbers on how much of the cache size bill this is, I'm so curious | 19:00:13 |
Jeremy Fleischman (jfly) | no, but the intent is to "only" cache 404s for 24 hours | 19:00:14 |
Arian | I have these numbers. I don't think API Calls are a large portion of our cost | 19:00:36 |
emily | that might not be great UX: the channel scripts run to bump channels after the final builds complete | 19:00:41 |
emily | so anyone who has been running master before that will have cached 404s for the first day everyone is bumping to the new channel revision | 19:00:58 |
Zhaofeng Li | contributors can try to build locally and trigger the negative caching, which would be bad UX | 19:01:07 |
Zhaofeng Li | yeah | 19:01:10 |
Arian | But this is already a problem. Nix caches 404s locally | 19:01:18 |
emily | it's not a problem across users | 19:01:25 |
Zhaofeng Li | but this would affect it for everyone | 19:01:29 |
emily | most users don't try to run master, but some do | 19:01:30 |
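
The shared negative-caching concern raised above can be sketched as a small Python model. It assumes a fixed 24-hour TTL for cached 404s and no cache purge when a path is uploaded; `NegativeCachingEdge` is a hypothetical name, and time is passed in explicitly as seconds.

```python
DAY = 24 * 60 * 60  # assumed 404 TTL, in seconds


class NegativeCachingEdge:
    """Toy shared cache that remembers missing paths for `ttl` seconds."""

    def __init__(self, origin_has, ttl=DAY):
        self.origin_has = origin_has  # callable: path -> bool
        self.ttl = ttl
        self.miss_expiry = {}  # path -> time at which the cached 404 expires

    def exists(self, path, now):
        expiry = self.miss_expiry.get(path)
        if expiry is not None and now < expiry:
            return False  # cached 404, origin is not consulted
        if self.origin_has(path):
            self.miss_expiry.pop(path, None)
            return True
        self.miss_expiry[path] = now + self.ttl
        return False


built = set()
edge = NegativeCachingEdge(lambda path: path in built)

edge.exists("/abc.narinfo", now=0)            # one user asks before the build finishes
built.add("/abc.narinfo")                     # the path is uploaded shortly after
print(edge.exists("/abc.narinfo", now=3600))  # a different user still sees a miss
print(edge.exists("/abc.narinfo", now=DAY))   # visible once the cached 404 expires
```

Unlike Nix's local 404 cache, a single early request in this model hides the path from every user of the shared cache until the TTL runs out.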