!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

350 Members · 108 Servers

Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real-time alerts from Prometheus.

10 Jun 2025
@arianvp:matrix.orgArian *

I’m trying to understand our caching setup a bit better with Fastly<->S3, given that bandwidth between S3 and Fastly is our largest cost.

I noticed we’re not serving any Cache-Control headers on our NAR files, but I think we could set Cache-Control: immutable on all NAR objects in S3; since NAR files are content-addressed, they should never change. Fastly respects the response headers it receives from S3 when deciding how long to cache things.

Do I understand correctly that

https://github.com/NixOS/infra/blob/88f1c42e90ab88673ddde3bf973330fb2fcf23be/terraform/cache.tf#L138C17-L138C22

is the only thing configuring how long we hold things in cache? (seems to be 24 hours).

Given we also cache 404s on narinfos, I guess that makes sense, as we want them to be fast (and in case the narinfo gets uploaded later, the cached 404 gets invalidated). But can’t we cache NARs way more aggressively than 24 hours? That would perhaps reduce the bandwidth from S3.

11:08:36
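
A minimal sketch of the mechanism described above, written as a hypothetical Fastly "fetch"-type VCL snippet (the real behaviour comes from the default TTL configured in terraform/cache.tf, not from custom VCL like this): when S3 sends no caching headers, objects fall back to the service default of roughly 24 hours, whereas an explicit Cache-Control header on the S3 object would be honoured instead.

    # Hypothetical illustration only: if the origin response carries no caching
    # headers, pin the TTL to the current ~24-hour default; a Cache-Control
    # (or Surrogate-Control) header coming back from S3 would take precedence.
    if (!beresp.http.Surrogate-Control && !beresp.http.Cache-Control) {
      set beresp.ttl = 86400s;
    }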
@arianvp:matrix.orgArian I guess even for 200 OK narinfos we could set Cache-Control: immutable. Just not for 404s 11:11:06
@emilazy:matrix.orgemily FWIW I don't know how Fastly's cache expiration works, but it's possible that longer caching could make things meaningfully faster too. I've noticed that fetching an ISO NAR from the cache goes at ~500 Mbit/s the first time and then maxes out my connection after that (last I checked, the HTTP headers imply that it's already cached on Fastly, just not at my edge location, but they don't seem to update right, so I'm not sure if I should trust that). 11:18:29
@emilazy:matrix.orgemily no idea if the cap is Fastly–Fastly or Fastly–S3 or what, but just throwing it out there. 11:19:14
@arianvp:matrix.orgArian * IDK if S3 supports setting Cache-Control: immutable on objects, but it for sure supports Cache-Control: max-age=XXX. We could also override the TTL for the /nar path in VCL to increase the max-age to the maximum value that Fastly supports (seems to be a year), because setting this at the S3 level would require changes to Nix to set those as request headers when uploading to S3. 11:29:02
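
A rough sketch of the override being proposed, as a hypothetical Fastly "fetch"-type VCL snippet (the /nar path match, the one-year TTL, and the downstream Cache-Control value are illustrative assumptions, not what the infra repo currently ships):

    # Hypothetical snippet: hold successfully fetched, content-addressed NARs
    # for up to a year in Fastly, and mark them immutable for downstream caches.
    if (req.url.path ~ "^/nar/" && beresp.status == 200) {
      set beresp.ttl = 365d;
      set beresp.http.Cache-Control = "public, max-age=31536000, immutable";
    }
    # Everything else (narinfos, 404s) keeps the existing shorter TTL.

Because the header is set at the CDN layer, nothing about how Nix uploads to S3 would need to change, and a manual purge could still evict an individual NAR if that ever became necessary.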
@flokli:matrix.orgflokli
In reply to @arianvp:matrix.org
I guess even for 200 OK narinfos we could set Cache-Control: immutable. Just not for 404s
That makes it very hard to update the contents, if we want to roll out new keys, or update nar paths to point elsewhere
11:34:49
@emilazy:matrix.orgemily (probably low value for narinfos anyway, considering how small they are and that it wouldn't help 404 latency?) 11:42:30
@arianvp:matrix.orgArian
In reply to @flokli:matrix.org
That makes it very hard to update the contents, if we want to roll out new keys, or update nar paths to point elsewhere
okay so not for narinfos. But for NARs this seems totally safe right?
11:47:46
@emilazy:matrix.orgemily theoretically, a drv hash does not uniquely identify the built results. however, I heard that rebuilding something in the cache without changing the drv hash is not something that can feasibly be done right now, so I assume the risk is very low 11:48:30
@emilazy:matrix.orgemily maybe there could be an issue if a NAR has legal problems and we need to take it down? but I have to assume Fastly has knobs to purge stuff manually if we have to 11:48:53
@arianvp:matrix.orgArian Fastly can purge, yes 11:49:04
@arianvp:matrix.orgArian https://github.com/NixOS/infra/pull/727/files hypothetical proposal 12:30:41
@emilazy:matrix.orgemily heads up – https://github.com/NixOS/nixpkgs/pull/415566 we're expecting on the Darwin team end that we'll want to turn off x86_64-darwin on the jobsets sometime after 26.05 branch-off and before the 28.11 release, most likely after either 26.05 or 27.05 branch-off 13:06:07
@emilazy:matrix.orgemily (if it's around branch-off, then for the unstable branch only of course, until the end of the support period for the branched-off release) 13:06:38
@vcunat:matrix.orgvcunat That will help staging* iterations quite a bit, I expect. 13:12:27
@emilazy:matrix.orgemily indeed 13:12:47
@vcunat:matrix.orgvcunat Though it's not very soon yet. 13:12:58
@emilazy:matrix.orgemily Darwin might stop being the bottleneck if we have twice the AArch64 build capacity and don't need any Intel :P 13:13:06
@vcunat:matrix.orgvcunat (so relative bottlenecks might change in the meantime) 13:13:18
@emilazy:matrix.orgemily right. well, we could of course just decide to drop it any time, but start of the 26.11 cycle seemed like the earliest natural point 13:13:52
@emilazy:matrix.orgemily based on the limited stats I can find and the resources trade-off, I personally don't think it makes sense to wait longer than that, but didn't want to pre-commit to an exact point immediately 13:14:44
@vcunat:matrix.orgvcunat By natural point you mean Apple dropping support in macOS or Rosetta 2? 13:15:36
@emilazy:matrix.orgemily I assume that even just having it only be used on stable for six months will help, since there are fewer rebuilds there 13:15:42
@emilazy:matrix.orgemily well, 26.11 release will be shortly after the first macOS version that doesn't run on Intel comes out. 27.11 will be shortly after the first macOS version that we can't use to build for x86_64-darwin comes out. 28.11 will be shortly after end of security support for the last macOS version to run on Intel. so all of those are "natural points" in a sense. 13:17:24
@emilazy:matrix.orgemily and if we're dropping it for a release we should surely drop it right after branch-off for that release, to not waste half a year building binaries we won't ship 13:17:57
@emilazy:matrix.orgemily (Darwin users do get defaulted to nixpkgs-unstable by the installer, so it would be a little weird for half a year, but it'd at least be a single-command fix for the final ~seven months of support.) 13:19:06
@emilazy:matrix.orgemily dropping it for 25.11 would be pretty abrupt (and also I'm still on x86_64-darwin so I imagine the expected value would be negative until I buy a new Mac soon given the difficulty it would pose for me helping out during staging cycles and so on :P) 13:20:35
@emilazy:matrix.orgemily we could drop it for the 26.05 release, but aligning it with the *.11 releases that match new macOS versions makes sense to me in terms of user expectations. 13:21:09
