!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

421 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
132 Servers



Sender / Message / Time
8 May 2026
@jappie:jappie.devjappiedoes that mean that the amount of requests for stuff that isn't cached by fastly is growing (perhaps scrapers stumbling upon old derivations)? because I'd assume an uptick in users downloading new-ish derivations would mean more hits in fastly and no noticeable growth in S3 egress10:17:43
@hexa:lossy.networkhexaI think so10:18:23
@jappie:jappie.devjappiehow the hell are scrapers discovering old store paths / derivations... the URLs for those contain a hash right? I'd expect that to be really difficult and time-consuming (and useless) to scrape10:19:37
@arianvp:matrix.orgArianYou can dump all of Hydra's evaluations. Or run evaluations for all historical nixos commits yourself10:21:14
@leona:leona.isleonacan we determine that this is egress via fastly or is someone downloading them directly from AWS?10:21:46
@arianvp:matrix.orgArianDirectly through S3 is only possible when requester pays10:22:01
@arianvp:matrix.orgArianWe don't have anonymous auth enabled on our bucket. You need to provide your iam identity and it gets billed to the caller10:22:32
@arianvp:matrix.orgArianIt would be preferable if scrapers would scrape S3 directly as then it doesn't cost us10:22:55
@hexa:lossy.networkhexaone obvious fix would be to GC harder, provide fewer targets10:24:42
@arianvp:matrix.orgArianI'm wondering if I can somehow figure out from S3 the distribution of the age of objects being requested 10:30:52
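
One rough way to approach the question above is to join a request log against object upload times and count requests per object-age bucket. This is only a sketch: the input shapes (`requests` as `(key, request_time)` pairs, `created` as a key-to-upload-time mapping), the bucket boundaries, and the function names are assumptions for illustration, not the actual S3 access log or inventory format.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical age buckets (upper bound in days, label); not from the discussion.
BUCKETS = [(30, "<30d"), (365, "30d-1y"), (5 * 365, "1y-5y")]


def age_label(age: timedelta) -> str:
    """Map an object age to the first bucket whose upper bound it is below."""
    for limit_days, label in BUCKETS:
        if age < timedelta(days=limit_days):
            return label
    return ">=5y"


def age_histogram(requests, created):
    """Count requests per object-age bucket.

    requests: iterable of (key, request_time) pairs
    created:  dict mapping key -> time the object was uploaded
    Keys missing from `created` are counted under "unknown".
    """
    hist = Counter()
    for key, when in requests:
        born = created.get(key)
        hist["unknown" if born is None else age_label(when - born)] += 1
    return hist
```

If most requests land in the oldest buckets, that would support the "scrapers walking old store paths" theory; if they cluster in the newest bucket, the cache TTL is the more likely culprit.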
@emilazy:matrix.orgemilyis it still true that Fastly doesn't cache paths for long even once built?11:10:55
@emilazy:matrix.orgemilyI forget what the conclusion of that discussion was (I know the focus was on missing paths because of the access pattern but presumably those are not what's causing these expenses)11:11:26
@emilazy:matrix.orgemily

I guess the issue is that if it's sufficiently spread out/high cardinality no per-path caching will help.

(though it seems surprising for scrapers to be going out of their way to find old stuff to query, I really doubt N versions of the same binaries are valuable?)

11:12:57
@arianvp:matrix.orgArianMissing paths don't generate bandwidth cost. They generate API call cost. Which is small11:33:00
@emilazy:matrix.orgemilyright11:58:09
@emilazy:matrix.orgemilybut I mean if a present path is being hammered, how long does Fastly cache that before going back to S3?11:58:33
@arianvp:matrix.orgArian24h i think 11:58:46
@emilazy:matrix.orgemilyIIRC you said it was not that long?11:58:53
@emilazy:matrix.orgemilyyeah11:58:57
@emilazy:matrix.orgemilyso I wonder if just increasing that a ton would help?11:59:10
@emilazy:matrix.orgemilyI don't know how much Fastly will cache before evicting things though. but at least there's definitely no reason to evict something just because it's been a day :)11:59:59
@hexa:lossy.networkhexado you think we can get better caching than what fastly currently provides?13:13:50
@emilazy:matrix.orgemily

(not sure if you're asking me but) if it expires every 24 hours then a bot that hits a bunch of store paths every 24 hours and then repeats causes costs every day vs. potentially getting cached indefinitely if we tell Fastly there's no need to expire known store paths right?

(but obviously it's just throwing things at the wall unless it's known what the access pattern looks like. still I imagine it's good in general for e.g. the latest stable installer ISO to not get redownloaded from S3 every day?)

13:17:15
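
The argument above can be made concrete with a toy model: a crawler that re-requests the same set of paths at a fixed interval, against a cache with a fixed TTL. This is a sketch under strong assumptions (a single cache, unlimited capacity, no early eviction, which the discussion flags as unknown for Fastly); the function name and parameters are illustrative only.

```python
def origin_fetches(ttl_hours: float, interval_hours: float, days: int, n_paths: int) -> int:
    """Count origin (S3) fetches for a crawler re-requesting n_paths every interval_hours.

    Toy model: one cache with unlimited capacity; an entry is treated as
    expired once its age reaches ttl_hours (age >= ttl counts as expired).
    Every path follows the same schedule, so one path is simulated and scaled.
    """
    fetches_per_path = 0
    cached_at = None  # time in hours when the path was last fetched from origin
    t = 0.0
    while t < days * 24:
        if cached_at is None or t - cached_at >= ttl_hours:
            fetches_per_path += 1  # cache miss: go back to origin
            cached_at = t
        t += interval_hours
    return fetches_per_path * n_paths


daily_ttl = origin_fetches(ttl_hours=24, interval_hours=24, days=30, n_paths=1000)
long_ttl = origin_fetches(ttl_hours=24 * 365, interval_hours=24, days=30, n_paths=1000)
```

In this model a daily crawl against a 24-hour TTL misses on every visit (30,000 origin fetches over 30 days for 1,000 paths), while a year-long TTL only pays for the first visit (1,000 fetches). Real savings depend on how much Fastly actually retains before evicting.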
@hexa:lossy.networkhexa[attachment]13:19:25
@hexa:lossy.networkhexa[attachment]13:20:08
@emilazy:matrix.orgemilybut it's precisely that 5% that must be causing ^ right? 🤔13:20:50
@hexa:lossy.networkhexaat the same time13:21:00
@hexa:lossy.networkhexa[attachment]13:21:02
@hexa:lossy.networkhexa[attachment]13:21:22
@emilazy:matrix.orgemilyNix probably counts as "other bots"?13:21:57



Room Version: 6