| 8 May 2026 |
Arian | You can dump all of Hydra's evaluations, or run evaluations for all historical NixOS commits yourself | 10:21:14 |
leona | can we determine that this is egress via fastly or is someone downloading them directly from AWS? | 10:21:46 |
Arian | Directly through S3 is only possible when requester pays | 10:22:01 |
Arian | We don't have anonymous auth enabled on our bucket. You need to provide your IAM identity and it gets billed to the caller | 10:22:32 |
Arian | It would be preferable if scrapers scraped S3 directly, as then it doesn't cost us | 10:22:55 |
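For reference, a requester-pays download as described above might look like the following minimal sketch. It assumes the third-party `boto3` library and credentials for the caller's own IAM identity; the bucket and key names are hypothetical placeholders, not the real cache bucket. `RequestPayer="requester"` is the S3 parameter that makes the caller accept the transfer charges.

```python
def requester_pays_get_kwargs(bucket: str, key: str) -> dict:
    """Build GetObject arguments that bill the transfer to the caller."""
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def fetch_object(bucket: str, key: str) -> bytes:
    """Download one object using the caller's own IAM credentials."""
    # boto3 is third-party; it is imported lazily so that the helper
    # above stays usable (and testable) without it installed.
    import boto3

    s3 = boto3.client("s3")
    response = s3.get_object(**requester_pays_get_kwargs(bucket, key))
    return response["Body"].read()


# Hypothetical names, for illustration only.
kwargs = requester_pays_get_kwargs("example-bucket", "example.narinfo")
```

Calling `fetch_object("example-bucket", "example.narinfo")` would then charge the request and bandwidth to whoever owns the credentials, not to the bucket owner.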
hexa | one obvious fix would be to GC harder, provide fewer targets | 10:24:42 |
Arian | I'm wondering if I can somehow figure out from S3 the distribution of the age of objects being requested | 10:30:52 |
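One way to approach the age-distribution question, sketched under assumptions: if request records (key plus request time) and each object's upload time can be obtained from somewhere, such as access logs and a bucket listing, the ages can be bucketed into a histogram. The input shapes and sample data below are hypothetical, and nothing here talks to S3.

```python
from collections import Counter
from datetime import datetime, timedelta


def age_histogram(requests, last_modified, bucket_days=30):
    """Count requests per object-age bucket.

    requests: iterable of (key, request_time) pairs
    last_modified: dict mapping key -> upload time
    Returns a Counter mapping bucket start (in days) -> request count.
    """
    hist = Counter()
    for key, requested_at in requests:
        uploaded_at = last_modified.get(key)
        if uploaded_at is None:
            continue  # missing path: no object, so no age to record
        age_days = (requested_at - uploaded_at).days
        hist[(age_days // bucket_days) * bucket_days] += 1
    return hist


# Made-up sample data, for illustration only.
now = datetime(2026, 5, 8)
last_modified = {
    "a.nar.xz": now - timedelta(days=3),
    "b.nar.xz": now - timedelta(days=400),
}
requests = [
    ("a.nar.xz", now),
    ("b.nar.xz", now),
    ("b.nar.xz", now),
    ("missing.nar.xz", now),
]
hist = age_histogram(requests, last_modified)
```

With this sample, one request lands in the 0-day bucket and two in the 390-day bucket; a real distribution skewed toward old buckets would support the "scrapers are fetching old paths" theory.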
emily | is it still true that Fastly doesn't cache paths for long even once built? | 11:10:55 |
emily | I forget what the conclusion of that discussion was (I know the focus was on missing paths because of the access pattern but presumably those are not what's causing these expenses) | 11:11:26 |
emily | I guess the issue is that if it's sufficiently spread out/high cardinality no per-path caching will help.
(though it seems surprising for scrapers to be going out of their way to find old stuff to query, I really doubt N versions of the same binaries are valuable?)
| 11:12:57 |
Arian | Missing paths don't generate bandwidth cost. They generate API call cost, which is small | 11:33:00 |
emily | right | 11:58:09 |
emily | but I mean if a present path is being hammered, how long does Fastly cache that before going back to S3? | 11:58:33 |
Arian | 24h, I think | 11:58:46 |
emily | IIRC you said it was not that long? | 11:58:53 |
emily | yeah | 11:58:57 |
emily | so I wonder if just increasing that a ton would help? | 11:59:10 |
emily | I don't know how much Fastly will cache before evicting things though. but at least there's definitely no reason to evict something just because it's been a day :) | 11:59:59 |
hexa | do you think we can get better caching than what fastly currently provides? | 13:13:50 |
emily | (not sure if you're asking me but) if it expires every 24 hours then a bot that hits a bunch of store paths every 24 hours and then repeats causes costs every day vs. potentially getting cached indefinitely if we tell Fastly there's no need to expire known store paths right?
(but obviously it's just throwing things at the wall unless it's known what the access pattern looks like. still I imagine it's good in general for e.g. the latest stable installer ISO to not get redownloaded from S3 every day?)
| 13:17:15 |
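The argument above can be illustrated with a toy model: count how often a single path has to be fetched from the origin for a given request schedule and TTL. This is only a sketch. It assumes the cache never evicts an entry before its TTL, which may not match how Fastly actually behaves, and the request schedule is invented.

```python
def origin_fetches(request_times_h, ttl_h):
    """Count origin fetches for one path.

    request_times_h: request times in hours
    ttl_h: cache lifetime in hours, or None for "never expire"
    An entry is treated as expired once its age reaches the TTL.
    """
    fetches = 0
    expires_at = None
    for t in sorted(request_times_h):
        expired = ttl_h is not None and fetches > 0 and t >= expires_at
        if fetches == 0 or expired:
            fetches += 1  # cache miss: go back to the origin
            expires_at = None if ttl_h is None else t + ttl_h
    return fetches


# A bot that requests the same path once a day for 30 days.
daily = [24 * day for day in range(30)]
with_24h_ttl = origin_fetches(daily, 24)
without_expiry = origin_fetches(daily, None)
```

Under these assumptions the 24-hour TTL yields 30 origin fetches for the month, while the non-expiring entry yields one, so the saving depends entirely on whether the same paths are re-requested and whether they stay in cache.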
hexa | [attachment] | 13:19:25 |
hexa | [attachment] | 13:20:08 |
emily | but it's precisely that 5% that must be causing ^ right? 🤔 | 13:20:50 |
hexa | at the same time | 13:21:00 |
hexa | [attachment] | 13:21:02 |
hexa | [attachment] | 13:21:22 |
emily | Nix probably counts as "other bots"? | 13:21:57 |
emily | 0 DDoS requests mitigated is a fun figure | 13:22:14 |
hexa | I would imagine it does, since it doesn't advertise as a browser | 13:26:04 |
| 9 May 2026 |
hexa | Arian, what's blocking https://github.com/NixOS/infra/pull/728? | 12:17:45 |