!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

421 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
132 Servers

Sender | Message | Time
8 May 2026
@jopejoe1:matrix.org | jopejoe1 changed their display name from jopejoe1 (4094@epvpn) to jopejoe1. | 08:43:04
@vcunat:matrix.org Vladimír Čunát | OK, though kernels will probably update again anyway before staging-next gets to master. | 08:45:40
@vcunat:matrix.org Vladimír Čunát | * Sounds good, though kernels will probably update again anyway before staging-next gets to master. | 08:46:18
@k900:0upti.me K900 | I'm trying to build my stuff against staging-next now | 08:55:07
@k900:0upti.me K900 | Cause I have no spoons and a lot of compute | 08:55:16
@hxr404:tchncs.de | @hxr404:tchncs.de left the room. | 09:20:34
@arianvp:matrix.org Arian | So our storage cost in S3 is going down due to the GC. However, our egress bandwidth costs are growing faster than the storage cost is shrinking. I think AI scraping is perhaps killing us | 10:09:08
@arianvp:matrix.org Arian | 2 years ago egress bandwidth was 3.5k per month. It's 6.2k per month now | 10:09:48
@arianvp:matrix.org Arian | I think in one month we'll be paying more for egress bandwidth than storage | 10:10:12
@arianvp:matrix.org Arian | That sounds really off to me. | 10:10:17
@arianvp:matrix.org Arian | [image: 1000065300.jpg] | 10:11:44
@arianvp:matrix.org Arian | Red line is egress bandwidth | 10:11:51
@arianvp:matrix.org Arian | It's hockey sticking | 10:11:57
@jappie:jappie.dev jappie | does that mean that the number of requests for stuff that isn't cached by Fastly is growing (perhaps scrapers stumbling upon old derivations)? because I'd assume an uptick in users downloading new-ish derivations would mean more hits in Fastly and no noticeable growth in S3 egress | 10:17:43
@hexa:lossy.network hexa | I think so | 10:18:23
@jappie:jappie.dev jappie | how the hell are scrapers discovering old store paths / derivations... the URLs for those contain a hash right? I'd expect that to be really difficult and time-consuming (and useless) to scrape | 10:19:37
@arianvp:matrix.org Arian | You can dump all of Hydra's evaluations. Or run evaluations for all historical NixOS commits yourself | 10:21:14
@leona:leona.is leona | can we determine whether this is egress via Fastly or whether someone is downloading them directly from AWS? | 10:21:46
@arianvp:matrix.org Arian | Directly through S3 is only possible when requester pays | 10:22:01
@arianvp:matrix.org Arian | We don't have anonymous auth enabled on our bucket. You need to provide your IAM identity and it gets billed to the caller | 10:22:32
@arianvp:matrix.org Arian | It would be preferable if scrapers would scrape S3 directly as then it doesn't cost us | 10:22:55
@hexa:lossy.network hexa | one obvious fix would be to GC harder, provide fewer targets | 10:24:42
@arianvp:matrix.org Arian | I'm wondering if I can somehow figure out from S3 the distribution of the age of objects being requested | 10:30:52
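
A minimal sketch of the age-distribution question above, not something anyone in the room proposed. It assumes request records can somehow be joined with each object's last-modified time (for instance from access logs plus an object listing, if those are available for the bucket). The function name, the bucket edges, and the record layout are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical age buckets (in days); the edges are arbitrary choices.
EDGES_DAYS = [30, 365, 3 * 365]
LABELS = ["<30d", "30d-1y", "1y-3y", ">=3y"]


def bytes_by_object_age(records, edges_days=EDGES_DAYS, labels=LABELS):
    """Sum bytes sent per object-age bucket.

    records: iterable of (request_time, last_modified, bytes_sent),
    where the first two are datetime objects. This layout is assumed,
    not taken from any real log format.
    """
    totals = Counter()
    for request_time, last_modified, bytes_sent in records:
        # Age of the object at the moment it was requested, in days.
        age_days = (request_time - last_modified) / timedelta(days=1)
        # bisect_right picks the first bucket whose upper edge exceeds the age.
        totals[labels[bisect_right(edges_days, age_days)]] += bytes_sent
    return dict(totals)


# Made-up sample data, purely for illustration.
now = datetime(2026, 5, 8)
sample = [
    (now, now - timedelta(days=10), 100),
    (now, now - timedelta(days=5), 1),
    (now, now - timedelta(days=400), 50),
    (now, now - timedelta(days=2000), 7),
]
print(bytes_by_object_age(sample))
```

If most of the bytes land in the oldest buckets, that would be consistent with the "scrapers walking old paths" guess; if they land in the newest bucket, the cause is more likely elsewhere.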
@emilazy:matrix.org emily | is it still true that Fastly doesn't cache paths for long even once built? | 11:10:55
@emilazy:matrix.org emily | I forget what the conclusion of that discussion was (I know the focus was on missing paths because of the access pattern but presumably those are not what's causing these expenses) | 11:11:26
@emilazy:matrix.org emily | I guess the issue is that if it's sufficiently spread out/high cardinality no per-path caching will help. (though it seems surprising for scrapers to be going out of their way to find old stuff to query, I really doubt N versions of the same binaries are valuable?) | 11:12:57
@arianvp:matrix.org Arian | Missing paths don't generate bandwidth cost. They generate API call cost. Which is small | 11:33:00
@emilazy:matrix.org emily | right | 11:58:09
@emilazy:matrix.org emily | but I mean if a present path is being hammered, how long does Fastly cache that before going back to S3? | 11:58:33
@arianvp:matrix.org Arian | 24h I think | 11:58:46
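
A toy illustration of the cardinality point raised above: a per-path cache with a fixed TTL absorbs repeated hits on one path but does nothing for a crawl that touches each path once. This is a sketch under strong assumptions (a single cache, no eviction, a fixed 24h TTL taken from the unconfirmed figure in the chat); it does not model how Fastly actually behaves, and all names are hypothetical.

```python
def origin_fetches(requests, ttl_hours=24.0):
    """Count requests that miss a simple per-path TTL cache.

    requests: iterable of (time_in_hours, path), sorted by time.
    Every miss stands for one fetch from the origin (S3 in the
    discussion), which is where the egress cost would be incurred.
    """
    expires_at = {}  # path -> time at which the cached copy expires
    misses = 0
    for t, path in requests:
        if expires_at.get(path, float("-inf")) <= t:
            # Not cached, or the cached copy has expired: go to origin.
            misses += 1
            expires_at[path] = t + ttl_hours
    return misses


# One popular path requested hourly for two days.
hot = [(h, "popular-path") for h in range(48)]
# A crawl touching a different (made-up) old path every hour.
crawl = [(h, f"old-path-{h}") for h in range(48)]

print(origin_fetches(hot))    # the cache absorbs almost everything
print(origin_fetches(crawl))  # every request reaches the origin
```

In this model the hot path costs two origin fetches over 48 hours, while the crawl costs 48, so a longer TTL only helps if the same paths are requested repeatedly.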

Room Version: 6