# NixOS Binary Cache Self-Hosting
169 Members, 58 Servers — About how to host a very large-scale binary cache and more
| Sender | Message | Time |
|---|---|---|
| 5 Mar 2024 | | |
| | we're serving like 375Mbit/s to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:03:53 |
| | not sure how to meaningfully turn these things into iops numbers just because that depends on various factors | 04:05:27 |
| | clickhouse is refusing to deal with S3 wildcards for some reason and i haven't quite chased down why yet | 04:06:41 |
| | i'm just taking a request that completed in that minute to have fully executed in that minute but i think that shakes out to a slightly upwards biased estimator if anything | 04:08:29 |
| | okay i just need to upgrade clickhouse on the EC2 data box i think | 04:09:57 |
| | * we're serving like 375Mbit/s of compressed NARs to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:10:45 |
| | i'm focusing on the NAR serving because that's the actual meat of it, the narinfos are only like 90G of stuff | 04:11:17 |
| | we also have a few other file types but they're mostly pretty marginal | 04:11:52 |
| | *(attached query not preserved in the export)* | 04:12:00 |
| | ^ that yields an empty result set if applied over the S3 inventory | 04:12:33 |
| | debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nar-index iirc but we don't have very much of those | 04:13:27 |
| | * debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nix-index iirc but we don't have very much of those | 04:13:45 |
| | basically at peak we're serving like a gigabit of NARs | 04:15:22 |
| | * basically at peak we're serving like a gigabit per second of NARs | 04:15:34 |
| | that is, the following query yields 7400044328 bytes/min ≈ 1 Gbit/s *(query not preserved in the export)* | 04:17:09 |
| | In reply to @edef1c:matrix.org: Thanks! So mean is 24 MB/s, 350 req/s. Probably a good amount of that can also be cached away from the HDDs, as many people will likely be requesting the latest nixos-* branches, and fewer people older, pinned branches. | 04:19:30 |
| | that number is coming from the S3 dash, and is over all nixos.org buckets | 04:20:38 |
| | but it's a 24h sample, my other numbers are coming from a few months worth of data | 04:21:11 |
| | edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph is large amounts of small files. | 04:21:14 |
| | * edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph on HDDs is large amounts of small files. | 04:21:29 |
| | sure, i can draw you some histograms | 04:21:59 |
| | i can tell you up front that our biggest source of small files is the narinfos though | 04:22:20 |
| | but there's only like 90G of that so we can serve that from SSD quite easily | 04:23:08 |
| | like, we can solve small file serving in a lot of easy ways i think | 04:24:02 |
| | In reply to @nh2:matrix.org: from my larger sample, mean ~16MiB/s, median ~14MiB/s | 04:27:05 |
| | In reply to @nh2:matrix.org: we have ~47M NARs under 4k | 04:29:15 |
| | how much of those we're serving is a harder question, fastly's segmented caching makes it a bit weird to query that without a big painful join | 04:30:41 |
| | will pull you some histograms in a bit, first going to take a break to eat/recaffeinate | 04:31:39 |
| | In reply to @edef1c:matrix.org: That is nice and large, should make it easy. | 04:36:00 |
| | note that i've given you zero information about I/O sizes so far | 04:36:36 |
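The bandwidth arithmetic quoted in the log (7400044328 bytes/min ≈ 1 Gbit/s) can be sanity-checked with a short sketch. The byte count is the figure from the chat; the helper function name and everything else below is just unit conversion, not anything from the original ClickHouse query (which was not preserved in the export):

```python
def bytes_per_min_to_gbit_s(bytes_per_minute: int) -> float:
    """Convert a per-minute byte count into gigabits per second."""
    return bytes_per_minute * 8 / 60 / 1e9

# Figure quoted in the log: 7400044328 bytes served in the peak minute.
peak = bytes_per_min_to_gbit_s(7_400_044_328)
print(f"{peak:.2f} Gbit/s")  # → 0.99 Gbit/s, i.e. "basically a gigabit per second of NARs"
```

This also shows why the two mean figures in the thread differ slightly: 24 MB/s (decimal, from the 24-hour S3 dashboard sample) and ~16 MiB/s (binary, from the multi-month sample) are measured over different windows, not just different units.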