NixOS Binary Cache Self-Hosting
About how to host a very large-scale binary cache and more
| Message | Time |
|---|---|
| 5 Mar 2024 | |
| so we can tune that for however much compute we want to throw at it in parallel | 03:35:31 | |
| i can run some numbers wrt the best bang-per-buck there but not right this second | 03:36:12 | |
| basically depends on what the supply curve for EC2 spot compute looks like | 03:36:38 | |
| For Ceph hosting, do we know what the IOPS of cache.nixos.org are, just to see if some basic small cluster on HDDs could handle it? | 03:39:18 | |
| presumably you want backend I/O, ie to the S3 bucket? | 03:40:25 | |
| yes, that would be the equivalent of what would hit the disks | 03:41:04 | |
| easy stats: over the last 24h we've served 2.1TiB from all our S3 buckets, uploaded 491G, in ~30M requests | 03:47:32 | |
| we're serving like 375Mbit/s to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:03:53 | |
| not sure how to meaningfully turn these things into iops numbers just because that depends on various factors | 04:05:27 | |
| clickhouse is refusing to deal with S3 wildcards for some reason and i haven't quite chased down why yet | 04:06:41 | |
| i'm just taking a request that completed in that minute to have fully executed in that minute but i think that shakes out to a slightly upwards biased estimator if anything | 04:08:29 | |
| okay i just need to upgrade clickhouse on the EC2 data box i think | 04:09:57 | |
| * we're serving like 375Mbit/s of compressed NARs to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:10:45 | |
| i'm focusing on the NAR serving because that's the actual meat of it, the narinfos are only like 90G of stuff | 04:11:17 | |
| we also have a few other file types but they're mostly pretty marginal | 04:11:52 | |
| (query over the S3 inventory; message body lost in export) | 04:12:00 |
| ^ that yields an empty result set if applied over the S3 inventory | 04:12:33 | |
| debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nar-index iirc but we don't have very much of those | 04:13:27 | |
| * debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nix-index iirc but we don't have very much of those | 04:13:45 | |
| basically at peak we're serving like a gigabit of NARs | 04:15:22 | |
| * basically at peak we're serving like a gigabit per second of NARs | 04:15:34 | |
| that is, the following query yields 7400044328 bytes/min ≈ 1 Gbit/s | 04:17:09 |
| In reply to @edef1c:matrix.org Thanks! So the mean is 24 MB/s, 350 req/s. Probably a good amount of that can also be cached away from the HDDs, as many people will likely be requesting the latest nixos-* branches, and fewer people older, pinned branches. | 04:19:30 |
| that number is coming from the S3 dash, and is over all nixos.org buckets | 04:20:38 | |
| but it's a 24h sample, my other numbers are coming from a few months worth of data | 04:21:11 | |
| edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph is large amounts of small files. | 04:21:14 | |
| * edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph on HDDs is large amounts of small files. | 04:21:29 | |
| sure, i can draw you some histograms | 04:21:59 | |
| i can tell you up front that our biggest source of small files is the narinfos though | 04:22:20 | |
| but there's only like 90G of that so we can serve that from SSD quite easily | 04:23:08 | |
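
The 24h dashboard figures quoted at 03:47 and 04:19 can be sanity-checked with a one-off ClickHouse query; the constants below are just the numbers from the chat (2.1 TiB served, ~30M requests, 86400 s). Note that the "24 MB/s" in the 04:19 reply reads the 2.1 as TB; taken as TiB it comes out slightly higher.

```sql
-- Back-of-envelope means from the 24h S3 dashboard numbers in the chat.
SELECT
    formatReadableSize(2.1 * exp2(40) / 86400) AS mean_throughput,  -- ≈ 25.5 MiB/s (≈ 26.7 MB/s)
    round(30e6 / 86400) AS mean_requests_per_s                      -- ≈ 347 req/s
```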
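The 04:06 message mentions ClickHouse refusing to deal with S3 wildcards. For reference, the `s3()` table function does accept globs in the key; a minimal sketch, where the bucket URL, prefix, and Parquet format are all placeholders rather than the real inventory layout:

```sql
-- Hypothetical: count objects across all files of an S3 inventory dump.
-- URL and format are assumptions, not the real nixos.org bucket layout.
SELECT count()
FROM s3('https://example-inventory.s3.amazonaws.com/data/*.parquet', 'Parquet')
```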
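The peak-minute figures (04:03, 04:17) come from a query that, per the 04:08 message, attributes each request's full byte count to the minute it completed in. A sketch of that shape, assuming a hypothetical `fastly_logs` table with `completed_at`, `path`, and `bytes_sent` columns:

```sql
-- Busiest minute of NAR traffic. Crediting whole requests to their
-- completion minute biases the peak slightly upward, as noted in the chat.
SELECT
    toStartOfMinute(completed_at) AS minute,
    sum(bytes_sent) AS bytes_per_min,        -- ≈ 7400044328 in the chat
    sum(bytes_sent) * 8 / 60 AS bits_per_s   -- ≈ 1 Gbit/s
FROM fastly_logs
WHERE path LIKE '/nar/%'
GROUP BY minute
ORDER BY bytes_per_min DESC
LIMIT 1
```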
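The 04:11–04:13 messages break the bucket down by object type (NARs, narinfos, debuginfo, logfiles, .ls files). Over the same hypothetical inventory source, one way to get that breakdown; the key patterns and the `key`/`size` column names are assumptions based only on the types named in the chat:

```sql
-- Objects and bytes per file type, largest first.
SELECT
    multiIf(
        endsWith(key, '.narinfo'), 'narinfo',
        key LIKE 'nar/%',          'nar',
        endsWith(key, '.ls'),      'ls',
        key LIKE 'debuginfo/%',    'debuginfo',
        key LIKE 'log/%',          'log',
        'other') AS kind,
    count() AS objects,
    formatReadableSize(sum(size)) AS total_bytes
FROM s3('https://example-inventory.s3.amazonaws.com/data/*.parquet', 'Parquet')
GROUP BY kind
ORDER BY sum(size) DESC
```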
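The 04:21 question asks for the number of files and a histogram of their sizes, since large amounts of small files are a weak spot for Ceph on HDDs. A power-of-two size histogram over the same hypothetical inventory, again assuming a `size` column:

```sql
-- Object-size histogram in power-of-two buckets.
SELECT
    toUInt32(floor(log2(size))) AS bucket,
    formatReadableSize(exp2(bucket)) AS bucket_floor,  -- lower bound of the bucket
    count() AS objects,
    formatReadableSize(sum(size)) AS total_bytes
FROM s3('https://example-inventory.s3.amazonaws.com/data/*.parquet', 'Parquet')
WHERE size > 0
GROUP BY bucket
ORDER BY bucket
```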