NixOS Binary Cache Self-Hosting | 171 Members | 58 Servers
About: how to host a very large-scale binary cache and more

| Message | Time |
|---|---|
| 5 Mar 2024 | |
| In reply to @edef1c:matrix.org Thanks! So the mean is 24 MB/s, 350 req/s. Probably a good amount of that can also be cached away from the HDDs, as many people will likely be requesting the latest nixos-* branches, and fewer people older, pinned branches. | 04:19:30 |
| that number is coming from the S3 dash, and is over all nixos.org buckets | 04:20:38 | |
| but it's a 24h sample, my other numbers are coming from a few months worth of data | 04:21:11 | |
| edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph is large amounts of small files. | 04:21:14 | |
| * edef: Can you query the number of files / the distribution/histogram of their sizes? A weakness of Ceph on HDDs is large amounts of small files. | 04:21:29 | |
| sure, i can draw you some histograms | 04:21:59 | |
| i can tell you up front that our biggest source of small files is the narinfos though | 04:22:20 | |
| but there's only like 90G of that so we can serve that from SSD quite easily | 04:23:08 | |
| like, we can solve small file serving in a lot of easy ways i think | 04:24:02 | |
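A minimal sketch of the kind of size histogram being discussed: bucketing object sizes into power-of-two bins, which is the usual way to spot a small-file-heavy distribution. The sizes below are synthetic placeholders for illustration, not actual bucket data:

```python
from collections import Counter

def size_histogram(sizes):
    """Bucket object sizes into power-of-two bins: bucket b covers [2**b, 2**(b+1))."""
    hist = Counter()
    for s in sizes:
        bucket = 0 if s == 0 else s.bit_length() - 1  # floor(log2(s))
        hist[bucket] += 1
    return dict(sorted(hist.items()))

# Synthetic example: a 100-byte narinfo, a 3 KiB NAR, and a 50 MiB NAR.
sizes = [100, 3 * 1024, 50 * 1024 * 1024]
for bucket, count in size_histogram(sizes).items():
    print(f"[{2**bucket}, {2**(bucket + 1)}) bytes: {count}")
```

In practice the sizes would come from an S3 inventory or bucket listing rather than a hardcoded list.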
| In reply to @nh2:matrix.org from my larger sample, mean ~16 MiB/s, median ~14 MiB/s | 04:27:05 |
| In reply to @nh2:matrix.org we have ~47M NARs under 4k | 04:29:15 |
| how much of those we're serving is a harder question, fastly's segmented caching makes it a bit weird to query that without a big painful join | 04:30:41 | |
| will pull you some histograms in a bit, first going to take a break to eat/recaffeinate | 04:31:39 | |
| In reply to @edef1c:matrix.org That is nice and large, should make it easy. | 04:36:00 |
| note that i've given you zero information about I/O sizes so far | 04:36:36 | |
| In reply to @edef1c:matrix.org Wouldn't Nix users usually download the whole NAR, unless they abort the download? | 04:37:10 |
| yes and no | 04:37:25 | |
| in that more recent Nix versions will do download resumption, and Fastly will do range requests to the backend ("segmented caching" is your keyword here) | 04:37:52 | |
| so you should be viewing the backend requests as (key, start offset, end offset) triples moreso than full fetches | 04:39:53 | |
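A sketch of viewing backend traffic as (key, start offset, end offset) triples: splitting one logical whole-object fetch into fixed-size segment ranges, as a segmented cache would. The 1 MiB segment size here is an assumption for illustration, not Fastly's actual value:

```python
def segment_ranges(key, total_size, segment_size=1 << 20):
    """Split one logical fetch into (key, start, end) triples, one per cache
    segment. End offsets are inclusive, as in an HTTP Range header (bytes=start-end)."""
    triples = []
    for start in range(0, total_size, segment_size):
        end = min(start + segment_size, total_size) - 1
        triples.append((key, start, end))
    return triples

# A 2.5 MiB NAR becomes three backend range requests under these assumptions:
print(segment_ranges("nar/abc123.nar.xz", int(2.5 * (1 << 20))))
```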
| In reply to @edef1c:matrix.org I will likely not need serving numbers; the trouble with Ceph is usually storing the small files, because its maintenance operations (integrity "scrubs", "recovery" rebalancing on disk failure) are O(objects = files = seeks). (For systems where the stored data is bigger than what is served per month, which is the case here at 2 TB/day.) | 04:40:41 |
| right | 04:40:56 | |
| so on S3 small objects are also costly | 04:41:10 | |
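A back-of-envelope for why O(objects) maintenance hurts on HDDs: under the assumed figures below, a scrub pass over the small objects alone takes days per disk. Only the 47M object count comes from the thread; the IOPS and seeks-per-object figures are illustrative assumptions:

```python
objects = 47_000_000   # small NARs under 4k, from the thread
iops_per_hdd = 200     # assumed random-read IOPS for a 7200 rpm HDD
seeks_per_object = 1   # assumed: one seek per object scrubbed

seconds = objects * seeks_per_object / iops_per_hdd
print(f"scrubbing all small objects on one HDD: ~{seconds / 86400:.1f} days")
```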
| In reply to @edef1c:matrix.org Will Fastly always chunk the requests into small range requests, even if the user's Nix requests the whole NAR, or only if the end user requests a range? | 04:41:58 |
| eg it is a humongous pain in the rear to collect the narinfos, we basically have custom tools to rapid-fire pipelined S3 fetches | 04:42:04 | |
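A minimal sketch in the spirit of those rapid-fire pipelined fetches: keeping many small-object requests in flight with a thread pool. `fetch_narinfo` is a hypothetical stand-in for an actual S3 GET, not the real tooling mentioned above:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_narinfo(key):
    # Hypothetical stand-in for an S3 GET of one .narinfo object.
    return f"contents of {key}"

def fetch_all(keys, workers=64):
    """Fetch many small objects concurrently; keeping many requests in
    flight is what makes collecting millions of narinfos feasible."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_narinfo, keys))

results = fetch_all([f"{i:08x}.narinfo" for i in range(100)])
print(len(results))
```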
| In reply to @nh2:matrix.org i don't recall right now, sorry | 04:42:13 |
| In reply to @edef1c:matrix.org Because that could indeed inflate the IOPS, though Ceph has readahead of configurable size, so it could be worked around that way | 04:43:07 |
| Fastly segmented caching docs (https://docs.fastly.com/en/guides/segmented-caching#how-segmented-caching-works) | 04:43:09 |
| critical part being the final sentence | 04:43:23 | |
| In reply to @edef1c:matrix.org And the beginning of the paragraph: This suggests "no range request by nix" => "no range request by Fastly to upstream" | 04:45:12 |
| So it should be a quite rare case | 04:45:32 | |