!CcTBuBritXGywOEGWJ:matrix.org

NixOS Binary Cache Self-Hosting

168 Members · 57 Servers
About how to host a very large-scale binary cache and more



5 Mar 2024
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
so you should be viewing the backend requests as (key, start offset, end offset) triples moreso than full fetches
Will Fastly always chunk up the requests into small range requests, even if the user's Nix requests the whole NAR, or only if the end user requests a range?
04:41:58
@edef1c:matrix.org (edef)
it is a humongous pain in the rear to collect the narinfos, we basically have custom tools to rapid-fire pipelined S3 fetches
04:42:04
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org
Will Fastly always chunk up the requests into small range requests, even if the user's Nix requests the whole NAR, or only if the end user requests a range?
i don't recall right now, sorry
04:42:13
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
i don't recall right now, sorry
Because that could indeed inflate the IOPS, though Ceph has readaheads of configurable size, so it could be worked around that way
04:43:07
@edef1c:matrix.org (edef)

Fastly segmented caching docs (https://docs.fastly.com/en/guides/segmented-caching#how-segmented-caching-works)

When an end user makes a Range: request for a resource with Segmented Caching enabled and a cache miss occurs (that is, at least part of the range is not cached), Fastly will make the appropriate Range: requests back to origin. Segmented Caching will then ensure only the specific portions of the resource that have been requested by the end user (along with rounding based on object size) will be cached rather than the entire resource. Partial cache hits will result in having the cached portion served from cache and the missing pieces fetched from origin. (Requests for an entire resource would be treated as a byte Range: request from 0 to end of resource.)

04:43:09
@edef1c:matrix.org (edef)
critical part being the final sentence
04:43:23
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
critical part being the final sentence

And the beginning of the paragraph:

When an end user makes a Range: request for a resource ...

This suggests "no range request by nix" => "no range request by Fastly to upstream"

04:45:12
@nh2:matrix.org (nh2)
So it should be a quite rare case
04:45:32
@edef1c:matrix.org (edef)
out of ~1B 2xx responses, ~25% are 206 Partial Content responses, ~75% are 200 OKs
04:46:45
@edef1c:matrix.org (edef)
so not that rare
04:47:06
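For reference, the segmented-caching behavior quoted above (range requests rounded to aligned segments, and a full download treated as a byte range from 0 to end of resource) can be sketched as a small helper. The 1 MiB segment size here is a hypothetical stand-in; Fastly's actual rounding granularity is internal:

```python
def segments_for_range(start, end, segment=1 << 20):
    """Map a `Range: bytes=start-end` request to the aligned segment
    ranges a segmenting cache would fetch from origin on a miss.
    `segment` (1 MiB here) is an assumed rounding size."""
    first, last = start // segment, end // segment
    return [(s * segment, (s + 1) * segment - 1) for s in range(first, last + 1)]

def segments_for_full_fetch(size, segment=1 << 20):
    # per the quoted docs, a request for the entire resource is
    # treated as a byte range from 0 to end of resource
    return segments_for_range(0, size - 1, segment)
```

So even a whole-NAR download by Nix becomes a series of segment-sized origin fetches once any part of the object misses in cache, which is why the backend sees (key, start, end) triples rather than full-object reads.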
@nh2:matrix.org (nh2)
In reply to @nh2:matrix.org
That is nice and large, should make it easy.
Sorry, I had misread that sentence: I thought you wrote "mean 16MiB, median 14MiB" for file size. But it was throughput.
04:47:48
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
out of ~1B 2xx responses, ~25% are 206 Partial Content responses, ~75% are 200 OKs
Interesting, I wonder why it's that many, at least in my nix use it is very rare to interrupt downloads
04:48:21
@edef1c:matrix.org (edef)
we have users like. everywhere
04:48:35
@nh2:matrix.org (nh2)
edef: Do you know the total number of files?
04:48:39
@edef1c:matrix.org (edef)
i've seen countries i'd never even heard of in the fastly logs
04:48:48
@edef1c:matrix.org (edef)
we have like ballpark a quarter billion store paths, and slightly fewer NARs than that (since complete NARs are semi content addressed)
04:49:44
@edef1c:matrix.org (edef)
~800M S3 objects total basically, ~190M NARs
04:51:07
@edef1c:matrix.org (edef)
(and sorry for keeping you waiting on histograms, i'm just a bit far into my uptime and it involves poking more stuff than i have brain for rn, i'm running half on autopilot)
04:59:15
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
~800M S3 objects total basically, ~190M NARs

This part will likely be the hardest / most annoying one operationally.
With 6 servers * 10 disks, each one will have ~13 M objects.

  • When a disk fails, 13 M seeks will need to be done, which will take 37 hours.
  • When a server fails, it'll be 10x as much, so 15 days to recovery.

During that recovery time, only 1 more disk is allowed to fail with EC 6=4+2.

05:00:10
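The recovery estimates above can be checked with quick arithmetic; the ~100 random-read IOPS per spinning disk is an assumed figure, not something stated in the chat:

```python
# ~800M objects spread over 6 servers * 10 disks ≈ 13M objects per disk
objects_per_disk = 13_000_000
seek_iops = 100  # assumed random-read IOPS for a single HDD

disk_rebuild_s = objects_per_disk / seek_iops
hours_per_disk = disk_rebuild_s / 3600        # ≈ 36 h for one failed disk
days_per_server = 10 * disk_rebuild_s / 86400  # ≈ 15 days for a 10-disk server
```

With EC 6=4+2 the cluster tolerates two lost chunks, so losing a whole server leaves only one disk's worth of spare failures for the entire two-week rebuild window.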
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
(and sorry for keeping you waiting on histograms, i'm just a bit far into my uptime and it involves poking more stuff than i have brain for rn, i'm running half on autopilot)
No problem, it's not urgent, I should also really go to bed.
05:00:23
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org

This part will likely be the hardest / most annoying one operationally.
With 6 servers * 10 disks, each one will have ~13 M objects.

  • When a disk fails, 13 M seeks will need to be done, which will take 37 hours.
  • When a server fails, it'll be 10x as much, so 15 days to recovery.

During that recovery time, only 1 more disk is allowed to fail with EC 6=4+2.

okay, that's terrifying
05:00:30
@edef1c:matrix.org (edef)
but Glacier Deep Archive doesn't exactly break the bank, we can basically insure ourselves against data loss quite cheaply
05:01:04
@edef1c:matrix.org (edef)
and, stupid question, but i assume you're keeping space for resilver capacity in your iops budget?
05:01:59
@edef1c:matrix.org (edef)
O(objects) anything is kind of rough here
05:02:23
@edef1c:matrix.org (edef)
like we're going to hit a billion objects pretty soon, the growth is slightly superlinear at minimum
05:03:04
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
but Glacier Deep Archive doesn't exactly break the bank, we can basically insure ourselves against data loss quite cheaply
Yes, it's only really concerning availability. For write-mostly backups, one can use higher EC redundancy, or tar/zip the files, which gets rid of the problem of many small files / seeks.
05:03:37
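The tar/zip suggestion above works because bundling many small files into one archive turns millions of per-object seeks into a single sequential stream. A minimal sketch using Python's tarfile module (the object names are hypothetical):

```python
import io
import tarfile

def pack(files):
    """Bundle many small (name, bytes) objects into one uncompressed
    tar blob, so a backup tier stores one large object instead of
    millions of tiny ones."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# hypothetical example objects
blob = pack([("nar/aaaa.nar", b"x" * 100), ("nar/bbbb.nar", b"y" * 100)])
```

For a write-mostly backup tier this trades away cheap random access to individual members, which is exactly the acceptable cost nh2 describes.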
@edef1c:matrix.org (edef)
also note that if we start doing more serious dedup, i'll be spraying your stuff with small random I/O
05:03:42
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org
Yes, it's only really concerning availability. For write-mostly backups, one can use higher EC redundancy, or tar/zip the files, which gets rid of the problem of many small files / seeks.
yeah, we have a similar problem with Glacier
05:04:14
@edef1c:matrix.org (edef)
where objects are costly but size is cheap
05:04:21
@edef1c:matrix.org (edef)
so i intend to do aggregation into larger objects
05:04:53
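Aggregation into larger objects can still keep per-object access if byte offsets are recorded at pack time, so a single Range: request pulls one member back out. A sketch with an in-memory index (this layout is a hypothetical illustration, not the actual plan):

```python
def pack_with_index(objects):
    """Concatenate (key, bytes) objects into one blob and record
    (offset, length) per key, so each member stays fetchable with a
    byte-range read against the aggregate object."""
    blob, index, off = bytearray(), {}, 0
    for key, data in objects:
        index[key] = (off, len(data))
        blob += data
        off += len(data)
    return bytes(blob), index

def fetch(blob, index, key):
    # stand-in for a GET with Range: bytes=off-(off+length-1)
    off, length = index[key]
    return blob[off:off + length]
```

This is the same trade Glacier pushes toward: objects are costly but size is cheap, so you pay for one large object and recover members with cheap range reads.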
