!CcTBuBritXGywOEGWJ:matrix.org

NixOS Binary Cache Self-Hosting

168 Members · 57 Servers
About how to host a very large-scale binary cache and more



5 Mar 2024
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
so you should be viewing the backend requests as (key, start offset, end offset) triples moreso than full fetches
Will Fastly always chunk up the requests into small range requests, even if the user's Nix requests the whole NAR, or only if the end user requests a range?
04:41:58
@edef1c:matrix.org (edef)
it is a humongous pain in the rear to collect the narinfos, we basically have custom tools to rapid-fire pipelined S3 fetches
04:42:04
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org
Will Fastly always chunk up the requests into small range requests, even if the user's Nix requests the whole NAR, or only if the end user requests a range?
i don't recall right now, sorry
04:42:13
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
i don't recall right now, sorry
Because that could indeed inflate the IOPS, though Ceph has readaheads of configurable size, so it could be worked around that way
04:43:07
@edef1c:matrix.org (edef)

Fastly segmented caching docs (https://docs.fastly.com/en/guides/segmented-caching#how-segmented-caching-works)

When an end user makes a Range: request for a resource with Segmented Caching enabled and a cache miss occurs (that is, at least part of the range is not cached), Fastly will make the appropriate Range: requests back to origin. Segmented Caching will then ensure only the specific portions of the resource that have been requested by the end user (along with rounding based on object size) will be cached rather than the entire resource. Partial cache hits will result in having the cached portion served from cache and the missing pieces fetched from origin. (Requests for an entire resource would be treated as a byte Range: request from 0 to end of resource.)

04:43:09
@edef1c:matrix.org (edef)
critical part being the final sentence
04:43:23
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
critical part being the final sentence

And the beginning of the paragraph:

When an end user makes a Range: request for a resource ...

This suggests "no range request by nix" => "no range request by Fastly to upstream"

04:45:12
@nh2:matrix.org (nh2)
So it should be a quite rare case
04:45:32
@edef1c:matrix.org (edef)
out of ~1B 2xx responses, ~25% are 206 Partial Content responses, ~75% are 200 OKs
04:46:45
@edef1c:matrix.org (edef)
so not that rare
04:47:06
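For reference, the segmented-caching behavior quoted above (range requests rounded to aligned segments, and a full download treated as a byte range from 0 to end of resource) can be sketched as a small helper. The 1 MiB segment size here is a hypothetical stand-in; Fastly's actual rounding granularity is internal:

```python
def segments_for_range(start, end, segment=1 << 20):
    """Map a `Range: bytes=start-end` request to the aligned segment
    ranges a segmenting cache would fetch from origin on a miss.
    `segment` (1 MiB here) is an assumed rounding size."""
    first, last = start // segment, end // segment
    return [(s * segment, (s + 1) * segment - 1) for s in range(first, last + 1)]

def segments_for_full_fetch(size, segment=1 << 20):
    # per the quoted docs, a request for the entire resource is
    # treated as a byte range from 0 to end of resource
    return segments_for_range(0, size - 1, segment)
```

So even a whole-NAR download by Nix becomes a series of segment-sized origin fetches once any part of the object misses in cache, which is why the backend sees (key, start, end) triples rather than full-object reads.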
@nh2:matrix.org (nh2)
In reply to @nh2:matrix.org
That is nice and large, should make it easy.
Sorry, I had misread that sentence: I thought you wrote "mean 16MiB, median 14MiB" for file size. But it was throughput.
04:47:48
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
out of ~1B 2xx responses, ~25% are 206 Partial Content responses, ~75% are 200 OKs
Interesting, I wonder why it's that many, at least in my nix use it is very rare to interrupt downloads
04:48:21
@edef1c:matrix.org (edef)
we have users like. everywhere
04:48:35
@nh2:matrix.org (nh2)
edef: Do you know the total number of files?
04:48:39
@edef1c:matrix.org (edef)
i've seen countries i'd never even heard of in the fastly logs
04:48:48
@edef1c:matrix.org (edef)
we have like ballpark a quarter billion store paths, and slightly fewer NARs than that (since complete NARs are semi content addressed)
04:49:44
@edef1c:matrix.org (edef)
~800M S3 objects total basically, ~190M NARs
04:51:07
@edef1c:matrix.org (edef)
(and sorry for keeping you waiting on histograms, i'm just a bit far into my uptime and it involves poking more stuff than i have brain for rn, i'm running half on autopilot)
04:59:15
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
~800M S3 objects total basically, ~190M NARs

This part will likely be the hardest / most annoying one operationally.
With 6 servers * 10 disks, each one will have ~13 M objects.

  • When a disk fails, 13 M seeks will need to be done, which will take 37 hours.
  • When a server fails, it'll be 10x as much, so 15 days to recovery.

During that recovery time, only 1 more disk is allowed to fail with EC 6=4+2.

05:00:10
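The recovery estimates above can be checked with quick arithmetic; the ~100 random-read IOPS per spinning disk is an assumed figure, not something stated in the chat:

```python
# ~800M objects spread over 6 servers * 10 disks ≈ 13M objects per disk
objects_per_disk = 13_000_000
seek_iops = 100  # assumed random-read IOPS for a single HDD

disk_rebuild_s = objects_per_disk / seek_iops
hours_per_disk = disk_rebuild_s / 3600        # ≈ 36 h for one failed disk
days_per_server = 10 * disk_rebuild_s / 86400  # ≈ 15 days for a 10-disk server
```

With EC 6=4+2 the cluster tolerates two lost chunks, so losing a whole server leaves only one disk's worth of spare failures for the entire two-week rebuild window.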
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
(and sorry for keeping you waiting on histograms, i'm just a bit far into my uptime and it involves poking more stuff than i have brain for rn, i'm running half on autopilot)
No problem, it's not urgent, I should also really go to bed.
05:00:23
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org

This part will likely be the hardest / most annoying one operationally.
With 6 servers * 10 disks, each one will have ~13 M objects.

  • When a disk fails, 13 M seeks will need to be done, which will take 37 hours.
  • When a server fails, it'll be 10x as much, so 15 days to recovery.

During that recovery time, only 1 more disk is allowed to fail with EC 6=4+2.

okay, that's terrifying
05:00:30
@edef1c:matrix.org (edef)
but Glacier Deep Archive doesn't exactly break the bank, we can basically insure ourselves against data loss quite cheaply
05:01:04
@edef1c:matrix.org (edef)
and, stupid question, but i assume you're keeping space for resilver capacity in your iops budget?
05:01:59
@edef1c:matrix.org (edef)
O(objects) anything is kind of rough here
05:02:23
@edef1c:matrix.org (edef)
like we're going to hit a billion objects pretty soon, the growth is slightly superlinear at minimum
05:03:04
@nh2:matrix.org (nh2)
In reply to @edef1c:matrix.org
but Glacier Deep Archive doesn't exactly break the bank, we can basically insure ourselves against data loss quite cheaply
Yes, it's only really concerning availability. For write-mostly backups, one can use higher EC redundancy, or tar/zip the files, which gets rid of the problem of many small files / seeks.
05:03:37
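The tar/zip suggestion above works because bundling many small files into one archive turns millions of per-object seeks into a single sequential stream. A minimal sketch using Python's tarfile module (the object names are hypothetical):

```python
import io
import tarfile

def pack(files):
    """Bundle many small (name, bytes) objects into one uncompressed
    tar blob, so a backup tier stores one large object instead of
    millions of tiny ones."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# hypothetical example objects
blob = pack([("nar/aaaa.nar", b"x" * 100), ("nar/bbbb.nar", b"y" * 100)])
```

For a write-mostly backup tier this trades away cheap random access to individual members, which is exactly the acceptable cost nh2 describes.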
@edef1c:matrix.org (edef)
also note that if we start doing more serious dedup, i'll be spraying your stuff with small random I/O
05:03:42
@edef1c:matrix.org (edef)
In reply to @nh2:matrix.org
Yes, it's only really concerning availability. For write-mostly backups, one can use higher EC redundancy, or tar/zip the files, which gets rid of the problem of many small files / seeks.
yeah, we have a similar problem with Glacier
05:04:14
@edef1c:matrix.org (edef)
where objects are costly but size is cheap
05:04:21
@edef1c:matrix.org (edef)
so i intend to do aggregation into larger objects
05:04:53
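Aggregation into larger objects can still keep per-object access if byte offsets are recorded at pack time, so a single Range: request pulls one member back out. A sketch with an in-memory index (this layout is a hypothetical illustration, not the actual plan):

```python
def pack_with_index(objects):
    """Concatenate (key, bytes) objects into one blob and record
    (offset, length) per key, so each member stays fetchable with a
    byte-range read against the aggregate object."""
    blob, index, off = bytearray(), {}, 0
    for key, data in objects:
        index[key] = (off, len(data))
        blob += data
        off += len(data)
    return bytes(blob), index

def fetch(blob, index, key):
    # stand-in for a GET with Range: bytes=off-(off+length-1)
    off, length = index[key]
    return blob[off:off + length]
```

This is the same trade Glacier pushes toward: objects are costly but size is cheap, so you pay for one large object and recover members with cheap range reads.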
