| 4 Mar 2024 |
nh2 | Regarding the current dedup effort, can it perform the dedup on the source (AWS) side, while storing the content-addressed objects on the remote side (like bup/bupstash do), so that the outgoing traffic is deduped (cheaper egress), but storage operations are done outside of AWS (e.g. free on self-hosted Hetzner)? | 12:47:32 |
flokli | Yes | 12:57:06 |
flokli | We could read through the NARs in the AWS bucket, decompress and CDC on an EC2 instance for example, then insert the chunks into some Hetzner object storage. However, xz decompression takes a significant amount of CPU, so it might be quite slow compared to snowballing everything out and running that part at Hetzner too. | 12:58:55 |
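[Editor's note: the "CDC" step above is content-defined chunking, which finds chunk boundaries from the data itself so that identical content dedups even when shifted. A toy pure-Python sketch of the idea, assuming illustrative parameters; the real effort uses FastCDC-style chunking in tvix-castore, not this rolling hash.]

```python
# Toy content-defined chunking: boundaries chosen where a rolling hash
# of the bytes hits a mask, bounded by min/max chunk sizes.
# All parameter values here are hypothetical, for illustration only.
MASK = (1 << 13) - 1   # expect roughly 8 KiB average chunks
MIN_CHUNK = 2048
MAX_CHUNK = 65536

def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks of `data`."""
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Toy 32-bit rolling hash (real CDC uses gear/buzhash tables).
        h = ((h << 1) + b) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield (start, i + 1)
            start = i + 1
            h = 0
    if start < len(data):
        yield (start, len(data))
```

Dedup then falls out of storing each chunk under a content hash: chunks shared between NARs are only uploaded and stored once.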
nh2 | flokli: How much of the code to do that already exists? | 13:00:48 |
flokli | We have code doing essentially everything except actually persisting the chunks (only doing bookkeeping). That was used to test dedup ratios while we varied chunking parameters. | 13:02:38 |
flokli | I'm also writing some code teaching tvix-castore how to use object storage. The internal model is the same, so it'd just be a matter of plugging some of this stuff together | 13:03:40 |
nh2 | flokli: Is there a current write-up of the expected dedup ratios? | 13:04:50 |
flokli | All of that's on Discourse; there's also a pad with our meeting notes linked somewhere there | 13:06:36 |
flokli | (ah, channel topic here too) | 13:06:44 |
nh2 | flokli: I have some trouble getting the latest numbers out of that pad. In https://pad.lassul.us/nixos-cache-gc#Day-2023-11-07 it suggests dedup to 20% of original size; in https://pad.lassul.us/nixos-cache-gc#Day-2024-01-16 it says we got 30-40% better once we had enough data for the deduper. Does that mean 14%? | 13:17:52 |
flokli | We tested with "ingesting" 3 channel bumps, and the total size of chunks (which were zstd-compressed) was 20% less than the xz-compressed NAR files. That number is gonna go up the more things we ingest, as there might be more things that already exist. | 13:28:31 |
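[Editor's note: the two readings being disentangled here differ by a lot; the arithmetic below is only illustrative of the question and answer, not new measurements.]

```python
# flokli's measurement: after ingesting 3 channel bumps, the
# zstd-compressed chunk store was 20% smaller than the xz NARs.
measured = 1.0 - 0.20            # deduped size relative to xz NARs

# nh2's reading of the pad: "dedup to 20% of original size",
# then a further "30-40% better" -> the "14%" in the question.
optimistic = 0.20 * (1 - 0.30)

print(f"measured: {measured:.0%} of xz size")    # 80%
print(f"optimistic reading: {optimistic:.0%}")   # 14%
```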
flokli | We didn't play too much with chunking params, as popping the xz layer off the NARs is very, very slow. It took like 12h to read through one channel bump on the machine we had :-/ | 13:31:16 |
nh2 | flokli: What kind of machine was it? From my memory XZ decompression is ~60 MB/s/core (where 1 xz invocation does not scale beyond 1 core) | 13:48:09 |
flokli | It was an r5a.2xlarge, but I'm not 100% sure it was this all the time | 13:52:22 |
flokli | You can probably correlate the meeting notes with the terraform config in the nixos infra repo | 13:52:49 |
flokli | Ah, it's been that instance size all the time, we only bumped the disk at some point | 13:55:46 |
flokli | The ingestion was running on multiple NARs in parallel, so we were saturating all cores | 13:56:04 |
nh2 | In reply to @nh2:matrix.org "flokli: What kind of machine was it? From my memory XZ decompression is ~60 MB/s/core (where 1 xz invocation does not scale beyond 1 core)" I just did a quick test on 620lqprbzy4pgd2x4zkg7n19rfd59ap7-chromium-unwrapped-108.0.5359.98 on an AMD 5950X; there, xz -d on the output of tar | xz -9 runs at 120 MB/s, saturating 1 core. | 13:59:55 |
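[Editor's note: a rough upper-bound estimate tying the numbers in this thread together. It assumes the r5a.2xlarge's 8 vCPUs were fully saturated on xz -d for the whole 12h ingestion, which the log doesn't establish; real throughput also depends on I/O and chunking cost.]

```python
# Implied data volume if 8 vCPUs decompress xz at the per-core
# speeds quoted above (60 MB/s from memory, 120 MB/s measured).
vcpus = 8                 # r5a.2xlarge vCPU count
seconds = 12 * 3600       # the ~12h per channel bump quoted above
for per_core_mb_s in (60, 120):
    total_tb = vcpus * per_core_mb_s * seconds / 1e6
    print(f"{per_core_mb_s} MB/s/core -> ~{total_tb:.0f} TB read in 12h")
```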
flokli | I think the thing that took a horribly long time was some chromium debuginfo outputs | 14:05:25 |
flokli | I don't have the exact store path anymore | 14:05:32 |
flokli | but yeah, overall xz is the main timesink | 14:05:44 |
flokli | Not sure how much any of this matters though. If we want to start serving NAR files from elsewhere, we need to check Nix doesn't do stupid things if FileSize and FileHash from the .narinfo differ from what we serve | 14:07:20 |
flokli | And we need to serve it with the same compression algo too, at least until we feel confident updating the .narinfo files. | 14:08:10 |
flokli | And this is all work on top of doing all the work to determine which NAR files to delete / deep-freeze. | 14:08:37 |
edef | In reply to @flokli:matrix.org And this is all work on top of doing all the work to determine which NAR files to delete / deep-freeze. i'm pretty close to done on that to be clear, i'm just kind of low-motivation over the past few days | 14:55:34 |