NixOS Binary Cache Self-Hosting - Public Room Timeline

	NixOS Binary Cache Self-Hosting	152 Members
	About how to host a very large-scale binary cache and more	53 Servers

Sender	Message	Time
11 Mar 2024
edef	turns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder	17:32:38
Shalok Shalom	In reply to @edef1c:matrix.org: "wherever migration paths do exist, they are generally not pre-tested on a system of this scale" I would test the replacement side by side for a while and not decide if it's worth it until it's proven	17:33:00
Shalok Shalom	incremental improvements do have huge benefits because of that	17:33:16
Shalok Shalom	they turn out to provide quicker feedback	17:33:30
edef	a lot of the bare basics one might have expected weren't there, and building them took a fair bit of blood, sweat, and tears	17:33:34
Shalok Shalom	In reply to @edef1c:matrix.org: "turns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder" but is it due to performance characteristics of Perl	17:33:56
edef	and yeah, yes to the above	17:33:58
Shalok Shalom	you mentioned the database and the backend	17:34:10
edef	but building a system that can run two things in parallel is still building a fair bit of new system	17:34:29
edef	In reply to @shalokshalom:dendrite.matrix.org: "but is it due to performance characteristics of Perl" i think Perl's perf shouldn't be anywhere on the hot path	17:34:44
Shalok Shalom	nice	17:34:55
edef	for the cache itself, there isn't really any perl on the hot path	17:35:09
Shalok Shalom	In reply to @edef1c:matrix.org: "but building a system that can run two things in parallel is still building a fair bit of new system" should be decoupled as far as possible	17:35:15
edef	we can't decouple so far that we're running two entirely parallel build pipelines	17:35:38
edef	we literally can't afford to run two build farms side by side	17:35:45
Shalok Shalom	Why is Hydra running new builds for the packages new to unstable only every 2 days?	17:35:48
Shalok Shalom	In reply to @edef1c:matrix.org: "we literally can't afford to run two build farms side by side" yeah, that is understandable. I would be tempted to care for that for the testing period.	17:36:07
edef	that's more about build farm capacity than anything about perl, i think	17:36:11
edef	i don't have stats on the build farm utilisation, maybe the infra team does	17:36:31
Shalok Shalom	different opinions about that from different sides	17:36:36
Shalok Shalom	I heard the word 'clusterfuck' 😃	17:37:01
edef	on priors i'd lean towards "we just don't have enough compute" rather than "we're leaving it idle", but talk is cheap, i'd love to see some data	17:37:11
edef	like, broadly it's not very complicated, we (hopefully) have a Prometheus somewhere and a Grafana dash somewhere that can tell us this	17:37:39
Shalok Shalom	I know there is a Grafana	17:38:07
Shalok Shalom	didn't look into it too deep	17:38:12
Shalok Shalom	https://status.nixos.org/	17:38:23
edef	the build scheduler definitely has issues, and we could do a lot better	17:38:41
edef	i certainly have some prototype bits and pieces lying around for various parts of the stack, but none of that is productionised or tested at scale	17:39:14
edef	i would be quite surprised if we are below 50% utilisation and have the capacity to run a second, identical workload	17:39:41
Shalok Shalom	yeah, I see that	17:39:59

Room Version: 10