11 Mar 2024
@edef1c:matrix.orgedefturns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder17:32:38
@shalokshalom:dendrite.matrix.orgShalok Shalom
In reply to @edef1c:matrix.org
wherever migration paths do exist, they are generally not pre-tested on a system of this scale
I would test the replacement side by side for a while
not decide if it's worth it until it's proven
@shalokshalom:dendrite.matrix.orgShalok Shalomincremental improvements do have huge benefits because of that17:33:16
@shalokshalom:dendrite.matrix.orgShalok Shalomthey turn out to provide quicker feedback17:33:30
@edef1c:matrix.orgedefa lot of the bare basics one might have expected weren't there, and building them took a fair bit of blood, sweat, and tears17:33:34
@shalokshalom:dendrite.matrix.orgShalok Shalom
In reply to @edef1c:matrix.org
turns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder
but is it due to performance characteristics of Perl
@edef1c:matrix.orgedefand yeah, yes to the above17:33:58
@shalokshalom:dendrite.matrix.orgShalok Shalomyou mentioned the database and the backend17:34:10
@edef1c:matrix.orgedefbut building a system that can run two things in parallel is still building a fair bit of new system17:34:29
In reply to @shalokshalom:dendrite.matrix.org
but is it due to performance characteristics of Perl
i think Perl's perf shouldn't be anywhere on the hot path
@shalokshalom:dendrite.matrix.orgShalok Shalomnice17:34:55
@edef1c:matrix.orgedeffor the cache itself, there isn't really any perl on the hot path17:35:09
@shalokshalom:dendrite.matrix.orgShalok Shalom
In reply to @edef1c:matrix.org
but building a system that can run two things in parallel is still building a fair bit of new system
should be decoupled as far as possible
@edef1c:matrix.orgedefwe can't decouple so far that we're running two entirely parallel build pipelines17:35:38
@edef1c:matrix.orgedefwe literally can't afford to run two build farms side by side17:35:45
@shalokshalom:dendrite.matrix.orgShalok ShalomWhy is Hydra running new builds for the packages new to unstable only every 2 days?17:35:48
@shalokshalom:dendrite.matrix.orgShalok Shalom
In reply to @edef1c:matrix.org
we literally can't afford to run two build farms side by side
yeah, that is understandable. I would be tempted to care for that for the testing period.
@edef1c:matrix.orgedefthat's more about build farm capacity than anything about perl, i think17:36:11
@edef1c:matrix.orgedefi don't have stats on the build farm utilisation, maybe the infra team does17:36:31
@shalokshalom:dendrite.matrix.orgShalok Shalomdifferent opinions about that from different sides 17:36:36
@shalokshalom:dendrite.matrix.orgShalok ShalomI heard the word 'clusterfuck' 😃17:37:01
@edef1c:matrix.orgedefon priors i'd lean towards "we just don't have enough compute" rather than "we're leaving it idle", but talk is cheap, i'd love to see some data17:37:11
@edef1c:matrix.orgedeflike, broadly it's not very complicated, we (hopefully) have a Prometheus somewhere and a Grafana dash somewhere that can tell us this17:37:39
@shalokshalom:dendrite.matrix.orgShalok ShalomI know there is a Grafana17:38:07
@shalokshalom:dendrite.matrix.orgShalok Shalomdidn't look into it too deep17:38:12
@shalokshalom:dendrite.matrix.orgShalok Shalomhttps://status.nixos.org/17:38:23
@edef1c:matrix.orgedefthe build scheduler definitely has issues, and we could do a lot better17:38:41
@edef1c:matrix.orgedefi certainly have some prototype bits and pieces lying around for various parts of the stack, but none of that is productionised or tested at scale17:39:14
@edef1c:matrix.orgedefi would be quite surprised if we are below 50% utilisation and have the capacity to run a second, identical workload17:39:41
@shalokshalom:dendrite.matrix.orgShalok Shalomyeah, I see that17:39:59

