11 Mar 2024 |
edef | turns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder | 17:32:38 |
Shalok Shalom | In reply to @edef1c:matrix.org "wherever migration paths do exist, they are generally not pre-tested on a system of this scale": I would test the replacement side by side for a while, and not decide if it's worth it until it's proven | 17:33:00 |
Shalok Shalom | incremental improvements do have huge benefits because of that | 17:33:16 |
Shalok Shalom | they turn out to provide quicker feedback | 17:33:30 |
edef | a lot of the bare basics one might have expected weren't there, and building them took a fair bit of blood, sweat, and tears | 17:33:34 |
Shalok Shalom | In reply to @edef1c:matrix.org "turns out that going from a few TB to 500TB, and a few million store paths to a quarter billion, a lot of things get weirder": but is it due to performance characteristics of Perl? | 17:33:56 |
edef | and yeah, yes to the above | 17:33:58 |
Shalok Shalom | you mentioned the database and the backend | 17:34:10 |
edef | but building a system that can run two things in parallel is still building a fair bit of new system | 17:34:29 |
edef | In reply to @shalokshalom:dendrite.matrix.org "but is it due to performance characteristics of Perl": i think Perl's perf shouldn't be anywhere on the hot path | 17:34:44 |
Shalok Shalom | nice | 17:34:55 |
edef | for the cache itself, there isn't really any perl on the hot path | 17:35:09 |
Shalok Shalom | In reply to @edef1c:matrix.org "but building a system that can run two things in parallel is still building a fair bit of new system": should be decoupled as far as possible | 17:35:15 |
edef | we can't decouple so far that we're running two entirely parallel build pipelines | 17:35:38 |
edef | we literally can't afford to run two build farms side by side | 17:35:45 |
Shalok Shalom | Why is Hydra running new builds for the packages new to unstable only every 2 days? | 17:35:48 |
Shalok Shalom | In reply to @edef1c:matrix.org "we literally can't afford to run two build farms side by side": yeah, that is understandable. I would be tempted to care for that for the testing period. | 17:36:07 |
edef | that's more about build farm capacity than anything about perl, i think | 17:36:11 |
edef | i don't have stats on the build farm utilisation, maybe the infra team does | 17:36:31 |
Shalok Shalom | different opinions about that from different sides | 17:36:36 |
Shalok Shalom | I heard the word 'clusterfuck' 😃 | 17:37:01 |
edef | on priors i'd lean towards "we just don't have enough compute" rather than "we're leaving it idle", but talk is cheap, i'd love to see some data | 17:37:11 |
edef | like, broadly it's not very complicated, we (hopefully) have a Prometheus somewhere and a Grafana dash somewhere that can tell us this | 17:37:39 |
Shalok Shalom | I know there is a Grafana | 17:38:07 |
Shalok Shalom | didn't look into it too deeply | 17:38:12 |
Shalok Shalom | https://status.nixos.org/ | 17:38:23 |
edef | the build scheduler definitely has issues, and we could do a lot better | 17:38:41 |
edef | i certainly have some prototype bits and pieces lying around for various parts of the stack, but none of that is productionised or tested at scale | 17:39:14 |
edef | i would be quite surprised if we are below 50% utilisation and have the capacity to run a second, identical workload | 17:39:41 |
Shalok Shalom | yeah, I see that | 17:39:59 |