| 31 Aug 2021 |
andi- | In reply to @vcunat:matrix.org I lowered them now. nixos-unstable is still waiting for the first bump with new openssl. Anyone able to up the shares again? We've tons of idle x86_64 capacity but due to the 100 shares limit none of the jobs are actually being scheduled there... At this rate it'll take way over a week what usually takes ~8h. There are also 37k jobs in the queue. Not sure why they aren't being picked up. Probably also out of shares on those jobsets? | 23:35:54 |
| 1 Sep 2021 |
hexa | (this is about the systemd-v249 jobset) | 00:54:08 |
Vladimír Čunát | staging-next-21.05 (i.e. the first 21.05 with secure openssl) has over 30k queued x86_64-linux jobs. So I can't see what you mean about the idle capacity. | 04:30:59 |
Vladimír Čunát | I also thought the shares only affect relative priorities, i.e. if there's free capacity, I expect the matching jobs to get scheduled regardless of shares. | 04:32:21 |
Vladimír Čunát | (OK, the "runnable" metrics are better for judging the scheduling at a particular moment, but even that one's in thousands.) | 04:36:47 |
andi- | In reply to @vcunat:matrix.org staging-next-21.05 (i.e. the first 21.05 with secure openssl) has over 30k queued x86_64-linux jobs. So I can't see what you mean about the idle capacity. I looked at the machines dashboard of hydra and there were ~4 x86_64-linux machines for about 30min that didn't execute a single job | 11:58:17 |
Vladimír Čunát | In reply to @andi:kack.it I looked at the machines dashboard of hydra and there were ~4 x86_64-linux machines for about 30min that didn't execute a single job Maybe they're stuck. We certainly have some macs that haven't made a step for days. | 11:59:41 |
hexa | it really seemed like the x86_64-linux machines were idle yesterday, I often saw no x86_64-linux jobs in the running builds | 12:11:04 |
Vladimír Čunát | Weird. I noticed that the /machines page isn't very precise, e.g. it seemed not to show jobs that take a relatively short time... for this the /queue-runner-status seemed better. | 12:22:57 |
hexa | the machine page also shows machines as idle when they're copying stuff | 12:31:55 |
Vladimír Čunát | Well, the scheduling certainly isn't ideal. Now I looked at t4b, and it's been completely idle during the last 15 minutes, not even I/O waits. | 12:44:47 |
Vladimír Čunát | * Well, the scheduling certainly isn't ideal. Now I looked at t4b, and it's been completely idle during the last 15 minutes, not even I/O waits. (I ran atop on it) | 12:45:14 |
Vladimír Čunát | The runner status says: "currentJobs" : 4 :-) | 12:46:13 |
hexa | Vladimír Čunát: hm, what was your reasoning behind merging staging-next-21.05? | 20:51:24 |
Vladimír Čunát | Getting updated 21.05-small. | 20:52:14 |
Vladimír Čunát | (I tried explaining in the commit message) | 20:52:30 |
Vladimír Čunát | And missing binaries on a release branch seem less hurtful than on master. | 20:53:06 |
hexa | GitHub isn't very forthcoming with the merge commit's message | 20:53:13 |
hexa | at least not from within the pull request | 20:53:38 |
hexa | FWIW: I don't mind, was just wondering because there were >50k jobs left | 20:54:01 |
Vladimír Čunát | Ah, right... there's a PR :-) | 20:55:16 |
hexa | ah, you merged from the CLI? :P | 20:56:23 |
Vladimír Čunát | Yes, I typically merge from CLI. (and sign them) | 20:56:51 |
Vladimír Čunát | The original motivation of -small channels was for faster delivery of security updates. Staging them (partially) defeats that. | 20:58:12 |
Vladimír Čunát | Actually, aarch64-linux has no more runnables in queue ATM, if I look right. | 21:34:02 |
Vladimír Čunát | So this will put it back to work. | 21:34:08 |
Vladimír Čunát | * So this will put it back to work. (ah, probably not too much, just some NixOS tests will get added) | 21:38:43 |
| 3 Sep 2021 |
Vladimír Čunát | In reply to @andi:kack.it I'm currently testing the PR rebased on master (with a hydra jobset) so ideally the amount of (additional) rebuilds will be very small anyway. I don't expect the systemd jobset really helps to reduce the rebuilds much. It's small - only has about one thousand builds, but OfBorg says the full rebuild amount is >26k. Also, I assume you didn't mean for systemd to go before the staging-next iteration that's been in progress for more than a week (?), and their combination will cause rebuilds again. | 05:59:17 |
andi- | It isn't about rebuilds but getting the tests executed. Also to be able to verify on local devices without days of rebuilding things locally. I don't expect it to merge apart from any of the usual staging flows. | 06:00:43 |
andi- | * It isn't about rebuilds but getting the tests executed. Also to be able to verify on local devices without days of rebuilding things locally. I don't expect it to be merged outside of any of the usual staging flows. | 06:02:56 |