| 27 Jun 2026 |
Grimmauld (any/all) | I just had some fun plotting occupancy against its 1d standard deviation (blue). It seems that with the new queue runner we have much more fluctuation in occupancy. Why is this? Intuitively I would expect more deviation with worse scheduling. Don't get me wrong, total utilization seems to be higher, but this looks like we may have a delay between a job finishing and a new one being queued, causing occupancy to temporarily drop. | 05:24:05 |
Grimmauld (any/all) | [image: image.png] | 05:24:05 |
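For readers without the plot, a minimal sketch of the kind of comparison described above: a rolling standard deviation over occupancy samples. The sample values and the two-sample window are made up for illustration (the plot used a 1d window on real data); the point is only that a series can have higher mean utilization and still fluctuate more.

```python
from statistics import mean, pstdev

def rolling_std(samples, window):
    """Population standard deviation over each trailing window of samples."""
    return [
        pstdev(samples[max(0, i - window + 1): i + 1])
        for i in range(len(samples))
    ]

# Hypothetical occupancy fractions, one per sampling interval (not real data).
steady = [0.80, 0.80, 0.80, 0.80, 0.80, 0.80]
bursty = [1.00, 0.70, 1.00, 0.70, 1.00, 0.70]

for name, series in (("steady", steady), ("bursty", bursty)):
    print(name,
          "mean:", round(mean(series), 3),
          "last rolling std:", round(rolling_std(series, 2)[-1], 3))
```

A short gap between a job finishing and the next one starting would show up like the `bursty` series: brief dips that raise the deviation without necessarily lowering the average.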
Mic92 | As long as we don't have https://github.com/NixOS/infra/pull/1099, I don't think those graphs are very meaningful. There are some known bottlenecks that I introduced to make it stable. | 05:42:58 |
Mic92 | hexa (signing key rotation when): I have now also added the patch to the macOS builders: https://github.com/NixOS/infra/pull/1104 But my internet upload is still not great until Sunday, in case you want to take a stab at it earlier. | 05:52:11 |
whispers [& it/fae] | out of curiosity, is it expected/known that no *-linux jobs are being scheduled at all? the only one is https://hydra.nixos.org/build/332926387, which has been running for five hours. the rest of the machines seem to be sitting fully idle: https://hydra.nixos.org/machines | 06:05:06 |
Vladimír Čunát | Looks like very slow ingestion right now. | 06:51:26 |
Vladimír Čunát | (no idea why) | 06:51:32 |
Vladimír Čunát | And it prefers nixpkgs/unstable which has no linux builds left. | 06:52:15 |
Vladimír Čunát | Though we also have builds in nixos/*-small 🤔 That seems weird. | 06:52:55 |
Vladimír Čunát | Ah, wrong assumptions. In the 7h old eval there are 1.2k linux builds left. | 07:00:21 |
Vladimír Čunát | Let me try restarting the queue runner 🤷 Seems low-risk. | 07:01:18 |
Vladimír Čunát | This morning the runner is logging lots of
sqlx::query: slow statement: execution time exceeded alert threshold
| 07:10:20 |
K900 | Does it log which query? | 07:13:07 |
Vladimír Čunát | Typical example was
Jun 27 06:29:42 mimas hydra-queue-runner[884148]: 2026-06-27T06:29:42.410544Z WARN request:complete_build:succeed_step_by_uuid:succeed_step:mark_succeeded_build:update_build: sqlx::query: slow statement: execution time exceeded alert threshold summary="UPDATE builds SET finished …" db.statement="\n\n\n UPDATE builds SET\n finished = 1,\n buildStatus = $2,\n startTime = $3,\n stopTime = $4,\n size = $5,\n closureSize = $6,\n releaseName = $7,\n isCachedBuild = $8,\n notificationPendingSince = $4\n WHERE\n id = $1\n" rows_affected=1 rows_returned=0 elapsed=136.6054209s elapsed_secs=136.6054209 slow_threshold=1s method=POST uri=/runner.v1.RunnerService/CompleteBuild version=HTTP/2.0 machine_id="2d917310-f886-420b-b961-6709654457ba" build_id="ea683ccd-60b5-49cc-b2db-0fc9dcec66f4" machine_id=2d917310-f886-420b-b961-6709654457ba build_id=ea683ccd-60b5-49cc-b2db-0fc9dcec66f4 machine_id=2d917310-f886-420b-b961-6709654457ba drv_path=hb4wbdx8jknqb2jgp23jxn1nbd03ag7c-steel-language-server-0.8.2.drv build_id=333158552
| 07:15:18 |
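To answer "which query?" across many such lines, the warnings can be reduced to span path, summary and duration. A rough sketch, assuming the `key=value` layout seen in the line above; the function name is hypothetical and `SAMPLE` is that same line with `db.statement` and the trailing fields abbreviated.

```python
import re

# key=value pairs; values are either double-quoted (with backslash escapes) or bare tokens.
FIELD_RE = re.compile(r'(\w+(?:\.\w+)*)=("(?:[^"\\]|\\.)*"|\S+)')
# The tracing span path sits between the level and the "sqlx::query" target.
SPAN_RE = re.compile(r'WARN (\S+): sqlx::query')

def parse_slow_statement(line):
    """Return span, summary and elapsed seconds of a slow-statement warning, else None."""
    if "slow statement" not in line:
        return None
    fields = {}
    for key, raw in FIELD_RE.findall(line):
        fields[key] = raw.strip('"')  # repeated keys: the last occurrence wins
    span = SPAN_RE.search(line)
    return {
        "span": span.group(1) if span else None,
        "summary": fields.get("summary"),
        "elapsed_secs": float(fields["elapsed_secs"]),
    }

# Abbreviated copy of the log line quoted above.
SAMPLE = (
    r'2026-06-27T06:29:42.410544Z WARN request:complete_build:succeed_step_by_uuid:'
    r'succeed_step:mark_succeeded_build:update_build: sqlx::query: slow statement: '
    r'execution time exceeded alert threshold summary="UPDATE builds SET finished …" '
    r'db.statement="\n\n\n UPDATE builds SET\n finished = 1\n WHERE\n id = $1\n" '
    r'rows_affected=1 rows_returned=0 elapsed=136.6054209s elapsed_secs=136.6054209 '
    r'slow_threshold=1s'
)

print(parse_slow_statement(SAMPLE))
```

Grouping the results by `span` and summing `elapsed_secs` would show whether the time is dominated by `update_build` or by `get_not_finished_builds`.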
K900 | Hmmm | 07:15:56 |
K900 | This is sus | 07:15:59 |
Vladimír Čunát | After the queue-runner restart it's mainly get_queued_builds:get_not_finished_builds | 07:16:02 |
K900 | Can you EXPLAIN ANALYZE those on the real db? | 07:16:33 |
Vladimír Čunát | It's really extreme now. Over 15 minutes it did 20 build steps. | 07:17:34 |
Vladimír Čunát | Though I get that it can have a slower start. | 07:17:50 |
Vladimír Čunát | It doesn't show usable SQL, does it? | 07:30:18 |
Vladimír Čunát | All those $1, $2, etc. | 07:30:35 |
Vladimír Čunát | And if I substitute "random" stuff I get
ERROR: new row for relation "builds" violates check constraint "builds_check"
| 07:37:38 |
Vladimír Čunát | Without setting much stuff it's like
hydra=# BEGIN;
BEGIN
hydra=*# EXPLAIN ANALYZE UPDATE builds SET finished = 1 where id=123456;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Update on builds  (cost=0.57..2.79 rows=0 width=0) (actual time=4.086..4.087 rows=0.00 loops=1)
   Buffers: shared hit=79 read=18 dirtied=10
   ->  Index Scan using builds_pkey on builds  (cost=0.57..2.79 rows=1 width=10) (actual time=0.293..0.294 rows=1.00 loops=1)
         Index Cond: (id = 123456)
         Index Searches: 1
         Buffers: shared hit=3 read=2
 Planning:
   Buffers: shared hit=294
 Planning Time: 1.508 ms
 Trigger nrbuildsfinished: time=0.648 calls=1
 Execution Time: 4.811 ms
(11 rows)
hydra=*# ROLLBACK;
ROLLBACK
| 07:38:38 |
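One possible way around the `$1, $2` problem is PostgreSQL's `PREPARE` followed by `EXPLAIN ANALYZE EXECUTE`, which keeps the logged statement as-is and supplies values separately. The sketch below only builds the psql script text and does not connect to anything. The helper names, the statement name `slow_stmt` and all parameter values are hypothetical; real values would have to satisfy table constraints such as `builds_check`, which rejected arbitrary substitutions above.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (illustrative, not injection-hardened)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def explain_script(statement, params, name="slow_stmt"):
    """Build a psql script that runs EXPLAIN ANALYZE on a $n-parameterised statement."""
    args = ", ".join(sql_literal(p) for p in params)
    return "\n".join([
        f"PREPARE {name} AS {statement.strip()};",
        "BEGIN;",
        f"EXPLAIN (ANALYZE, BUFFERS) EXECUTE {name}({args});",
        "ROLLBACK;",             # undo the write, as in the session above
        f"DEALLOCATE {name};",   # prepared statements outlive the transaction
    ])

# Statement text as logged by the queue runner; values are placeholders only.
UPDATE_BUILD = """
UPDATE builds SET finished = 1, buildStatus = $2, startTime = $3, stopTime = $4,
  size = $5, closureSize = $6, releaseName = $7, isCachedBuild = $8,
  notificationPendingSince = $4
WHERE id = $1
"""

print(explain_script(UPDATE_BUILD, [123456, 0, 1700000000, 1700000100, 0, 0, None, 0]))
```

A single run on an idle row will not necessarily reproduce a 136 s stall, since a plan that fast suggests the time may be spent elsewhere (for example waiting rather than executing), but it does let the exact logged statement be tested.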
Vladimír Čunát | 🤔 https://grafana.nixos.org/d/rrbV5fdik/postgres-node | 07:48:31 |