!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

374 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
114 Servers

Sender Message Time
28 May 2025
@vcunat:matrix.orgvcunat"Steps completed per minute" isn't much lower than usual, I'd say.10:07:16
@uep:matrix.orguepi'm going by https://hydra.nixos.org/jobset/nixos/trunk-combined, and the fact that for the last.. quite a few days, it seems the build from the previous day hasn't finished by the time the next one starts. And even with relatively low numbers of jobs to run (8k, vs 120k) it's taking more than a day to finish. I realise this isn't the total picture, i'm not looking at overall throughput or jobs10:11:42
@hexa:lossy.networkhexa (signing key rotation when)this relates to the "build ingestion" in that we don't currently get enough runnables to saturate our linux builders10:14:30
@hexa:lossy.networkhexa (signing key rotation when)usually restarting the queue-runner can tip the balance again10:15:03
@uep:matrix.orguepoh, build ingestion is on-boarding of jobs to run, not ingestion of build results... makes more sense now10:24:33
@hexa:lossy.networkhexa (signing key rotation when)https://github.com/NixOS/hydra/commit/720db63d52ebcbda617603e7aa5b5c750cc6afec was the last change to the scheduling logic10:29:52
@hexa:lossy.networkhexa (signing key rotation when)we've run with that since around april 29th I would say10:31:51
@hexa:lossy.networkhexa (signing key rotation when)but that's in the dispatcher, not the queue-monitor10:32:15
@hexa:lossy.networkhexa (signing key rotation when)and ingestion busyness is determined here https://github.com/NixOS/hydra/blob/master/src/hydra-queue-runner/queue-monitor.cc#L54-L82 10:33:22
@uep:matrix.orguepi dearly love postgres notify, but the fact that it's bundled into a transaction sometimes really hurts10:36:45
@k900:0upti.meK900The entire queue setup is one huge crime against postgres10:37:29
@k900:0upti.meK900Unfortunately10:37:47
@vcunat:matrix.orgvcunat It often feels like restarting the queue runner helps. I tried that again during this discussion, and it does seem to be getting through trunk-combined faster (including the Steps completed per minute graph). No idea why this happens, and yes - during the restart we lose already ingested builds that haven't finished, and startup takes some time, so it's not always a win. 11:02:13
@emilazy:matrix.orgemilyis the queue logic so complicated that it's not obvious how to reimplement it with some more normal queueing thing or is it just a "no free time that anyone wants to spend patching scary C++" thing?11:08:01
@emilazy:matrix.orgemily(I mean I know using Postgres for everything including queueing is popular these days but I'm going to assume that however Hydra does it is not how you would do it in 2025.)11:08:36
@emilazy:matrix.orgemily the code in hydra-queue-runner looks shorter/less terrifying than I expected but I'm guessing the gremlins are just deeper than the surface. 11:09:49
@k900:0upti.meK900 It's not complicated at all 11:10:08
@k900:0upti.meK900It's just extremely cursed11:10:12
@k900:0upti.meK900And no one wants to touch it11:10:18
29 May 2025
@winter:catgirl.cloudWinter does channels.nixos.org have rate limiting, and if so, is it worse than GitHub where if you sneeze on it without an auth token it'll 429 you? 14:48:58
@k900:0upti.meK900It shouldn't I think14:50:27
@k900:0upti.meK900Unless Fastly is doing something14:50:31
@winter:catgirl.cloudWinterthanks!14:51:14
30 May 2025
@ss:someonex.netSomeoneSerge (back on matrix) =\ 13:42:54
