!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

421 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.

132 Servers

Sender | Message | Time
7 Oct 2021
@vcunat:matrix.org Vladimír Čunát
  1. channel waiting on bad test. Sometimes the channel would update, but it waits to finish all builds. And sometimes there's a few tests that never succeeded recently and just wait for a long time-out. Cancelling those stragglers can speed up channel update. (makes sense if the channel is quite old currently, e.g. after some period of channel blockers)
09:04:34
@zimbatm:numtide.com Jonas Chevalier | Is anybody else keeping a pulse on Hydra like you do? | 09:11:43
@vcunat:matrix.org Vladimír Čunát | I'm not sure. I do watch https://status.nixos.org/ (or a script of mine with differently defined timestamps, based on committer time). If some channel is getting over three days old, I look at what's wrong; typically it's something easy to solve or speed up. | 09:14:00
@vcunat:matrix.org Vladimír Čunát | I mean, there certainly are other cases of people "unblocking channels", based on what happens, but I have no real insight into that. | 09:22:08
@zimbatm:numtide.com Jonas Chevalier | thanks | 09:22:58
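The three-day check described above can be sketched roughly as follows. This is only an illustration, not Vladimír's actual script: the function name `stale_channels`, the channel names, and the timestamps are all hypothetical, and how the timestamps are obtained (status page, committer time, or otherwise) is left out entirely.

```python
from datetime import datetime, timedelta, timezone


def stale_channels(last_update, now, max_age=timedelta(days=3)):
    """Return the names of channels whose last update is older than max_age.

    last_update maps a channel name to a timezone-aware datetime.
    The three-day default mirrors the threshold mentioned in the chat.
    """
    return sorted(name for name, ts in last_update.items() if now - ts > max_age)


# Made-up data, for illustration only.
now = datetime(2021, 10, 7, 9, 14, tzinfo=timezone.utc)
last_update = {
    "channel-a": now - timedelta(days=1),
    "channel-b": now - timedelta(days=5),
}
print(stale_channels(last_update, now))  # prints ['channel-b']
```

A channel that is exactly three days old is not reported, since the comparison is strict.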
@zimbatm:numtide.com Jonas Chevalier | it would be useful if those user interventions were logged somehow, so that we could know who is doing what | 09:24:29
@vcunat:matrix.org Vladimír Čunát | It's not the first time there was this idea: https://github.com/NixOS/hydra/issues/786 | 09:27:23
8 Oct 2021
@zimbatm:numtide.com Jonas Chevalier | I added some templates to the repo and am looking for feedback: https://github.com/NixOS/nixos-org-configurations/issues/new/choose The goal is to make it easier for users to report issues with the infrastructure and, in the future, to request access or new resources to be deployed. | 07:33:59
10 Oct 2021
@hexa:lossy.network hexa | There seems to be a caching issue with channels.nixos.org https://github.com/NixOS/nixos-org-configurations/issues/169#issuecomment-939478669 | 14:17:48
@sternenseemann:systemli.org sterni | this has been like this for over a week (two?) | 14:18:58
@sternenseemann:systemli.org sterni | seems like forever | 14:19:00
@sternenseemann:systemli.org sterni | also: has like any darwin build happened in the last week? | 14:48:22
@k900:0upti.me K900 | It feels like some of the CDN nodes are being weird | 15:04:00
@k900:0upti.me K900 | I've been hitting really old builds for a while but it seems OK now | 15:04:10
11 Oct 2021
@charlotte:vanpetegem.me chvp joined the room. | 06:45:43
@r-burns:matrix.org Ryan Burns joined the room. | 07:58:59
@r-burns:matrix.org Ryan Burns
In reply to @sternenseemann:systemli.org
also: has like any darwin build happened in the last week?
nope. correct me if I'm wrong but based on https://hydra.nixos.org/status every x86_64-darwin builder has been stuck on the same job for going on 7 days now
08:02:08
@toonn:matrix.org toonn | Hmm, looks like there are 7 idle x86_64-darwin builders rn even though the queue is massive (~100k). Why might this happen? | 11:54:00
@lukegb:zxcvbnm.ninja lukegb (he/him) | Possibly some stdenv jobs are stuck on a broken worker and blocking everything else? | 11:54:44
@toonn:matrix.org toonn | Do the Darwin builders require babysitting? I'm willing to do that for a bit if it's necessary. | 11:54:45
@andreas.schraegle:helsinki-systems.de Andreas Schrägle | hm. on our private hydra, we just restart the queue runner regularly, because things get stuck. I assume that's less practical on hydra.nixos.org. | 11:58:18
@vcunat:matrix.org Vladimír Čunát

Yes it actually looks like everything is waiting on build steps that are in some kind of stuck state:

      "x86_64-darwin" : {
         "runnable" : 0,
         "running" : 29
      },
12:32:41
@vcunat:matrix.org Vladimír Čunát | i.e. I suspect that restarting the queue runner might indeed help by itself in this situation. | 12:33:33
@grahamc:nixos.org @grahamc:nixos.org | I can help with that, also I think I can bring up some more capacity today | 12:49:42
@grahamc:nixos.org @grahamc:nixos.org | sorry for being MIA, I haven't been feeling myself the past few weeks | 12:49:50
@andreas.schraegle:helsinki-systems.de Andreas Schrägle
In reply to @vcunat:matrix.org
i.e. I suspect that restarting the queue runner might indeed help by itself in this situation.
maybe someone should try and fix the queue runner at some point. that's more of a topic for #hydra:nixos.org though.
12:50:58
@rick:matrix.ciphernetics.nl Rick (Mindavi) | I saw some improvements on the queue runner / event system recently, which may help | 13:06:38
@grahamc:nixos.org @grahamc:nixos.org | I doubt it -- that work will help with other issues not really seen often on hydra.nixos.org | 14:10:24
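
The symptom discussed in this thread (a large queue, nothing runnable, yet steps still marked as running) could be checked for with something like the sketch below. It assumes only the per-system shape of the fragment Vladimír quoted, with `runnable` and `running` counts. The function name `possibly_stuck`, the `queued_builds` parameter, and the heuristic itself are hypothetical and not part of Hydra; a system in this state may also simply be busy.

```python
import json


def possibly_stuck(status_by_system, queued_builds):
    """Flag systems with nothing runnable but steps still marked running.

    status_by_system maps a system name to a dict with "runnable" and
    "running" counts, as in the quoted fragment. queued_builds is the
    overall queue size; with an empty queue nothing is flagged.
    This is a heuristic only and cannot prove that a step is stuck.
    """
    flagged = []
    for system, counts in sorted(status_by_system.items()):
        nothing_runnable = counts.get("runnable", 0) == 0
        steps_running = counts.get("running", 0) > 0
        if queued_builds > 0 and nothing_runnable and steps_running:
            flagged.append(system)
    return flagged


# The counts come from the chat; the queue size is the rough "~100k" figure.
fragment = json.loads('{"x86_64-darwin": {"runnable": 0, "running": 29}}')
print(possibly_stuck(fragment, queued_builds=100_000))  # prints ['x86_64-darwin']
```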


Room Version: 6