!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

422 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
132 Servers


Sender Message Time
8 Oct 2021
@zimbatm:numtide.comJonas ChevalierI added some templates to the repo and looking for feedback: https://github.com/NixOS/nixos-org-configurations/issues/new/choose The goal is to make it easier for users to report issues with the infrastructure, and in the future request access or new resources to be deployed.07:33:59
10 Oct 2021
@hexa:lossy.networkhexaThere seems to be a caching issue with channels.nixos.org https://github.com/NixOS/nixos-org-configurations/issues/169#issuecomment-939478669 14:17:48
@sternenseemann:systemli.orgsternithis has been like this for over a week (two?)14:18:58
@sternenseemann:systemli.orgsterniseems like forever14:19:00
@sternenseemann:systemli.orgsternialso: has like any darwin build happened in the last week?14:48:22
@k900:0upti.meK900It feels like some of the CDN nodes are being weirdn15:03:57
@k900:0upti.meK900* It feels like some of the CDN nodes are being weird15:04:00
@k900:0upti.meK900I've been hitting really old builds for a while but it seems OK now15:04:10
11 Oct 2021
@charlotte:vanpetegem.mechvp joined the room.06:45:43
@r-burns:matrix.orgRyan Burns joined the room.07:58:59
@r-burns:matrix.orgRyan Burns
In reply to @sternenseemann:systemli.org
also: has like any darwin build happened in the last week?
nope. correct me if I'm wrong but based on https://hydra.nixos.org/status every darwin builder has been stuck on the same job for going on 7 days now
08:01:43
@r-burns:matrix.orgRyan Burns
In reply to @sternenseemann:systemli.org
also: has like any darwin build happened in the last week?
* nope. correct me if I'm wrong but based on https://hydra.nixos.org/status every x86_64-darwin builder has been stuck on the same job for going on 7 days now
08:02:08
@toonn:matrix.orgtoonn Hmm, looks like there's 7 idle x86_64-darwin builders rn even though the queue is massive ~100k. Why might this happen? 11:54:00
@lukegb:zxcvbnm.ninjalukegb (he/him)Possibly some stdenv jobs are stuck on a broken worker and blocking everything else?11:54:44
@toonn:matrix.orgtoonn Do the Darwin builders require babysitting? I'm willing to do that for a bit if it's necessary. 11:54:45
@andreas.schraegle:helsinki-systems.deAndreas Schräglehm. on our private hydra, we just restart the queue runner regularly, because things get stuck. I assume that's less practical on hydra.nixos.org.11:58:18
@vcunat:matrix.orgVladimír Čunát

Yes it actually looks like everything is waiting on build steps that are in some kind of stuck state:

      "x86_64-darwin" : {
         "runnable" : 0,
         "running" : 29
      },
12:32:41
@vcunat:matrix.orgVladimír Čunáti.e. I suspect that restarting the queue runner might indeed help by itself in this situation.12:33:33
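
The check described above (a platform with running steps but nothing runnable) can be sketched in Python. This is only an illustration: the function name `stalled_platforms` is hypothetical, the input is assumed to be a mapping from platform name to `runnable`/`running` counts shaped like the fragment quoted above, and the `x86_64-linux` numbers are made up for contrast. How to fetch the real status data is not shown in this log and is left out.

```python
import json


def stalled_platforms(machine_types):
    """Return platforms that have running steps but no runnable ones.

    This is only a hint, not proof of a stall: a healthy platform whose
    queue has just drained looks the same.
    """
    return sorted(
        name
        for name, counts in machine_types.items()
        if counts.get("runnable", 0) == 0 and counts.get("running", 0) > 0
    )


# The x86_64-darwin entry is copied from the message above.
# The x86_64-linux entry is invented for contrast (assumption).
sample = json.loads("""
{
  "x86_64-darwin": {"runnable": 0, "running": 29},
  "x86_64-linux": {"runnable": 1200, "running": 150}
}
""")

print(stalled_platforms(sample))
```

A result like this would need to be combined with the queue length and the age of the running steps before concluding that a restart is warranted.
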
@grahamc:nixos.org@grahamc:nixos.orgI can help with that, also I think I can bring up some more capacity today12:49:42
@grahamc:nixos.org@grahamc:nixos.orgsorry for being MIA, I haven't been feeling myself the past few weeks12:49:50
@andreas.schraegle:helsinki-systems.deAndreas Schrägle
In reply to @vcunat:matrix.org
i.e. I suspect that restarting the queue runner might indeed help by itself in this situation.
maybe someone should try and fix the queue runner at some point. that's more of a topic for #hydra:nixos.org though.
12:50:58
@rick:matrix.ciphernetics.nlRick (Mindavi)I saw some improvements on the queue runner / event system recently, which may help13:06:38
@grahamc:nixos.org@grahamc:nixos.orgI doubt it -- that work will help with other issues not really seen often on hydra.nixos.org14:10:24
@grahamc:nixos.org@grahamc:nixos.org(esp. around declarative jobsets and other plugins that expect read-after-write from their cache)14:12:58
@niksnut:matrix.orgniksnut joined the room.14:20:18
@niksnut:matrix.orgniksnutI don't see a lot of interesting things in the queue runner log, except that mac1, mac6, mac8 and mac9 are unreachable or don't accept our key14:21:24
@niksnut:matrix.orgniksnutalso root@147.75.32.151 ("bigmac") is giving an error about the host key changing14:22:25
12 Oct 2021
@vcunat:matrix.orgVladimír ČunátApparently some of the x86_64-darwin machines immediately got stuck again, in "Sending inputs" phase.06:30:54
@vcunat:matrix.orgVladimír ČunátSo again, we seem to be stuck long-term in a state with a relatively long queue but no runnable steps (for this platform).06:32:31
@andi:kack.itandi-I'd love to know what the bottleneck on the infra team side here is. Is it hardware? Is it time? Is it lack of interest (in that platform)? What can the wider community do to support your work? Should we give up on darwin?12:46:05
