| 3 Sep 2021 |
@grahamc:nixos.org | lol | 15:42:14 |
@grahamc:nixos.org | and have way less stress | 15:42:19 |
Domen Kožar | especially at the moment, everyone is hiring | 15:44:57 |
@grahamc:nixos.org | maybe I should go find a real job ... | 15:48:30 |
Domen Kožar | nah | 15:52:12 |
Domen Kožar | we need more businesses in Nix pushing things forward :) | 15:53:30 |
@grahamc:nixos.org | yeah :) I think this thing will work out okay | 15:54:03 |
Vladimír Čunát | In reply to @grahamc:nixos.org: "that can happen if Equinix is taking or replacing spot instances, between the moments of our automation refreshing keys and builders" Seems to happen quite often: https://hydra.nixos.org/build/151898830#tabs-buildsteps | 16:54:05 |
@grahamc:nixos.org | sometimes they do a lot of shuffling around on their spot market | 16:54:51 |
Vladimír Čunát | Too bad for builds that take many hours. | 16:55:06 |
@grahamc:nixos.org | well, I'm not sure those are related | 16:55:16 |
@grahamc:nixos.org | the ones I'm referring to are the no host / ssh host key change errors | 16:55:50 |
Vladimír Čunát | Ah, OK, I meant the end-of-file abortions. | 16:56:12 |
@grahamc:nixos.org | of course it is possible those end-of-file are coming from a machine that was taken away from us | 16:56:13 |
@grahamc:nixos.org | we can look into that and find out, one sec | 16:56:27 |
@grahamc:nixos.org | s/one sec/10 min/ (finishing lunch) | 16:56:45 |
Vladimír Čunát | If I count attempts on this particular chromium output path that took at least some minutes, it would be at least five of them. | 16:58:54 |
Vladimír Čunát | It seems quite a waste, and apparently I have to restart the stuff manually as well. | 16:59:29 |
@grahamc:nixos.org | https://monitoring.nixos.org/prometheus/graph?g0.expr=up%7Binstance%3D%22546ef6b6.packethost.net%22%7D&g0.tab=0&g0.stacked=0&g0.range_input=1w | 17:00:40 |
@grahamc:nixos.org | taking a look at this one machine's up status over the past week indicates that indeed it booted, lived for about 25 minutes, and then terminated | 17:01:13 |
@grahamc:nixos.org | it looks like their spot market is churning indeed | 17:03:39 |
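The check described above can be reproduced offline as a rough sketch. The Prometheus link encodes the query `up{instance="546ef6b6.packethost.net"}` over a one-week range, and the "lived for about 25 minutes" reading amounts to finding the contiguous stretch where `up` equals 1. The helper names, the 10-minute gap threshold, and the sample data below are all hypothetical and only illustrate the idea; they are not taken from the NixOS monitoring setup.

```python
from urllib.parse import urlparse, parse_qs

# The graph link posted in the chat, split across lines for readability.
GRAPH_URL = (
    "https://monitoring.nixos.org/prometheus/graph"
    "?g0.expr=up%7Binstance%3D%22546ef6b6.packethost.net%22%7D"
    "&g0.tab=0&g0.stacked=0&g0.range_input=1w"
)


def promql_from_graph_url(url):
    """Return (expression, range) encoded in a Prometheus graph URL."""
    params = parse_qs(urlparse(url).query)
    return params["g0.expr"][0], params["g0.range_input"][0]


def up_spans(samples, max_gap=600):
    """Collapse (unix_time, value) samples of `up` into (start, end) spans.

    A span ends when a sample reports 0, or when two consecutive "up"
    samples are more than `max_gap` seconds apart (assumed here to mean
    the target disappeared from scraping in between).
    """
    spans = []
    start = prev = None
    for ts, value in samples:
        gap = prev is not None and ts - prev > max_gap
        if start is not None and (value != 1 or gap):
            spans.append((start, prev))
            start = None
        if value == 1:
            if start is None:
                start = ts
            prev = ts
    if start is not None:
        spans.append((start, prev))
    return spans


expr, time_range = promql_from_graph_url(GRAPH_URL)
print(expr, time_range)

# Hypothetical samples: one scrape every 5 minutes, up for 25 minutes, then down.
samples = [(t, 1) for t in range(0, 1501, 300)] + [(1800, 0)]
for start, end in up_spans(samples):
    print(f"up for {(end - start) / 60} minutes")
```

With real data, the samples would come from the Prometheus range-query API rather than a hand-written list; the span logic stays the same.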
Vladimír Čunát | Maybe I'd prefer scheduling big-parallel builds to machines that seem likely to live longer. Assuming this is a common situation and we can tell in advance if it might happen. | 17:03:54 |
@grahamc:nixos.org | this isn't a common situation really | 17:04:11 |
@grahamc:nixos.org | the machines we get are usually very stable and long lived | 17:04:18 |
Vladimír Čunát | Then it's OK, I think. Jobs need some babysitting anyway. | 17:04:49 |
@grahamc:nixos.org | sometimes their spot market auction process gets into a bad state and churns and churns | 17:05:02 |
Vladimír Čunát | Black Friday today? | 17:07:54 |
baloo | no it's in November | 17:08:47 |
@grahamc:nixos.org | https://monitoring.nixos.org/grafana/d/wiaOmQ4nk/equinix-metal-churn?orgId=1&refresh=10s&var-instance_class=m1.xlarge.x86&var-facility=All | 17:20:39 |