| 3 Sep 2021 |
Vladimír Čunát | * It seems quite a waste, and apparently I have to restart the stuff manually as well. | 16:59:35 |
@grahamc:nixos.org | https://monitoring.nixos.org/prometheus/graph?g0.expr=up%7Binstance%3D%22546ef6b6.packethost.net%22%7D&g0.tab=0&g0.stacked=0&g0.range_input=1w | 17:00:40 |
@grahamc:nixos.org | taking a look at this one machine's up status over the past week indicates that indeed it booted, lived for about 25 minutes, and then terminated | 17:01:13 |
@grahamc:nixos.org | it looks like their spot market is churning indeed | 17:03:39 |
Vladimír Čunát | Maybe I'd prefer scheduling big-parallel builds to machines that seem likely to live longer. Assuming this is a common situation and we can tell in advance if it might happen. | 17:03:54 |
@grahamc:nixos.org | this isn't a common situation really | 17:04:11 |
@grahamc:nixos.org | the machines we get are usually very stable and long lived | 17:04:18 |
Vladimír Čunát | Then it's OK, I think. Jobs need some babysitting anyway. | 17:04:49 |
@grahamc:nixos.org | sometimes their spot market auction process gets into a bad state and churns and churns | 17:05:02 |
Vladimír Čunát | Black Friday today? | 17:07:54 |
baloo | no it's in November | 17:08:47 |
@grahamc:nixos.org | https://monitoring.nixos.org/grafana/d/wiaOmQ4nk/equinix-metal-churn?orgId=1&refresh=10s&var-instance_class=m1.xlarge.x86&var-facility=All | 17:20:39 |
@grahamc:nixos.org | you can see the churn very clearly here | 17:20:56 |
@grahamc:nixos.org | and one of the causes of the spot price in the top graph | 17:21:05 |
@grahamc:nixos.org | ams1 isn't super happy either | 17:23:27 |
@grahamc:nixos.org | I'll take both out | 17:23:30 |
@grahamc:nixos.org | well now I've really stepped in it | 17:29:44 |
@grahamc:nixos.org | recreating the spot market request deleted all the existing requests (I expected this), then a bug in either the Terraform provider or their API made the "create" step fail | 17:42:37 |
@grahamc:nixos.org | working with their support team | 18:05:02 |
@grahamc:nixos.org | looks like we're getting hardware upgrades | 18:47:25 |
@grahamc:nixos.org | I did not expect to spend half the work day on this 🙃 | 19:05:53 |
| 4 Sep 2021 |
@grahamc:nixos.org | okay, I finished the work to get these new classes of hardware in hydra | 03:01:31 |
@grahamc:nixos.org | went from machines with:
2x 2 x Intel Xeon E5-2650 v4 (ie: 24 cores, 2.2ghz) and 256G ram
to:
1x 1 x AMD EPYC 7402P (ie: 24 cores, 2.8ghz) and 64G RAM
and 1x 1 x AMD EPYC 7402P (ie: 24 cores, 2.8ghz) and 256G RAM | 03:03:12 |
@grahamc:nixos.org | that second machine has 2x 25Gbps network connections :drool: | 03:03:41 |
@grahamc:nixos.org | * went from machines with:
2x 2 x Intel Xeon E5-2650 v4 (ie: 24 cores, 2.2ghz) and 256G ram
to:
1x 1 x AMD EPYC 7402P (ie: 24 cores, 2.8ghz) and 64G RAM
and 1x 1 x AMD EPYC 7502P (ie: 32 cores @ 2.5Ghz) and 256G RAM | 03:05:43 |
lukegb (he/him) | Ooh | 03:20:14 |
lukegb (he/him) | Thanks so much for your work grahamc (he/him) ❤️ | 03:20:34 |
@grahamc:nixos.org | :) | 03:22:29 |
@grahamc:nixos.org | it is possible I broke the aarch64 machines while updating the channel | 03:43:04 |
@grahamc:nixos.org | I'm not too stressed about that, but I'm also not likely to fix that tonight | 03:43:18 |