Testing with Nix
| Message | Time |
|---|---|
| 27 Oct 2023 | |
| Oh right, only on the orchestrator. My bad. | 08:34:38 |
| How am I not in this room | 08:35:05 |
| to protect you | 08:35:13 |
| Feck. That does mean I need to keep digging into arcane shell tools to fix my tests. | 08:36:07 |
| 29 Oct 2023 | |
| Robert Hensing (roberth): do you have leftover concerns for the timeout PR or would you allow me to send it? | 15:56:05 |
| 31 Oct 2023 | |
| I'm starting to think some unreliable test runs — "waiting for the VM to finish booting" never being paired up with a "(finished: waiting for the VM to finish booting, in N.M seconds)" — could be caused by KVM not being set up. I don't know which level this happens at, but I'm running the | 02:53:07 |
| GitLab managed runners aren't guaranteed to have KVM AFAIK | 07:55:16 |
| In reply to @k900:0upti.me: Oh, OK. I couldn't find a definitive answer for this anywhere, so that makes sense. | 08:59:21 |
| They have it sometimes IME | 08:59:30 |
| But they use multiple cloud providers for those | 08:59:46 |
| Still, I have no idea why my tests are failing randomly, and even less how to fix it now. | 08:59:46 |
| Well if there's no KVM it will fall back to software emulation | 09:00:28 |
| Which is not fast | 09:00:29 |
| So it'll probably just time out | 09:00:39 |
| Oh, by the way, they don't always fail when KVM is missing. | 09:00:51 |
| So it can't be as simple as that, either. | 09:01:05 |
| Well they can succeed if the CPU is fast enough | 09:01:35 |
| To actually run the JIT at a reasonable speed | 09:01:42 |
| I haven't found a single common theme that separates 8-minute successes from 3-hour failures, except the obvious one: nixosTests' start_all never seems to finish during failed runs. | 09:02:20 |
| And that's despite both nodes running happily, spitting out journalctl rotation messages every few minutes. | 09:02:50 |
| So I'm starting to be more inclined towards start_all being buggy, and not actually detecting the system startup properly. | 09:03:28 |
| To debug this properly, you'd need to attach all sorts of logs from the firmware, the kernel and QEMU itself | 09:53:42 |
| Otherwise this is all speculation | 09:53:47 |
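
The fallback behaviour discussed above is observable from inside a CI job: QEMU only uses hardware acceleration when `/dev/kvm` exists and is readable and writable, and otherwise drops down to TCG software emulation, which can turn an 8-minute run into one that hits its timeout. A minimal Python sketch of such a probe; the `kvm_available` helper name is hypothetical, not part of any framework mentioned in the chat:

```python
import os


def kvm_available() -> bool:
    """Return True when hardware virtualization is usable by QEMU.

    QEMU's -enable-kvm needs read/write access to /dev/kvm; if the
    device is absent (common on cloud CI runners without nested
    virtualization), QEMU falls back to TCG software emulation.
    """
    return os.access("/dev/kvm", os.R_OK | os.W_OK)


if __name__ == "__main__":
    if kvm_available():
        print("KVM available: VM tests should run at near-native speed")
    else:
        print("no /dev/kvm: expect slow TCG emulation and possible timeouts")
```

Running this as an early CI step would at least separate "runner landed on a host without KVM" failures from genuine test-driver bugs.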
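On the start_all side, one mitigation while the root cause is unknown is to bound the waits the test script itself controls, so a run on a TCG-only runner fails quickly instead of stalling for hours. A rough sketch of a NixOS testScript body, assuming a single node named `machine` and a nixpkgs recent enough for `wait_for_unit` to accept a `timeout` argument (in seconds):

```python
# Sketch of a NixOS testScript body. It runs inside the NixOS test
# driver, which provides start_all() and one object per node; a single
# node named "machine" is assumed here.
start_all()

# Bound the boot wait explicitly so a runner stuck on software
# emulation fails after ten minutes rather than hanging indefinitely.
machine.wait_for_unit("multi-user.target", timeout=600)

# Trivial smoke check once the target is reached.
machine.succeed("uptime")
```

This does not explain the hangs, but it converts "never finishing" into a bounded, diagnosable failure, which is a better starting point for attaching the firmware, kernel and QEMU logs suggested above.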