| 21 Oct 2021 |
Janne Heß | I also fixed the flakiness that you mentioned, K900, without rewriting the entire function ;) | 19:02:44 |
baloo | (qemu is in uninterruptible sleep, and dumping the kernel stack might prove useful) | 19:02:47 |
K900 | Which one? | 19:02:50 |
Janne Heß | https://github.com/NixOS/nixpkgs/pull/142498 | 19:03:13 |
Janne Heß | this one | 19:03:14 |
K900 | Oh yeah nice | 19:03:30 |
K900 | I was less going to rewrite that particular function | 19:04:15 |
K900 | And more going to rewrite the whole damn thing | 19:04:20 |
K900 | But that is nice | 19:04:23 |
Janne Heß | okay, back at the end of the test script | 19:07:59 |
Janne Heß | so what's the idea? what should I do? | 19:08:14 |
K900 | In reply to @baloo_:matrix.org may I suggest find /proc/$(pidof qemu)/ -name stack -print -exec cat {} \; as well? This is probably a good start | 19:13:47 |
Janne Heß | In reply to @baloo_:matrix.org may I suggest find /proc/$(pidof qemu)/ -name stack -print -exec cat {} \; as well? /proc/41543/task/41543/stack
[<0>] do_sys_poll+0x3ab/0x5b0
[<0>] __x64_sys_ppoll+0xbc/0x150
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
/proc/41543/task/41551/stack
[<0>] futex_wait_queue_me+0xb6/0x110
[<0>] futex_wait+0xe9/0x240
[<0>] do_futex+0x174/0xbf0
[<0>] __x64_sys_futex+0x146/0x1c0
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
/proc/41543/task/41556/stack
[<0>] kvm_vcpu_block+0x58/0x2f0 [kvm]
[<0>] kvm_arch_vcpu_ioctl_run+0x6c4/0x1720 [kvm]
[<0>] kvm_vcpu_ioctl+0x211/0x5a0 [kvm]
[<0>] __x64_sys_ioctl+0x83/0xb0
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
/proc/41543/task/41704/stack
[<0>] futex_wait_queue_me+0xb6/0x110
[<0>] futex_wait+0xe9/0x240
[<0>] do_futex+0x174/0xbf0
[<0>] __x64_sys_futex+0x146/0x1c0
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
/proc/41543/stack
[<0>] do_sys_poll+0x3ab/0x5b0
[<0>] __x64_sys_ppoll+0xbc/0x150
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
| 19:24:53 |
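The per-task stack dump above can be reproduced with a small Python helper. This is a sketch, not part of the test driver; the `proc_root` parameter is an assumption added so the traversal can be exercised against a fake `/proc` layout (reading real `/proc/<pid>/stack` files usually requires root):

```python
from pathlib import Path

def dump_kernel_stacks(pid: int, proc_root: str = "/proc") -> dict[str, str]:
    """Collect the kernel stack of every task of `pid`, equivalent to
    `find /proc/<pid>/ -name stack -print -exec cat {} \\;`."""
    stacks = {}
    # rglob finds both /proc/<pid>/stack and /proc/<pid>/task/<tid>/stack
    for stack_file in sorted(Path(proc_root, str(pid)).rglob("stack")):
        try:
            stacks[str(stack_file)] = stack_file.read_text()
        except PermissionError:
            stacks[str(stack_file)] = "<unreadable>"
    return stacks
```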
baloo | Not entirely sure I read the test driver correctly, but we're just sending it a SIGTERM, right? | 19:30:44 |
baloo | one option would be to just SIGKILL it, which should work and is just fine (we don't care about the consistency of any open files; they are either read-only or about to be deleted) | 19:32:07 |
K900 | Yes | 19:32:08 |
K900 | I find it weird that it seems to lock up when sigterm'd | 19:32:29 |
K900 | On one hand, we don't care, and it's probably qemu's fault | 19:32:36 |
K900 | On the other hand, it really shouldn't do that | 19:32:40 |
baloo | another option is to ask qemu to quit, like:
echo "quit" | socat STDIO UNIX-CONNECT:./console.pipe
with qemu started like:
-mon chardev=con0,mode=readline \
-chardev socket,id=con0,path=./console.pipe,server,nowait \
| 19:32:59 |
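The `socat` one-liner above can also be sketched in plain Python. This is a minimal illustration, assuming the monitor socket path is whatever the `-chardev` flag points at (the `./console.pipe` path here is hypothetical):

```python
import socket

def send_monitor_quit(path: str = "./console.pipe") -> None:
    """Connect to qemu's readline-mode monitor socket and ask it to quit,
    equivalent to: echo "quit" | socat STDIO UNIX-CONNECT:<path>"""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(b"quit\n")
```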
K900 | There's already a socket for that | 19:33:18 |
K900 | It's what all the commands run through | 19:33:23 |
baloo | the socket the commands run through is the serial console of the inner vm | 19:33:49 |
baloo | this is the monitor socket of qemu itself | 19:33:58 |
K900 | There's a second socket for the monitor too | 19:34:02 |
baloo | diff --git a/nixos/lib/test-driver/test-driver.py b/nixos/lib/test-driver/test-driver.py
index e659b0c04f5..1518bf3562a 100755
--- a/nixos/lib/test-driver/test-driver.py
+++ b/nixos/lib/test-driver/test-driver.py
@@ -1021,6 +1021,7 @@ class Machine:
assert self.process
assert self.shell
assert self.monitor
+ self.send_monitor_command("quit")
self.process.terminate()
self.shell.close()
self.monitor.close()
I'd give this a try
| 19:36:41 |
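The diff above just prepends a monitor `quit` before `terminate()`. A more defensive variant (a sketch under the same idea, not the actual test-driver code) would escalate from the polite quit to SIGTERM and finally SIGKILL with timeouts, so a hung qemu can never wedge the driver; `send_quit` stands in for `self.send_monitor_command("quit")`:

```python
import subprocess

def shutdown_vm(process: subprocess.Popen, send_quit, timeout: float = 10.0) -> int:
    """Escalating shutdown: monitor `quit`, then SIGTERM, then SIGKILL."""
    try:
        send_quit()
    except Exception:
        pass  # the monitor connection may already be gone
    for escalate in (process.terminate, process.kill):
        try:
            return process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            escalate()
    return process.wait()  # SIGKILL cannot be ignored, so this returns
```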
Janne Heß | I'll do that | 19:36:57 |
K900 | https://github.com/NixOS/nixpkgs/blob/master/nixos/lib/test-driver/test-driver.py#L957 | 19:37:14 |