| 4 Apr 2023 |
cole-h | oh I missed something lol, let's try again | 15:55:36 |
cole-h | [ 109.650254] rcu: rcu_sched kthread timer wakeup didn't happen for 3034 jiffies! g13841 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 109.661456] rcu: Possible timer handling issue on cpu=44 timer-softirq=210
[ 109.668404] rcu: rcu_sched kthread starved for 3040 jiffies! g13841 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=44
[ 109.678737] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 109.687682] rcu: RCU grace-period kthread stack dump:
[ 109.692720] task:rcu_sched state:I stack:0 pid:14 ppid:2 flags:0x00000008
[ 109.701057] Call trace:
[ 109.703490] __switch_to+0xf0/0x170
[ 109.706967] __schedule+0x30c/0x1254
[ 109.710530] schedule+0x58/0xec
[ 109.713659] schedule_timeout+0xa4/0x180
[ 109.717571] rcu_gp_fqs_loop+0x138/0x4ac
[ 109.721483] rcu_gp_kthread+0x1d4/0x210
[ 109.725307] kthread+0xe0/0xe4
[ 109.728350] ret_from_fork+0x10/0x20
welp
| 16:22:23 |
cole-h | trying kernel 6.1.19 (yes I skipped 6.1.20 because I didn't want to find the last nixos-unstable eval that had 6.1.20) | 16:31:58 |
cole-h | and further back we go
[ 726.247262] INFO: task kworker/u160:3:519 blocked for more than 604 seconds.
[ 726.254303] Tainted: P O 6.1.19 #1-NixOS
[ 726.260124] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 726.267943] task:kworker/u160:3 state:D stack:0 pid:519 ppid:2 flags:0x00000008
[ 726.276283] Workqueue: events_freezable_power_ sync_hw_clock
[ 726.281933] Call trace:
[ 726.284368] __switch_to+0xf0/0x170
[ 726.287848] __schedule+0x30c/0x1254
[ 726.291414] schedule+0x58/0xec
[ 726.294542] schedule_timeout+0x14c/0x180
[ 726.298542] __wait_for_common+0xd4/0x250
[ 726.302541] wait_for_completion+0x28/0x34
[ 726.306627] virt_efi_set_time+0x114/0x190
[ 726.310713] efi_set_time+0x84/0xc0
[ 726.314192] rtc_set_time+0xc0/0x1c4
[ 726.317757] sync_hw_clock+0x1ac/0x230
[ 726.321495] process_one_work+0x1f4/0x460
[ 726.325496] worker_thread+0x188/0x4e0
[ 726.329235] kthread+0xe0/0xe4
[ 726.332279] ret_from_fork+0x10/0x20
| 17:14:35 |
cole-h | maybe I should see where the maybe-problematic commit first appeared and go to the release prior to that lol | 17:14:50 |
hexa | inb4: this is broken since we moved stable to 6.1 | 17:15:06 |
cole-h | 😅 | 17:15:22 |
cole-h | when was that? | 17:15:24 |
hexa | checking | 17:15:35 |
hexa | but weeks | 17:15:38 |
cole-h | then it's possible | 17:15:48 |
cole-h | (since I've gone back weeks now lol) | 17:15:59 |
hexa | hence my initial question about the previous kernel 😄 | 17:16:05 |
hexa | https://github.com/NixOS/nixpkgs/pull/215313 | 17:16:37 |
hexa | merged on march 3rd | 17:16:40 |
hexa | I assume the hydra builders are still on 5.15? | 17:17:42 |
cole-h | No idea, that is outside my domain | 17:19:04 |
cole-h | Trying 10e51cdc0456f1d5c8a00f026c384f0e81126538 (the last nixos-unstable eval before 6.1 became the default) | 17:19:25 |
cole-h | I'm not looking forward to bisecting this in the kernel ._. | 17:20:24 |
hexa | right | 17:20:35 |
cole-h | (especially since this box I'm trying to fix is the box I'd want to use as the builder...) | 17:21:07 |
cole-h | 5.15.96 coming (up) soon..... | 17:44:03 |
cole-h | works 🙃 | 17:55:21 |
hexa | yay 🫠 | 17:57:29 |
cole-h | https://github.com/nix-community/aarch64-build-box/pull/152 should fix it until I look into what specifically broke in the kernel (either by bisecting or finding someone who has already done that) | 18:29:27 |
| 5 Apr 2023 |
Vladimír Čunát | In reply to @hexa:lossy.network ("I assume the hydra builders are still on 5.15?"): I think so: https://github.com/NixOS/equinix-metal-builders/blob/main/flake.lock | 07:51:46 |
| 6 Apr 2023 |
hexa | ofborg does not request reviewers on https://github.com/NixOS/nixpkgs/pull/224882 | 13:52:52 |
hexa | even though it found only 10 https://gist.github.com/GrahamcOfBorg/706d89f76298b243ff723b102d6e1c7e | 13:53:09 |
hexa | I think the limit was 15? | 13:53:15 |
cole-h | Nope, limit is 10 | 13:53:28 |