4 Apr 2023 |
hexa | you just backport stuff into it and move on | 13:08:11 |
cole-h | lmao | 13:08:15 |
raitobezarius | In reply to @hexa:lossy.network "you just backport stuff into it and move on": greg k-h enters the channel | 14:47:01 |
cole-h | Found this thread: https://lkml.org/lkml/2023/3/16/765
So while it's not 6.2 as that thread mentions, may be the same problem | 14:47:34 |
hexa | kernel downgrade when | 14:52:46 |
hexa | would also be interesting to know what its previous kernel version was | 14:55:45 |
cole-h | 🤷 the box is unpinned, but likely 6.1.21 was its previous version | 14:56:08 |
cole-h | ok, 6.1.21 is also busted
[ 110.726426] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 110.732340] rcu: 44-...0: (0 ticks this GP) idle=9b44/1/0x4000000000000000 softirq=2863/2863 fqs=2462
[ 110.741636] (detected by 70, t=5255 jiffies, g=14289, q=8797 ncpus=80)
[ 110.748238] Task dump for CPU 44:
[ 110.751540] task:kworker/u160:1 state:R running task stack:0 pid:419 ppid:2 flags:0x0000000a
[ 110.761443] Workqueue: efi_rts_wq efi_call_rts
[ 110.765878] Call trace:
[ 110.768312] __switch_to+0xf0/0x170
[ 110.771791] 0xffff07ff85645b80
| 15:38:21 |
cole-h | nvm it's still 6.1.22 somehow | 15:39:29 |
cole-h | [ 242.441034] INFO: task kworker/u160:0:9 blocked for more than 120 seconds.
[ 242.447910] Tainted: P O 6.1.22 #1-NixOS
[ 242.453735] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.461553] task:kworker/u160:0 state:D stack:0 pid:9 ppid:2 flags:0x00000008
[ 242.469895] Workqueue: events_freezable_power_ sync_hw_clock
[ 242.475548] Call trace:
[ 242.477986] __switch_to+0xf0/0x170
[ 242.481467] __schedule+0x30c/0x1254
[ 242.485035] schedule+0x58/0xec
[ 242.488164] schedule_timeout+0x14c/0x180
[ 242.492165] __wait_for_common+0xd4/0x250
[ 242.496165] wait_for_completion+0x28/0x34
[ 242.500251] virt_efi_set_time+0x114/0x190
[ 242.504339] efi_set_time+0x84/0xc0
[ 242.507818] rtc_set_time+0xc0/0x1c4
[ 242.511385] sync_hw_clock+0x1ac/0x230
[ 242.515123] process_one_work+0x1f4/0x460
[ 242.519124] worker_thread+0x188/0x4e0
[ 242.522863] kthread+0xe0/0xe4
[ 242.525908] ret_from_fork+0x10/0x20
| 15:39:43 |
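The hung-task dump above is easier to compare across kernel versions once it is reduced to its function names. This is a minimal sketch, assuming the dmesg lines keep the `[ timestamp] symbol+0xoff/0xsize` shape seen in the paste. The helper name `call_trace_symbols` and the regex are illustrative only and not part of any kernel tooling.

```python
import re

# One call-trace frame, e.g. "[ 242.500251] virt_efi_set_time+0x114/0x190".
# Assumption: frames always look like "symbol+0xOFFSET/0xSIZE" after the timestamp.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def call_trace_symbols(log_text):
    """Return the function names listed under 'Call trace:', innermost frame first.

    Collection stops at the first line that is not a symbol frame
    (for example a raw address such as 0xffff07ff85645b80).
    """
    symbols = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.match(line.strip())
            if match:
                symbols.append(match.group(1))
            else:
                in_trace = False
    return symbols


# A few frames copied from the 6.1.22 paste above.
sample = """\
[ 242.475548] Call trace:
[ 242.496165] wait_for_completion+0x28/0x34
[ 242.500251] virt_efi_set_time+0x114/0x190
[ 242.504339] efi_set_time+0x84/0xc0
"""
print(call_trace_symbols(sample))
# prints ['wait_for_completion', 'virt_efi_set_time', 'efi_set_time']
```

Read this way, the blocked worker is sitting in `wait_for_completion` under `virt_efi_set_time`, which lines up with the earlier RCU stall on the `efi_rts_wq` workqueue.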
hexa | In reply to @cole-h:matrix.org not yet (because it's not easy, if possible lol) 🤡 | 15:40:13 |
cole-h | oh I missed something lol, let's try again | 15:55:36 |
cole-h | [ 109.650254] rcu: rcu_sched kthread timer wakeup didn't happen for 3034 jiffies! g13841 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 109.661456] rcu: Possible timer handling issue on cpu=44 timer-softirq=210
[ 109.668404] rcu: rcu_sched kthread starved for 3040 jiffies! g13841 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=44
[ 109.678737] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 109.687682] rcu: RCU grace-period kthread stack dump:
[ 109.692720] task:rcu_sched state:I stack:0 pid:14 ppid:2 flags:0x00000008
[ 109.701057] Call trace:
[ 109.703490] __switch_to+0xf0/0x170
[ 109.706967] __schedule+0x30c/0x1254
[ 109.710530] schedule+0x58/0xec
[ 109.713659] schedule_timeout+0xa4/0x180
[ 109.717571] rcu_gp_fqs_loop+0x138/0x4ac
[ 109.721483] rcu_gp_kthread+0x1d4/0x210
[ 109.725307] kthread+0xe0/0xe4
[ 109.728350] ret_from_fork+0x10/0x20
welp
| 16:22:23 |
cole-h | trying kernel 6.1.19 (yes I skipped 6.1.20 because I didn't want to find the last nixos-unstable eval that had 6.1.20) | 16:31:58 |
cole-h | and further back we go
[ 726.247262] INFO: task kworker/u160:3:519 blocked for more than 604 seconds.
[ 726.254303] Tainted: P O 6.1.19 #1-NixOS
[ 726.260124] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 726.267943] task:kworker/u160:3 state:D stack:0 pid:519 ppid:2 flags:0x00000008
[ 726.276283] Workqueue: events_freezable_power_ sync_hw_clock
[ 726.281933] Call trace:
[ 726.284368] __switch_to+0xf0/0x170
[ 726.287848] __schedule+0x30c/0x1254
[ 726.291414] schedule+0x58/0xec
[ 726.294542] schedule_timeout+0x14c/0x180
[ 726.298542] __wait_for_common+0xd4/0x250
[ 726.302541] wait_for_completion+0x28/0x34
[ 726.306627] virt_efi_set_time+0x114/0x190
[ 726.310713] efi_set_time+0x84/0xc0
[ 726.314192] rtc_set_time+0xc0/0x1c4
[ 726.317757] sync_hw_clock+0x1ac/0x230
[ 726.321495] process_one_work+0x1f4/0x460
[ 726.325496] worker_thread+0x188/0x4e0
[ 726.329235] kthread+0xe0/0xe4
[ 726.332279] ret_from_fork+0x10/0x20
| 17:14:35 |
cole-h | maybe I should see where the maybe-problematic commit first appeared and go to the release prior to that lol | 17:14:50 |
hexa | inb4: this is broken since we moved stable to 6.1 | 17:15:06 |
cole-h | 😅 | 17:15:22 |
cole-h | when was that? | 17:15:24 |
hexa | checking | 17:15:35 |
hexa | but weeks | 17:15:38 |
cole-h | then it's possible | 17:15:48 |
cole-h | (since I've gone back weeks now lol) | 17:15:59 |
hexa | hence my initial question about the previous kernel 😄 | 17:16:05 |
hexa | https://github.com/NixOS/nixpkgs/pull/215313 | 17:16:37 |
hexa | merged on march 3rd | 17:16:40 |
hexa | I assume the hydra builders are still on 5.15? | 17:17:42 |
cole-h | No idea, that is outside my domain | 17:19:04 |
cole-h | Trying 10e51cdc0456f1d5c8a00f026c384f0e81126538 (the last nixos-unstable eval before 6.1 became the default) | 17:19:25 |