| 20 Jan 2025 |
iridium | Some folder in /mnt | 19:13:05 |
iridium | reproduced it, let me skim the logs | 19:18:01 |
iridium | I sent you the complete journal dump in private. I'm slightly confused that the journal no longer shows the "protocol error" message, but otherwise the symptoms are pretty much the same | 19:22:21 |
iridium | This is the output from nixos-switch: https://pastebin.com/raw/usHgRPJL | 19:23:26 |
iridium | "restarting systemd" should be at ~20:15:40 in that log | 19:29:05 |
ElvishJerricco | iridium: This log is not showing the same symptom | 19:31:53 |
ElvishJerricco | The key line should be "Freezing execution" | 19:32:06 |
ElvishJerricco | and it's not in there | 19:32:09 |
ElvishJerricco | "Freezing execution" is how you know systemd is down and causes all the "no reply" things | 19:32:38 |
ElvishJerricco | * "Freezing execution" is how you know systemd is down and causes all the "Did not receive a reply" things | 19:32:54 |
iridium | Hmm, yes | 19:36:40 |
iridium | Ah! I think that's because "Freezing execution" shows up quite a bit later | 19:37:38 |
ElvishJerricco | it does? I grep'd the file and didn't see it | 19:37:53 |
iridium | That's because I grabbed the log at the moment the rebuild failed. Give me some time, I'll rebuild with debug again... | 19:38:51 |
iridium | (I just retried without debugging enabled. "Freezing execution" happened 2min after the failed restart) | 19:39:23 |
ElvishJerricco | oh really? | 19:39:29 |
ElvishJerricco | that's very surprising | 19:39:31 |
ElvishJerricco | iridium: This time, try to get the journal with -t systemd | 19:39:40 |
ElvishJerricco | just to avoid the logspam | 19:39:44 |
ElvishJerricco | we shouldn't need any other logs than systemd's | 19:39:53 |
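A minimal sketch of the check discussed above: pull only systemd's own journal entries (`-t systemd`) and look for the "Freezing execution" line. The helper names (`journal_command`, `read_journal`, `find_marker`) are made up for illustration; the `-b` (current boot only) and `--no-pager` flags are additions not mentioned in the chat.

```python
import subprocess

# The line that, per the discussion above, shows systemd (PID 1) has stopped.
MARKER = "Freezing execution"


def journal_command(identifier="systemd"):
    """Build a journalctl call limited to one syslog identifier.

    -t filters by identifier (as suggested in the chat); -b and --no-pager
    are assumptions added here to keep the output short and scriptable.
    """
    return ["journalctl", "-b", "-t", identifier, "--no-pager"]


def read_journal(identifier="systemd"):
    """Run journalctl and return its output as a list of lines."""
    result = subprocess.run(
        journal_command(identifier),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def find_marker(lines, marker=MARKER):
    """Return (index, line) for the first line containing marker, else None."""
    for index, line in enumerate(lines):
        if marker in line:
            return index, line
    return None
```

Calling `find_marker(read_journal())` a few minutes after the failed restart, rather than immediately, matters here: a `None` result can simply mean the log was captured before the message appeared.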
iridium | ack | 19:40:02 |
ElvishJerricco | also, if you feel like trying something nifty, you can try using magic-wormhole to send me the log. Just a bit nicer than matrix uploads. | 19:42:44 |
iridium | Relevant timestamps (approximate!):
"restarting systemd..." at roughly 20:41:06.
"Error: Failed to reset failed units" at roughly 20:41:20 (give or take). | 19:45:06 |
ElvishJerricco | Ok I think all the "Closing set fd 657"-style messages indicate I was right about my hunch of the file descriptor pool thingymajig | 19:47:49 |
ElvishJerricco | trying to remember what systemd calls that correctly so I can actually look it up :P | 19:48:14 |
ElvishJerricco | iridium: https://systemd.io/FILE_DESCRIPTOR_STORE/ | 19:49:39 |
ElvishJerricco | this | 19:49:40 |
ElvishJerricco | the "protocol error" makes me think there's something up with this store | 19:50:07 |
ElvishJerricco | it could be that the expected state of this store changes between releases | 19:50:20 |
ElvishJerricco | it could be that there's a bug in restoring the state after reexec | 19:50:30 |
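For context on the file descriptor store linked above, here is a rough sketch of the service-side half of that protocol: a service hands descriptors to systemd with an `sd_notify()` message containing `FDSTORE=1` (optionally `FDNAME=...`), and gets them back on restart through the `LISTEN_PID`, `LISTEN_FDS` and `LISTEN_FDNAMES` environment variables, numbered from fd 3. The function name `stored_fds` is hypothetical, and this does not model how PID 1 itself carries the store across a reexec, which is what the two hypotheses above are about.

```python
import os

# First inherited descriptor number, as documented for sd_listen_fds(3).
SD_LISTEN_FDS_START = 3


def stored_fds(environ=None, pid=None):
    """Map fd-store names to the descriptor numbers passed by systemd.

    environ and pid are injectable only so the function can be exercised
    without running under systemd; by default the real process values are used.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid

    # The variables only apply to the process they were set for.
    if environ.get("LISTEN_PID") != str(pid):
        return {}

    count = int(environ.get("LISTEN_FDS", "0"))
    raw_names = environ.get("LISTEN_FDNAMES", "")
    names = raw_names.split(":") if raw_names else []

    result = {}
    for offset in range(count):
        # Fallback label for descriptors without a name (an assumption of
        # this sketch, mirroring the "unknown" default in the sd-daemon docs).
        name = names[offset] if offset < len(names) else "unknown"
        result.setdefault(name, []).append(SD_LISTEN_FDS_START + offset)
    return result
```

Several descriptors may share one name, which is why each name maps to a list rather than a single number.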