NixOS systemd | 621 Members | |
| NixOS ❤️ systemd | 171 Servers |
| Sender | Message | Time |
|---|---|---|
| 25 Jan 2025 | ||
In reply to @bumperboat:matrix.org check the status of user@$UID.service on the system manager perhaps | 04:24:39 | |
| 26 Jan 2025 | ||
| ElvishJerricco: I think I might have found the reason for my "Freezing execution"-bug, and I think it's pretty mundane, actually 🙂 | 15:40:53 | |
| Or let's say one reason, not sure if it's the only one | 15:41:58 | |
| https://github.com/systemd/systemd/blob/main/src/core/manager.c#L4272-L4279 | 15:42:51 | |
| That's the part that's failing | 15:42:59 | |
| One of the "generators" systemd is trying to start is this one here: https://github.com/systemd/systemd/blob/main/src/fstab-generator/fstab-generator.c | 15:43:16 | |
That one calls canonicalize_mount_path on every mounted filesystem: https://github.com/systemd/systemd/blob/main/src/fstab-generator/fstab-generator.c#L859 | 15:43:47 | |
That one calls chase() on every path of a mounted FS (one of them being an NFS mount in my case): https://github.com/systemd/systemd/blob/main/src/fstab-generator/fstab-generator.c#L871 | 15:44:18 | |
Now, that one will call fstat() on the filesystem mount path (ultimately...): https://github.com/systemd/systemd/blob/main/src/basic/chase.c#L401 | 15:45:27 | |
Now, this code runs in the middle of the switch-to-configuration run, which means: NetworkManager is stopped, and the system currently doesn't have any network. Accesses to NFS shares will just... hang (indefinitely). | 15:46:12 | |
| Thus, the fstab generator will hang (indefinitely) | 15:46:31 | |
https://github.com/systemd/systemd/blob/main/src/core/manager.c#L4272 this one also calls waitpid() internally - with some timeout, that timeout being a lot larger than the timeout in switch-to-configuration. So it first fails on the console and then spits out an error message in the kernel ringbuffer 2min later or so | 15:51:11 | |
| Now, the interesting question is if there's any other code that would access mounted FSes during restart... But that definitely seems to be one of them. | 15:52:17 | |
In reply to @iridium:faui2k11.de The fsck services tend to access file systems too, and I did have issues with those hanging as well. But that is unrelated; it was down to failed hardening attempts. Depending on your setup, PAM mounts might try to access file systems too - likely not for a running session, but if you have unfortunate timing on a login using PAM (e.g. ssh) then uhh that might explode? This might especially run into similar issues on a weird LDAP-based system where users have their home directories essentially as network drives | 23:22:10 | |
| What I find surprising: by your findings, this is basically just systemd+unavailable file systems, how is this not a known bug upstream?? | 23:23:55 | |
| 27 Jan 2025 | ||
Yes, some puzzle piece is still missing I think... I also remember once just stopping NetworkManager and trying out systemctl daemon-reexec, which worked. | 07:44:33 | |
| 29 Jan 2025 | ||
| I cannot find clear information online on the status of the boot counting integration. Does someone know the current status? | 14:58:26 | |
| AFAIK, there was a PR that got merged but it introduced some issues, because our boot entries have user-controlled identifiers in them (specialisation names) and that broke some assumptions of the code that parsed them. So the PR was reverted. I think there are ideas on how to introduce it again without hitting that issue, but no one actually implemented it yet | 16:16:08 | |
In reply to @rvdp:infosec.exchange Why not just implement it and throw an exception if users are using specialisations? | 16:54:22 | |
| Assertion | 16:54:37 | |
| thank you! | 17:47:05 | |
In reply to @matthewcroughan:defenestrate.it That doesn't sound like a very reasonable compromise to me. There's also no way to know how many users actually rely on specialisations | 19:52:17 | |
| Sure, but at the same time boot counting is super nice. | 20:16:25 | |
| And it could always be fixed later by a minimal PR | 20:16:36 | |
| 30 Jan 2025 | ||
| When was it that systemd initrd would become the default in NixOS?? | 11:25:41 | |
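
Following up on the 26 Jan thread about the fstab generator hanging on an NFS mount: one way to check whether a mount path currently blocks is to stat it from a child process and give up after a timeout. This is only a minimal Python sketch, not what systemd or switch-to-configuration actually does; `path_responds` and its `timeout_s` parameter are made-up names for illustration.

```python
import subprocess
import sys


def path_responds(path, timeout_s=5.0):
    """Return True if stat()ing `path` in a child process finishes in time.

    Hypothetical helper: the stat happens in a separate process so that a
    hung network mount blocks the child instead of the caller.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # The exit code is ignored on purpose: a missing path still
        # "responds", it just fails quickly instead of hanging.
        child.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Do not wait for the child after kill(): a process stuck on an
        # unreachable mount may take a long time to actually exit.
        child.kill()
        return False
```

For example, `path_responds("/mnt/nfs", timeout_s=3.0)` (with `/mnt/nfs` as a hypothetical mount point) returning `False` while NetworkManager is stopped would match the behaviour described in the thread.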