!DBFhtjpqmJNENpLDOv:nixos.org

NixOS systemd

582 Members
NixOS ❤️ systemd
161 Servers

Sender: Message [Time]
20 Jan 2025
@elvishjerricco:matrix.org: which is exactly what I wanted to avoid :P [04:13:50]
@elvishjerricco:matrix.org *

Ok. I have things that work now. I'm going to summarize the problems now, if only to make sure I've got them all in line in my head :P

  • systemd's switch-root is frustrating
    • It will only serialize state and hand it over if the new PID1 is the builtin path (/run/current-system/systemd/lib/systemd/systemd, and the empty string counts as equivalent).
    • It will check for the existence of the new PID1 binary before it does its switch_root function.
    • This switch_root function is the one that bind-mounts /run, meaning if the new init is in /run then you just won't be allowed to switch-root, because the previous step will have failed before getting here.
    • Also, when switch_root does the bind mounts, it only does them if there's not already a mount there.
    • We (inadvertently, I think) resolved this by bind mounting /run ourselves for initrd-nixos-activation.service.
  • Now, the reason credentials weren't being "imported" in stage 2 is that systemd expects them to have already been imported in stage 1. Stage 1 was importing, but we weren't bind mounting /run recursively.
    • Because switch_root skips already-mounted mounts, it also skipped this.
    • Result: the imported credentials are killed.
  • All this means that we have to set up /sysroot/run before we switch-root, even though we want switch-root to be the thing setting up /sysroot/run for us.

UGH

04:52:00
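
The ordering problem summarized above can be sketched as a toy model. This is not systemd's code: `Initrd`, `Sysroot`, `bind_run`, and the `/run/credentials` submount name are hypothetical stand-ins, and the assumption that switch_root's own bind of /run would be recursive is inferred from the message, not confirmed.

```python
from dataclasses import dataclass, field

# Path quoted in the message above as the builtin init location.
BUILTIN_INIT = "/run/current-system/systemd/lib/systemd/systemd"


@dataclass
class Initrd:
    """Hypothetical stand-in for stage 1: files under /run plus submounts of /run."""
    run_files: set = field(default_factory=lambda: {BUILTIN_INIT})
    # Placeholder name for wherever imported credentials are mounted.
    run_submounts: set = field(default_factory=lambda: {"/run/credentials"})


@dataclass
class Sysroot:
    """Hypothetical stand-in for /sysroot before switch-root."""
    files: set = field(default_factory=set)
    mounts: set = field(default_factory=set)


def bind_run(initrd, sysroot, recursive):
    """Model bind mounting /run onto /sysroot/run."""
    sysroot.mounts.add("/run")
    sysroot.files |= initrd.run_files
    if recursive:
        # Only a recursive bind carries submounts such as the credentials mount.
        sysroot.mounts |= initrd.run_submounts


def switch_root(initrd, sysroot, init=""):
    """Model the three behaviours described in the summary above."""
    target = init or BUILTIN_INIT
    # Step 1: the existence check runs before any bind mounts are made.
    if target not in sysroot.files:
        return {"ok": False, "serialized": False, "credentials": False}
    # Step 2: /run is bind mounted only if nothing is mounted there already
    # (assumed recursive here; that is an assumption of this sketch).
    if "/run" not in sysroot.mounts:
        bind_run(initrd, sysroot, recursive=True)
    # Step 3: state is handed over only for the builtin path or the empty string.
    serialized = init in ("", BUILTIN_INIT)
    return {
        "ok": True,
        "serialized": serialized,
        "credentials": "/run/credentials" in sysroot.mounts,
    }
```

In this model, calling `switch_root(Initrd(), Sysroot())` fails at step 1 because the init lives in /run and /run is not there yet. Calling `bind_run(..., recursive=False)` first gets past step 1 but leaves the credentials submount behind, since step 2 then skips /run. Only a recursive bind ahead of time gives both.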
@elvishjerricco:matrix.org: Additionally, there's a related problem for trying to eliminate specialFileSystems. Activation expects that some things in /sys and /proc are mounted too, not just /run, so now we also have to set those up! Now, I think those can be temporary, but it's still something I wish was just handled by systemd. [04:55:43]
@elvishjerricco:matrix.org: Oh I have a bad idea. A really bad idea. We could solve all of this by switch-rooting into a system that's almost completely unconfigured, except for one unit that runs activation in the real, current root, and then does a soft-reboot into the real system [04:58:31]
@phaer:matrix.org (phaer): Wait, why would you need a soft-reboot here? Shouldn't you end up with a working system after activation - similar to if you just run activation in an already booted system, i.e. during nixos-rebuild? What am I missing? [16:57:46]
@phaer:matrix.org (phaer): Might actually give this a try later today/tomorrow to find out :D [16:58:00]
@elvishjerricco:matrix.org: phaer: You'd do the soft-reboot just to avoid the complex process of switch-to-configuration [18:43:21]
@elvishjerricco:matrix.org: it's a beast that really shouldn't be part of bootup [18:43:28]
@elvishjerricco:matrix.org: funny thought I just had about that idea. The intermediate phase would be like a stage 1.5, except it's more like stage 2 because it actually exists in the stage 2 rootfs, so maybe more like stage 2.-5? :P [18:45:08]
@elvishjerricco:matrix.org: Imagine trying to explain to upstream systemd "yea, this error happens on nixos during stage two and a negative half" [18:45:54]
@iridium:faui2k11.de (iridium): @elvishjerricco:matrix.org: My notebook just crashed during upgrade - again, systemd restart failed. I now have a system in defunct state, but do have a shell. What useful things should I look at to collect more data for debugging? 🙂 [18:59:35]
@elvishjerricco:matrix.org: oh gosh [18:59:58]
@elvishjerricco:matrix.org: I need to remind myself of your exact issue again [19:00:06]
@iridium:faui2k11.de (iridium): https://discourse.nixos.org/t/system-inoperable-after-automatic-upgrades/50197/2 [19:02:16]
@elvishjerricco:matrix.org: iridium: Are you able to open journalctl -e? [19:03:05]
@iridium:faui2k11.de (iridium): Yes: https://pastebin.com/raw/7Yz2drXP [19:05:09]
@elvishjerricco:matrix.org: ok good. And you would expect that downgrading and redoing the upgrade would trigger it again, right? [19:05:25]
@iridium:faui2k11.de (iridium): Not sure if relevant: https://pastebin.com/raw/GSkBauCk [19:05:58]
@iridium:faui2k11.de (iridium): I have to admit I never tried, but would guess so, yes [19:06:14]
@elvishjerricco:matrix.org: oh if you know how to use gdb productively, that could be useful :P I am at level zero with that stuff [19:06:51]
@iridium:faui2k11.de (iridium): that specific stacktrace couldn't be less interesting tbh [19:07:16]
@elvishjerricco:matrix.org: if it does, it'd be good to try booting the old generation, but adding systemd.log_level=debug to the kernel params from your boot menu, and then doing the upgrade [19:07:39]
@elvishjerricco:matrix.org: those journal logs could be much more useful [19:07:46]
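
Before redoing the upgrade, it may be worth confirming that the parameter typed at the boot menu actually took effect. A minimal sketch, assuming the running kernel's command line is read from /proc/cmdline; the helper names `kernel_params` and `debug_logging_enabled` and the sample command line are made up for illustration.

```python
import shlex


def kernel_params(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def debug_logging_enabled(cmdline):
    """True if systemd.log_level=debug is present on the command line."""
    return kernel_params(cmdline).get("systemd.log_level") == "debug"


# Made-up sample; on a live system, pass open("/proc/cmdline").read() instead.
sample = "root=/dev/sda1 quiet systemd.log_level=debug"
enabled = debug_logging_enabled(sample)
```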
@iridium:faui2k11.de (iridium): Anything I should do with the machine right now, in case I don't manage to get it into exactly the same state again afterwards? [19:08:11]
@elvishjerricco:matrix.org: not that I can think of unfortunately. [19:08:33]
@elvishjerricco:matrix.org: iridium: last time you said you could reproduce it if you had an NFS and/or an SSHFS mounted. Was that the case this time? [19:11:00]
@iridium:faui2k11.de (iridium): NFS yes, sshfs no [19:11:09]
@iridium:faui2k11.de (iridium): NFS over wireguard, to be specific [19:11:16]
@elvishjerricco:matrix.org: where is the NFS mounted? [19:11:58]
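
One way to answer that from the surviving shell is to filter the kernel's mount table for NFS entries. A rough sketch, assuming the usual whitespace-separated /proc/self/mounts layout (source, mount point, filesystem type, ...); the function name `nfs_mounts` and the sample lines are hypothetical.

```python
def nfs_mounts(mounts_text):
    """Return (source, mount_point, fstype) for each NFS entry in mounts_text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 3 is the filesystem type; NFS shows up as "nfs" or "nfs4".
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            found.append((fields[0], fields[1], fields[2]))
    return found


# Made-up sample; on a live system, pass open("/proc/self/mounts").read() instead.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "10.0.0.1:/export /mnt/data nfs4 rw,relatime 0 0\n"
)
entries = nfs_mounts(sample)
```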
