!DBFhtjpqmJNENpLDOv:nixos.org

NixOS systemd

600 Members
NixOS ❤️ systemd
168 Servers


Sender | Message | Time
20 Mar 2025
@arianvp:matrix.org (Arian) | Should the systemd team be on nixos.org/community? | 18:34:34
21 Mar 2025
@mrdev023:matrix.org | mrdev023 joined the room. | 13:51:10
22 Mar 2025
@elvishjerricco:matrix.org | I am finding some extremely broken behavior with systemd-repart running during boot when the device is already partitioned but needs modification (e.g. grow a partition). It seems that when it runs and repartitions, it causes the device units to stop and start. This causes fsck and mount units to be stopped, all the way up to initrd-fs.target, causing initrd-find-nixos-closure.service to be stopped. Then initrd-parse-etc.service starts and that starts initrd-fs.target again, but initrd-find-nixos-closure.service is still stopped, so it never happens and the system fails to boot | 09:10:18
@arianvp:matrix.org (Arian) | Wut | 09:12:19
@elvishjerricco:matrix.org | now, all those units being stopped is a "job canceled" scenario, so they don't need to have already been started for this chain reaction to occur | 09:12:49
@elvishjerricco:matrix.org | (some of these observations come from me being in the middle of messing with things and adding various orderings to debug things, so I might be getting the details wrong, but the core idea is, I think, a problem) | 09:13:41
@elvishjerricco:matrix.org | the specific use case I was trying to debug was when / is a tmpfs and /nix is on a partition that needs to grow | 09:14:39
@elvishjerricco:matrix.org | but I think it would be a problem in a lot more generic scenarios than that | 09:14:56
@arianvp:matrix.org (Arian) | Can you make a non-nix-specific reproducer? | 09:15:11
@elvishjerricco:matrix.org | uh | 09:15:28
@elvishjerricco:matrix.org | I don't know enough about other distros to make another distro do this :P | 09:15:43
@arianvp:matrix.org (Arian) | I don't understand how you have mounts before repart runs | 09:16:23
@arianvp:matrix.org (Arian) | You can't resize a mounted partition, no? | 09:16:31
@elvishjerricco:matrix.org | well, a) yes you can, and b) I don't have mounts anyway. The mount jobs are cancelled before they're started | 09:16:51
@arianvp:matrix.org (Arian) | Repart should be running before /sysroot is mounted | 09:16:57
@elvishjerricco:matrix.org | the device appears, satisfying some dependencies, and then disappears, which causes job cancellations because of BindsTo=dev-foo.device | 09:17:42
@elvishjerricco:matrix.org | and then reappears, but then it's too late and damage is done | 09:18:07
@arianvp:matrix.org (Arian) | Could it be a kernel bug? | 09:18:20
@arianvp:matrix.org (Arian) | Why is the kernel sending uevents on resize | 09:18:30
@elvishjerricco:matrix.org | is it not normal for a device's partitions to be removed and added from udev's perspective when the device is partscanned? | 09:19:01
@arianvp:matrix.org (Arian) | Well you just said it's possible to resize a partition that is mounted. In that case it doesn't sound like sane behaviour that the underlying device would disappear and appear, no? | 09:19:50
@arianvp:matrix.org (Arian) | That makes 0 sense to me | 09:20:04
@arianvp:matrix.org (Arian) | Ah yeh we use online resize for cloud images etc. I remember now | 09:21:28
@elvishjerricco:matrix.org | after repart finishes, I see "Changed plugged -> dead" for each of the partitions in the systemd debug logging, and then immediately after I see "Changed dead -> plugged" for them | 09:23:12
@elvishjerricco:matrix.org

vda3: Processing udev action (SEQNUM=1400, ACTION=remove)

then I get a

vda: Processing udev action (SEQNUM=1401, ACTION=change)

And then I get

vda3: Processing udev action (SEQNUM=1404, ACTION=add)

09:24:15
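
For anyone trying to spot the same pattern in a longer debug log, here is a minimal Python sketch that picks out devices which received a `remove` event followed later by an `add`. It assumes only the line format quoted in the message above; the function names `parse_events` and `find_replugged` are made up for illustration and are not part of systemd or udev.

```python
import re

# Matches lines shaped like the ones quoted above, e.g.:
#   vda3: Processing udev action (SEQNUM=1400, ACTION=remove)
# Assumption: every event of interest is logged in exactly this format.
EVENT_RE = re.compile(
    r"^(?P<dev>\S+): Processing udev action "
    r"\(SEQNUM=(?P<seq>\d+), ACTION=(?P<action>\w+)\)$"
)


def parse_events(log_text):
    """Return (seqnum, device, action) tuples sorted by sequence number."""
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            events.append((int(m["seq"]), m["dev"], m["action"]))
    return sorted(events)


def find_replugged(events):
    """Return (device, remove_seq, add_seq) for each remove-then-add pair."""
    removed_at = {}
    replugged = []
    for seq, dev, action in events:
        if action == "remove":
            removed_at[dev] = seq
        elif action == "add" and dev in removed_at:
            replugged.append((dev, removed_at.pop(dev), seq))
    return replugged


LOG = """\
vda3: Processing udev action (SEQNUM=1400, ACTION=remove)
vda: Processing udev action (SEQNUM=1401, ACTION=change)
vda3: Processing udev action (SEQNUM=1404, ACTION=add)
"""

events = parse_events(LOG)
print(find_replugged(events))  # [('vda3', 1400, 1404)]
```

The sketch only reports the remove/add pairing; it says nothing about why the kernel or udev emitted those events.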
@elvishjerricco:matrix.org | I feel like this can't possibly be right, because then imagine what happens with a normal stage 2 repart service. It would repartition, and immediately cancel all mount jobs depending on those partitions, stopping local-fs.target | 09:26:43
@arianvp:matrix.org (Arian) | Is /dev/vda3 mounted at this point? | 09:27:35


Room Version: 6