| 22 Mar 2025 |
@elvishjerricco:matrix.org | * now, all those units being stopped is a "job canceled" scenario, so they don't need to have already been started for this chain reaction to occur | 09:12:49 |
@elvishjerricco:matrix.org | (some of these observations come from me being in the middle of messing with things and adding various orderings to debug things, so I might be getting the details wrong, but I think the core idea is a problem) | 09:13:41 |
@elvishjerricco:matrix.org | * the specific use case I was trying to debug was when / is a tmpfs and /nix is on a partition that needs to grow | 09:14:39 |
@elvishjerricco:matrix.org | but I think it would be a problem in a lot more generic scenarios than that | 09:14:56 |
Arian | Can you make a non-nix-specific reproducer? | 09:15:11 |
@elvishjerricco:matrix.org | uh | 09:15:28 |
@elvishjerricco:matrix.org | I don't know enough about other distros to make another distro do this :P | 09:15:43 |
Arian | I don't understand how you have mounts before repart runs | 09:16:23 |
Arian | You can't resize a mounted partition, no? | 09:16:31 |
@elvishjerricco:matrix.org | well, a) yes you can, and b) I don't have mounts anyway. The mount jobs are cancelled before they're started | 09:16:51 |
Arian | Repart should be running before /sysroot is mounted | 09:16:57 |
@elvishjerricco:matrix.org | the device appears, satisfying some dependencies, and then disappears, which causes job cancellations because of BindsTo=dev-foo.device | 09:17:42 |
@elvishjerricco:matrix.org | and then reappears, but then it's too late and damage is done | 09:18:07 |
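The appear, disappear, reappear sequence described here can be sketched as a toy model. This is only an illustration of the behaviour as described in the chat, not systemd's actual job engine: `ToyManager`, `enqueue`, and `device_event` are hypothetical names, and `nix.mount` / `dev-vda3.device` are assumed unit names chosen to match the /nix-on-vda3 use case.

```python
from dataclasses import dataclass, field


@dataclass
class ToyManager:
    """Toy model (hypothetical, not systemd code): a queued start job for a
    unit bound to a device is cancelled when that device goes away."""
    plugged: set = field(default_factory=set)      # devices currently present
    binds_to: dict = field(default_factory=dict)   # unit -> device it is bound to
    queued: set = field(default_factory=set)       # units with a pending start job
    cancelled: list = field(default_factory=list)  # units whose job was cancelled

    def enqueue(self, unit, device):
        # Queue a start job for `unit`, bound to `device`.
        self.binds_to[unit] = device
        self.queued.add(unit)

    def device_event(self, device, action):
        if action == "add":
            self.plugged.add(device)
        elif action == "remove":
            self.plugged.discard(device)
            # sorted() returns a new list, so mutating the set here is safe.
            for unit in sorted(self.queued):
                if self.binds_to[unit] == device:
                    self.queued.discard(unit)
                    self.cancelled.append(unit)


def simulate():
    manager = ToyManager()
    manager.enqueue("nix.mount", "dev-vda3.device")
    # The device appears, disappears during repartitioning, then reappears.
    for action in ("add", "remove", "add"):
        manager.device_event("dev-vda3.device", action)
    return manager
```

In this model the device ends up plugged again, but nothing re-queues the cancelled job, which is the "too late" part of the observation.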
Arian | Could it be a kernel bug? | 09:18:20 |
Arian | Why is the kernel sending uevents on resize? | 09:18:30 |
@elvishjerricco:matrix.org | is it not normal for a device's partitions to be removed and added from udev's perspective when the device is partscanned? | 09:19:01 |
Arian | Well, you just said it's possible to resize a partition that is mounted. In that case it doesn't sound like sane behaviour that the underlying device would disappear and reappear, no? | 09:19:50 |
Arian | That makes 0 sense to me | 09:20:04 |
Arian | Ah yeah, we use online resize for cloud images etc. I remember now | 09:21:28 |
@elvishjerricco:matrix.org | after repart finishes, I see "Changed plugged -> dead" for each of the partitions in the systemd debug logging, and then immediately after I see "Changed dead -> plugged" for them | 09:23:12 |
@elvishjerricco:matrix.org | vda3: Processing udev action (SEQNUM=1400, ACTION=remove)
then I get a
vda: Processing udev action (SEQNUM=1401, ACTION=change)
And then I get
vda3: Processing udev action (SEQNUM=1404, ACTION=add)
| 09:23:57 |
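A remove followed by an add for the same partition, as in the three udev lines quoted above, can be picked out of a debug log mechanically. The following is a minimal sketch that assumes the log lines have exactly the shape quoted in the chat; `parse_events` and `find_flaps` are hypothetical helper names, not part of any systemd or udev tooling.

```python
import re

# Matches lines shaped like the ones quoted in the chat, e.g.
#   vda3: Processing udev action (SEQNUM=1400, ACTION=remove)
LINE = re.compile(
    r"^(?P<dev>\S+): Processing udev action "
    r"\(SEQNUM=(?P<seq>\d+), ACTION=(?P<action>\w+)\)$"
)


def parse_events(text):
    """Return (seqnum, device, action) tuples sorted by sequence number."""
    events = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            events.append((int(match["seq"]), match["dev"], match["action"]))
    return sorted(events)


def find_flaps(events):
    """Return (device, remove_seq, add_seq) for each device that was
    removed and later added again."""
    removed_at = {}
    flaps = []
    for seq, dev, action in events:
        if action == "remove":
            removed_at[dev] = seq
        elif action == "add" and dev in removed_at:
            flaps.append((dev, removed_at.pop(dev), seq))
    return flaps


# The log excerpt from the chat.
LOG = """\
vda3: Processing udev action (SEQNUM=1400, ACTION=remove)
vda: Processing udev action (SEQNUM=1401, ACTION=change)
vda3: Processing udev action (SEQNUM=1404, ACTION=add)
"""

flaps = find_flaps(parse_events(LOG))
```

Only vda3 is reported here, since the whole-disk device vda gets a change event rather than a remove/add pair.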
@elvishjerricco:matrix.org | I feel like this can't possibly be right, because then imagine what happens with a normal stage 2 repart service. It would repartition, and immediately cancel all mount jobs depending on those partitions, stopping local-fs.target | 09:26:43 |
Arian | Is /dev/vda3 mounted at this point? | 09:27:35 |
@elvishjerricco:matrix.org | no | 09:27:41 |
Arian | Maybe you only get these events for unmounted partitions | 09:27:59 |
@elvishjerricco:matrix.org | well that would still be the case in stage 2 | 09:28:25 |
@elvishjerricco:matrix.org | for any non-root partitions | 09:28:28 |
Arian | That would explain why it doesn't mess with the root fs. But it would indeed screw things up for anything else in fstab | 09:28:34 |