5 Jun 2023 |
oddlama | * Let me restart with the same configuration to see whether it happens every time | 18:27:58 |
@elvishjerricco:matrix.org | maybe systemctl list-dependencies --reverse/--after/--before systemd-networkd.service ? | 18:28:02 |
oddlama | Seems reproducible | 18:30:10 |
oddlama | A notable thing in the logs is that even though | 18:32:14 |
oddlama | * A notable thing in the logs is that even though
Jun 05 20:29:40 sentinel systemd[1]: systemd-networkd.service: Deactivated successfully.
is printed when switching
| 18:32:27 |
oddlama | I get another output from systemd-networkd in stage2 | 18:32:46 |
oddlama | Jun 05 20:29:38 sentinel systemd-networkd[180]: wan: DHCPv6 lease lost | 18:32:49 |
oddlama | No wair | 18:33:12 |
oddlama | * No wait | 18:33:13 |
oddlama | the logs are not in order?! | 18:33:17 |
oddlama | Okay now I'm totally lost. Why would the logs be unordered? | 18:34:05 |
@elvishjerricco:matrix.org | wait wut | 18:36:59 |
@elvishjerricco:matrix.org | maybe systemd-timesyncd is changing the system clock once stage 2 gets going? | 18:37:27 |
oddlama | It's just some select lines | 18:37:47 |
oddlama | Jun 05 20:29:40 sentinel systemd[1]: Starting Journal Service...
Jun 05 20:29:40 sentinel systemd[1]: Starting Load Kernel Modules...
Jun 05 20:29:40 sentinel kernel: fuse: init (API version 7.37)
Jun 05 20:29:40 sentinel systemd[1]: systemd-networkd-wait-online.service: Bound to unit systemd-networkd.service, but unit isn't active.
Jun 05 20:29:40 sentinel systemd[1]: Dependency failed for Wait for Network to be Configured.
Jun 05 20:29:40 sentinel systemd[1]: systemd-networkd-wait-online.service: Job systemd-networkd-wait-online.service/start failed with result 'dependency'.
Jun 05 20:29:40 sentinel systemd[1]: Starting Remount Root and Kernel File Systems...
Jun 05 20:29:40 sentinel systemd[1]: Starting Coldplug All udev Devices...
Jun 05 20:29:40 sentinel systemd[1]: Mounted Huge Pages File System.
Jun 05 20:29:40 sentinel systemd[1]: Mounted POSIX Message Queue File System.
Jun 05 20:29:40 sentinel systemd-journald[730]: Collecting audit messages is disabled.
Jun 05 20:29:40 sentinel systemd[1]: Mounted Kernel Debug File System.
Jun 05 20:29:40 sentinel systemd[1]: Finished Create List of Static Device Nodes.
Jun 05 20:29:40 sentinel systemd[1]: modprobe@configfs.service: Deactivated successfully.
Jun 05 20:29:40 sentinel systemd[1]: Finished Load Kernel Module configfs.
Jun 05 20:29:40 sentinel systemd[1]: modprobe@efi_pstore.service: Deactivated successfully.
Jun 05 20:29:40 sentinel systemd[1]: Finished Load Kernel Module efi_pstore.
Jun 05 20:29:40 sentinel systemd[1]: modprobe@fuse.service: Deactivated successfully.
Jun 05 20:29:40 sentinel systemd[1]: Finished Load Kernel Module fuse.
Jun 05 20:29:40 sentinel systemd-journald[730]: Journal started
Jun 05 20:29:40 sentinel systemd-journald[730]: Runtime Journal (/run/log/journal/25e272da45124b238d42df4ec5fca13c) is 8.0M, max 94.1M, 86.1M free.
Jun 05 20:29:38 sentinel systemd-networkd[180]: wan: DHCPv6 lease lost
Jun 05 20:29:38 sentinel sshd[197]: syslogin_perform_logout: logout() returned an error
Jun 05 20:29:40 sentinel systemd[1]: Started Journal Service.
Jun 05 20:29:39 sentinel systemd[1]: Queued start job for default target Multi-User System.
Jun 05 20:29:39 sentinel systemd[1]: systemd-journald.service: Deactivated successfully.
Jun 05 20:29:40 sentinel systemd[1]: Finished Bind mount or link '/persist/etc/machine-id' to '/etc/machine-id'.
Jun 05 20:29:40 sentinel systemd[1]: Finished Bind mount or link '/persist/etc/ssh/ssh_host_ed25519_key' to '/etc/ssh/ssh_host_ed25519_key'.
Jun 05 20:29:40 sentinel systemd[1]: Finished Bind mount or link '/persist/etc/ssh/ssh_host_ed25519_key.pub' to '/etc/ssh/ssh_host_ed25519_key.pub'.
Jun 05 20:29:40 sentinel systemd[1]: Mounting FUSE Control File System...
Jun 05 20:29:40 sentinel systemd-modules-load[737]: Inserted module 'bridge'
Jun 05 20:29:40 sentinel kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
Jun 05 20:29:40 sentinel systemd[1]: Mounting Kernel Configuration File System...
Jun 05 20:29:40 sentinel systemd[1]: Starting Create Static Device Nodes in /dev...
Jun 05 20:29:40 sentinel systemd[1]: Mounted FUSE Control File System.
Jun 05 20:29:40 sentinel systemd[1]: Mounted Kernel Configuration File System.
Jun 05 20:29:40 sentinel kernel: ACPI: bus type drm_connector registered
Jun 05 20:29:40 sentinel systemd[1]: modprobe@drm.service: Deactivated successfully.
Jun 05 20:29:40 sentinel systemd-modules-load[737]: Inserted module 'macvlan'
Jun 05 20:29:40 sentinel systemd[1]: Finished Load Kernel Module drm.
Jun 05 20:29:40 sentinel systemd-modules-load[737]: Inserted module 'tap'
Jun 05 20:29:40 sentinel kernel: tun: Universal TUN/TAP device driver, 1.6
| 18:38:10 |
oddlama | most are :40 | 18:38:20 |
oddlama | some are :38 or :39 | 18:38:24 |
oddlama | Maybe this is when the journal from stage1 is loaded into to actual journal in stage2? | 18:39:01 |
@elvishjerricco:matrix.org | tbh I don't really understand how the journal is handed over between stages | 18:39:53 |
oddlama | Probably not relevant for this case anyway. Seems to me like networkd is stopped in stage1 and never exists in stage2, the log just seems to be carried over weirdly. | 18:41:12 |
oddlama | Hmm your hunch was right, setting flushBeforeStage2 = true; works around the issue | 18:52:44 |
@elvishjerricco:matrix.org | oddlama: interesting. Can you open an issue and include your configuration? | 18:53:27 |
oddlama | Just noticed that the systemd-networkd.service: Deactivated successfully also happens in stage2 not stage1, the log order is just also wrong there :D | 19:09:29 |
oddlama | In reply to @elvishjerricco:matrix.org oddlama: interesting. Can you open an issue and include your configuration? Not sure how I can include my configuration easily. It's publicly available on github, but too complicated to copy out a single snipped, especially because I don't really know what causes this. I can make a commit and refer to that if it helps | 19:12:40 |
@elvishjerricco:matrix.org | oddlama: Well maybe write a little test with the nixos test framework to reproduce the problem if that proves easy enough to do? If not don't worry about it and just describe the problem as best you can | 19:14:28 |
oddlama | Alright, I wrote a test for this :D | 20:38:56 |
oddlama | https://github.com/NixOS/nixpkgs/issues/236146 | 20:38:57 |
@elvishjerricco:matrix.org | oddlama: If I remove your check-flush service, and wait_for_unit on network.target instead of multi-user.target , the test passes | 20:57:14 |
@elvishjerricco:matrix.org | oh but I see. The point is that I shouldn't need systemd-networkd to be running yet for the dont-flush case to have the network preconfigured by stage 1 | 20:59:00 |
6 Jun 2023 |
@elvishjerricco:matrix.org | oddlama: Ok, so I think the test you made is accidentally discovering a different bug. If you change check-flush.service to have wantedBy = ["multi-user.target"]; instead of requiredBy , then the test succeeds because it doesn't cause multi-user.target to fail, and systemd-networkd.service starts just fine. So the different bug is that the network configuration is getting taken down between stages 1 and two, causing check-flush.service to fail. But systemd-networkd.service is starting fine | 00:39:02 |