NixOS ACME / LetsEncrypt - Public Room Timeline - Matrix Static

	NixOS ACME / LetsEncrypt	111 Members
	Another day, another cert renewal	47 Servers

You have reached the beginning of time (for this room).

Sender	Message	Time
6 May 2025
m1cr0man	In the mean time netpleb - can you provide the following info from within the container: Logs of acme-$cert.service redacted as necessary Output of `systemctl list-dependencies acme-$cert.service` Output of `systemctl list-dependencies bind.service`	20:40:38
m1cr0man	Ah, I see you already found the relevant ticket on GitHub. Did you try this fix?	20:42:27
netpleb	Thanks for your reply and for helping to figure this out. I did try that fix that you mentioned, as well as this one but neither have done the trick. I will the output of those commands for you now.	21:03:08
netpleb	here's the redacted output (first time using local instance of ollama to do redacting!): [root@hostname:~]# systemctl list-dependencies acme-example.com.service acme-example.com.service ○ ├─acme-selfsigned-example.com.service ● ├─acme-setup.service ● ├─bind.service ○ ├─dns-rfc2136-conf.service ○ ├─nginx-config-reload.service ● ├─system.slice ● ├─acme-account-8bbd8b2b5078a14c2103.target ○ │ └─acme-jitsi.example.com.service ● ├─network-online.target ● ├─nss-lookup.target ● │ └─nscd.service ● └─sysinit.target ● ├─dev-hugepages.mount ● ├─dev-mqueue.mount ● ├─firewall.service ○ ├─kmod-static-nodes.service ○ ├─suid-sgid-wrappers.service ○ ├─sys-fs-fuse-connections.mount ○ ├─sys-kernel-debug.mount ○ ├─sys-kernel-tracing.mount ● ├─systemd-ask-password-console.path ○ ├─systemd-boot-random-seed.service ○ ├─systemd-hibernate-clear.service ○ ├─systemd-journal-catalog-update.service ● ├─systemd-journal-flush.service ● ├─systemd-journald.service ○ ├─systemd-machine-id-commit.service ○ ├─systemd-modules-load.service ○ ├─systemd-pstore.service ○ ├─systemd-random-seed.service ● ├─systemd-resolved.service ● ├─systemd-sysctl.service ● ├─systemd-tmpfiles-setup-dev-early.service ● ├─systemd-tmpfiles-setup-dev.service ● ├─systemd-tmpfiles-setup.service ○ ├─systemd-tpm2-setup-early.service ○ ├─systemd-tpm2-setup.service ○ ├─systemd-udevd.service ○ ├─systemd-update-done.service ● ├─systemd-update-utmp.service ● ├─cryptsetup.target ● ├─local-fs.target ○ │ └─systemd-remount-fs.service ● └─swap.target [root@hostname:~]# systemctl list-dependencies bind.service bind.service ○ ├─dns-fix-zone-perms.service ○ ├─dns-rfc2136-conf.service ● ├─system.slice ● └─sysinit.target ● ├─dev-hugepages.mount ● ├─dev-mqueue.mount ● ├─firewall.service ○ ├─kmod-static-nodes.service ○ ├─suid-sgid-wrappers.service ○ ├─sys-fs-fuse-connections.mount ○ ├─sys-kernel-debug.mount ○ ├─sys-kernel-tracing.mount ● ├─systemd-ask-password-console.path ○ ├─systemd-boot-random-seed.service ○ ├─systemd-hibernate-clear.service ○ ├─systemd-journal-catalog-update.service ● ├─systemd-journal-flush.service ● ├─systemd-journald.service ○ ├─systemd-machine-id-commit.service ○ ├─systemd-modules-load.service ○ ├─systemd-pstore.service ○ ├─systemd-random-seed.service ● ├─systemd-resolved.service ● ├─systemd-sysctl.service ● ├─systemd-tmpfiles-setup-dev-early.service ● ├─systemd-tmpfiles-setup-dev.service ● ├─systemd-tmpfiles-setup.service ○ ├─systemd-tpm2-setup-early.service ○ ├─systemd-tpm2-setup.service ○ ├─systemd-udevd.service ○ ├─systemd-update-done.service ● ├─systemd-update-utmp.service ● ├─cryptsetup.target ● ├─local-fs.target ○ │ └─systemd-remount-fs.service ● └─swap.target [root@hostname:~]#	21:09:35
m1cr0man	Interesting. FWIW, I personally used to use Bind + RFC2136 for renewals. It was not in a container though. The service ordering looks correct, with bind listed as a dependency of acme-example.com.service.	21:27:43
m1cr0man	What error is lego itself throwing during renewal?	21:28:22
netpleb	Thanks. Yes, I think it probably works fine when not in a container, but alas my use case is within a container :-/. I will get the exact error for you in a moment, but in essence it is something like this: `Could not create client: get directory at ‘https://acme-v02.api.letsencrypt.org/directory’: Get “https://acme-v02.api.letsencrypt.org/directory”: dial tcp: lookup acme-v02.api.letsencrypt.org 1: Temporary failure in name resolution` it tries to do that 6 times I think before timing out. Interestingly, during this process though I cannot ping anything (much less lookup host names).	21:34:17
netpleb	But this is why it seems to be something weird about how the host deals with the container...I think what happens is that when the acme stuff is present in the container, it causes the boot process for the container to be drawn way out longer than it should be (hence why we are discussing here), but because the boot process is drawn out the container has not reached whatever stage it is supposed to get to for the host to install the routes.	21:36:05
m1cr0man	I'll see if I can put together a test suite for this when I next get a moment to investigate it. Not sure what the problem is right now, sorry I can't be more help	22:08:20
netpleb	Thanks for looking into it. It was driving me mad, so I stopped yesterday after putting the non-clean solution of a 20s start timeout on the acme services.	22:11:19
m1cr0man	It might be worth poking around with resolvectl/systemd-resolved and see if something fishy is happening. The nspawn containers do funky things with the hosts file and nameserver setup, could be conflicting with bind	22:14:41
netpleb	thanks, I've been poking at that a bit. Will let you know if anything comes of it.	22:36:06
9 May 2025
netpleb	I have good news!! The issue is finally resolved. It turned out to be a much different problem than originally expected: ipv6 link local addressing was the cuplrit. Even though I had `networking.enableIPv6 = false` on both the host and the machine, `systemd-network-wait-online` was not reaching its target because `systemd-network` was trying to assign link local ipv6 addresses. Setting `systemd.network.networks."eth0".networkConfig.LinkLocalAddressing = "no";` in my container config seemed to do the trick.	21:47:12
netpleb	* I have good news!! The issue is finally resolved. It turned out to be a much different problem than originally expected: ipv6 link local addressing was the cuplrit. Even though I had `networking.enableIPv6 = false` on both the host and the container, `systemd-network-wait-online` was not reaching its target because `systemd-network` was trying to assign link local ipv6 addresses. Setting `systemd.network.networks."eth0".networkConfig.LinkLocalAddressing = "no";` in my container config seemed to do the trick.	21:54:28
10 May 2025
Arian	you can also configure systemd-network-wait-online to wait for either ipv4 or ipv6	07:19:36

Show newer messages

Back to Room ListRoom Version: 6