NixOS ACME / LetsEncrypt | 105 Members | |
| Another day, another cert renewal | 43 Servers |
| Sender | Message | Time |
|---|---|---|
| 19 Apr 2025 | ||
| 22:49:56 | |
| but that resulted in nginx not starting up | 22:50:03 | |
| because it depends on all the acme-${domain}.service units | 22:50:28 | |
| hm, I thought we were going to set it to just not wait? | 22:52:10 | |
| and we did not set it to anything in nixpkgs | 22:54:08 | |
| but I set it to something on my private infra | 22:54:16 | |
| right | 23:00:12 | |
| I think the current format will only work well when set to not wait at all | 23:00:19 | |
| (which should be fine as the cron job runs often anyway, though we might want to bump it) | 23:00:29 | |
| 21 Apr 2025 | ||
| There was some talk about bumping it when they announced the lower lifetime certs. Wouldn't be the worst thing to do. | 19:18:58 | |
| 22 Apr 2025 | ||
| now 47 days was announced to be the next shorter lifespan | 23:08:50 | |
| * and I don't think it warrants trying more than daily | 23:09:33 | |
| for 6 days that changes of course | 23:09:45 | |
| 28 Apr 2025 | ||
| https://github.com/NixOS/nixpkgs/pull/376334#pullrequestreview-2801003367 this is ready to go. I tested it too. | 21:26:09 | |
| 29 Apr 2025 | ||
| 23:42:45 | ||
| 5 May 2025 | ||
| hi everyone, does anybody have a workaround that fixes this pesky dns resolution issue when
| 17:59:16 | |
| what seems to be happening is that acme is starting so early that the container is unable to route things yet. Maybe the host has not installed routes yet? But somehow acme then blocks until it times out. | 18:00:34 | |
| once it times out, then everything works fine. I have whittled it down to acme, because when I remove any acme things the container boots up just fine and is able to route/ping quite quickly | 18:01:46 | |
| so, by that, I mean that the issue does not seem to be pertaining to (in my case)
that the behavior occurs. | 18:04:57 | |
| what I am most confused about (and why I am posting here) is why the call to `lego --accept-tos --path . -d '*.<redacted>' --email <redacted> --key-type ec256 --dns rfc2136 --dns.propagation-disable-ans --dns.resolvers 127.0.0.1:53 --server https://acme-v02.api.letsencrypt.org/directory renew --no-random-sleep --days 30` seems to block all network traffic, even for other services (like wireguard, bind, etc) until it times out. There must be something I do not understand about how systemd works or calls this, but I would like to learn ;) | 18:08:04 | |
| in essence though, as soon as I comment out the `security.acme.certs...` config above, the container boots up in a couple of seconds and can ping various IPs and even resolve hostnames with the local BIND instance, whereas with the acme config in place it takes a couple of minutes to boot since it has to wait for acme to time out. In the interim, no pinging or hostname lookups work at all. I have tried for days now to figure out how to move the acme renewal process way later, but nothing seems to work. | 18:48:20 | |
| (sorry for so many messages) I have continued to investigate and it seems that the root cause is that the host machine does not provide the network/routes to the container until late in the container's boot (possibly even after it is done booting?). Because of this, acme stalls the boot process. So far the only thing that has sort of worked, but is not very clean, is for me to just put `serviceConfig.TimeoutStartSec = "20s";` on the various `acme-<domain>.service` units. | 20:18:57 | |
| 6 May 2025 | ||
| Sorry - only seeing your messages now. I believe a fix for this does exist in the wild, I vaguely remember running into it a few years ago. Let me do some digging | 20:36:18 | |
| In the meantime, netpleb, can you provide the following info from within the container:
| 20:40:38 | |
| Ah, I see you already found the relevant ticket on GitHub. Did you try this fix? | 20:42:27 | |
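
A minimal sketch of the timeout workaround described in the 5 May messages, assuming a certificate named `example.org` (hypothetical) and the `acme-<domain>.service` unit naming used in this conversation; the unit name may differ between nixpkgs versions. It only bounds how long a stalled renewal can hold up boot and does not address the late container routes identified as the likely root cause.

```nix
{ ... }:
{
  # "example.org" is a hypothetical certificate name; substitute the
  # attribute name used under security.acme.certs in your own config.
  systemd.services."acme-example.org".serviceConfig = {
    # Abort the unit's start job if it has not finished within 20 seconds,
    # so a renewal attempt that cannot reach the network yet fails early.
    TimeoutStartSec = "20s";
  };
}
```

With this in place the unit is marked as failed when the limit is hit, so the renewal has to be picked up again by a later timer run once the container has working routes.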