| 19 Feb 2025 |
emily | so actually --ari-wait-to-renew-duration is just a weird footgun…? | 16:51:28 |
emily | so unless I am misreading this Go, to have ARI work properly and not try to renew every single day we would actually need to give it an unlimited timeout. but that doesn't work with how non-ARI certificate renewal paths on a timer work. so lego has just bifurcated lego renew into two entirely different modalities of operation based on server capabilities and then enabled that by default. which seems terrible | 16:52:43 |
Sandro 🐧 | https://datatracker.ietf.org/doc/draft-ietf-acme-ari/#:~:text=Retry%2DAfter%3A%2021600%0A%0A%20%20%20%7B%0A%20%20%20%20%20%22-,suggestedWindow,-%22%3A%20%7B%0A%20%20%20%20%20%20%20%22start%22%3A%20%222021%2D01 | 16:52:52 |
Sandro 🐧 | it allows the acme provider to give you a window where you should renew your cert because they want to go down to shorter lived certs | 16:53:16 |
emily | I promise I do not need ARI explaining to me. I was following the ARI work years ago | 16:53:18 |
emily | this isn't helpful, the discussion is about the interface lego is providing for it | 16:53:28 |
Sandro 🐧 | I just wanted to make sure we are all on the same page, didn't know that you already know everything | 16:54:12 |
emily | this is the sticking point, it doesn't seem like a low --ari-wait-to-renew-duration will actually give you a normal "poll for renewal" interface | 16:54:51 |
emily | it will just look at the recommended renewal and go "nope that's too long" and do it early | 16:55:01 |
emily | (again, based on my quick reading of the Go that could be wrong) | 16:55:07 |
emily | so it seems like we need to let it block indefinitely, which is a total inversion of how our current module works, and we can't even conditionalize on whether certs are using ARI in the Nix code because that's downstream of server-side config | 16:55:37 |
emily | maybe we can just let it wait indefinitely and the timer will only fire once? | 16:55:52 |
emily | this is why ACME really wants a long-lived daemon :( | 16:56:07 |
hexa | sorry, I don't follow your conclusion here | 16:56:37 |
emily | ok, let's say ARI is enabled, the ACME server says "renew in 2 months", but you pass --ari-wait-to-renew-duration 5m | 16:57:22 |
hexa | https://github.com/go-acme/lego/blob/v4.22.2/certificate/renewal.go | 16:57:49 |
emily | oh hmm | 16:57:52 |
hexa | beyond my willingness to sleep | 16:57:52 |
hexa | so returns nil | 16:58:00 |
emily | ok I think I misread ShouldRenewAt | 16:58:01 |
emily | right | 16:58:08 |
emily | ok, then I think we just set it to a time that will definitely not overlap with the next timer. 23h is too long because of our time skewing | 16:58:24 |
emily | I think theoretically you can end up with it running at 23:59 one day and 00:01 the next. not sure how it works exactly | 16:58:57 |
emily | but I guess systemd timers will never start twice at once? | 16:59:01 |
hexa | oh yeah, we do AccuracySec=14400s | 16:59:03 |
hexa | good call | 16:59:04 |
emily | I'm not sure what Type= we have on the ACME services | 16:59:08 |
hexa | oneshot | 16:59:18 |
emily | oneshots are only considered started after they complete, right? | 16:59:29 |
emily | so the timer can probably start two of them at once? which would be bad. we should probably not be using oneshot | 16:59:40 |