!MthpOIxqJhTgrMNxDS:nixos.org

NixOS ACME / LetsEncrypt

106 Members
Another day, another cert renewal
45 Servers

Sender Message Time
3 Jul 2025
@ctheune:matrix.flyingcircus.io Christian Theune I'm working on a bit of a refactoring with Arian supervising. I had a question this morning, which I have managed to solve with a draft so far. I'm still working on it, but the current state is here: https://github.com/NixOS/nixpkgs/pull/422076. The second commit is currently a draft and needs further refactoring (it also has a race condition and is likely incomplete), but I have to stop working for today. If you want to take a look, feel free to give feedback. I'm also happy to explain/discuss things face to face if that helps understanding. It's quite a complicated situation and I'm trying to make it cleaner ... 14:48:16
@hexa:lossy.network hexa an acme-renew unit on a timer cannot work when the initial run did not succeed 🤔 but since the failure of the run might be transient, having a combined unit that makes the run vs renew decision makes sense 14:57:46
@alina:catgirl.cloud alina arielle amelie🏳️‍⚧️🐾 joined the room. 16:15:22
@qyliss:fairydust.space Alyssa Ross joined the room. 16:34:34
@arianvp:matrix.org Arian We could give the .timer a Requires=acme-order-XX.service; then it won't start the timer if the initial run did not succeed 16:56:39
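A minimal sketch of that suggestion as a NixOS module fragment. It assumes a hypothetical certificate named example.org and hypothetical acme-order-/acme-renew- unit names defined elsewhere; none of this is taken from the PR. systemd only refuses to start a unit over a failed Requires= dependency when After= is set as well, so the sketch sets both:

```nix
# Hypothetical sketch, not the code from the PR.
# "example.org" and both unit names are placeholders.
{ ... }:
{
  systemd.timers."acme-renew-example.org" = {
    wantedBy = [ "timers.target" ];

    # Requires= pulls in the order unit when the timer is activated.
    requires = [ "acme-order-example.org.service" ];
    # After= is what makes a failed order unit block the timer start.
    after = [ "acme-order-example.org.service" ];

    timerConfig = {
      OnCalendar = "daily";
      # The service this timer triggers (assumed to exist).
      Unit = "acme-renew-example.org.service";
    };
  };
}
```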
@m1cr0man:m1cr0man.com m1cr0man If this ends up working, it will probably solve the long-standing issue of boot in containers nuking the startup if the network isn't online 22:00:54
4 Jul 2025
@ctheune:matrix.flyingcircus.io Christian Theune hexa: the combined unit is the cause of a lot of complexity with drawbacks, so i'm trying to get it working with separate units. what's the concern that the renew unit won't work? if the order unit fails then that is something that needs to be handled in the order unit... 05:05:55
@ctheune:matrix.flyingcircus.io Christian Theune m1cr0man: yeah i noticed that the container path shouldn't be special any longer with this change. 05:06:26
@ctheune:matrix.flyingcircus.io Christian Theune but i don't have a test / environment that uses this, so happy for feedback. 05:06:40
@ctheune:matrix.flyingcircus.io Christian Theune Arian: yeah, i could upgrade the wants/after to requires, so a failed order unit won't trigger a subsequent renewal failure 06:17:16
@ctheune:matrix.flyingcircus.io Christian Theune (or well maybe it does, not sure, but then it would fail due to a dependency and not an internal failure) 06:17:38
@ctheune:matrix.flyingcircus.io Christian Theune ah but then the "inversion of control" pattern makes it ugly again. 06:18:45
@ctheune:matrix.flyingcircus.io Christian Theune Reminder to self: overall i'm trying to get the complexity, the relationships and maybe even the number of units down. 06:31:00
@ctheune:matrix.flyingcircus.io Christian Theune One aspect: we basically need one assurance and one signal to interact with certificate consumer units (nginx, postfix, ...): 06:31:33
@ctheune:matrix.flyingcircus.io Christian Theune
  1. the assurance: the files referenced in your config file are now available and are valid ssl certificates. Go forth and start!
06:31:58
@ctheune:matrix.flyingcircus.io Christian Theune
  1. the signal: the content of the files has changed and you likely want to reload/restart to pick up the new content.
06:32:22
@ctheune:matrix.flyingcircus.io Christian Theune The assurance doesn't really even have to be a valid/current/... acme certificate, but basically something that allows the service to start (e.g. the self-signed certificates or maybe even an outdated acme cert). 06:32:53
@ctheune:matrix.flyingcircus.io Christian Theune Which is already used for things like bootstrapping the infrastructure to answer HTTP-01. 06:33:46
@ctheune:matrix.flyingcircus.io Christian Theune Now, we do need some unit that stays active (we used the -finished targets for this previously) so that s-t-c (switch-to-configuration) can trigger config updates. This was very indirect previously - moving this to a unit that is oneshot/remainafterexit (the "order" unit in my patch) makes config changes trigger more precisely. 06:35:11
@ctheune:matrix.flyingcircus.io Christian Theune However, active units can't be triggered by timers, hence a "renew" unit that can be triggered by a timer. Initially I started out with the renew unit just being a "systemctl restart acme-${cert}", having all the tasks in a single bash script. I was somewhat wary of duplicating too much (bash) code (i did duplicate the nix code for the units) so I chose an inversion of control pattern where the renew unit then triggers the order unit again, to make sure that permission settings, relocating the updated certificates etc. happen in only a single place, as does triggering the reloads for consumers. 06:37:21
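The order/renew split described above might look roughly like the following NixOS module fragment. This is a sketch under assumptions rather than the code from the PR: the certificate name, the acme-order-/acme-renew- unit names, the nginx consumer and the placeholder script bodies are all hypothetical.

```nix
# Hypothetical sketch, not the code from the PR.
{ pkgs, ... }:
let
  cert = "example.org"; # placeholder certificate name
in
{
  systemd.services = {
    # Stays active after it has run, so that a changed unit definition
    # can be picked up by switch-to-configuration.
    "acme-order-${cert}" = {
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        Type = "oneshot";
        RemainAfterExit = true;
      };
      script = ''
        # Placeholder: obtain the certificate, fix permissions, move files.
        echo "ordering certificate for ${cert}"
        # The "signal": reload the consumer only if it is already running.
        ${pkgs.systemd}/bin/systemctl --no-block try-reload-or-restart nginx.service
      '';
    };

    # Inactive between runs, so a timer can trigger it; it hands the
    # actual work back to the order unit (the inversion of control).
    "acme-renew-${cert}" = {
      serviceConfig.Type = "oneshot";
      script = ''
        ${pkgs.systemd}/bin/systemctl restart acme-order-${cert}.service
      '';
    };
  };

  # A timer triggers the service of the same name by default.
  systemd.timers."acme-renew-${cert}" = {
    wantedBy = [ "timers.target" ];
    timerConfig.OnCalendar = "daily";
  };
}
```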
@ctheune:matrix.flyingcircus.io Christian Theune (I'm using the chat to talk out my thought processes. Basically just 🎈🦆ing here ...) 06:39:22
@ctheune:matrix.flyingcircus.io Christian Theune *
  1. the assurance: the files referenced in your config file are now available and are (syntactically) valid ssl certificates. Go forth and start!
06:39:57
@ctheune:matrix.flyingcircus.io Christian Theune That inversion muddies the dependencies again. I could split it up into more units, moving the post-processing code into a separate unit, or just use a shared (execstartpost) script (or partial). 06:42:16
@ctheune:matrix.flyingcircus.io Christian Theune Hmm. Consolidating client reload signals for multiple certificates renewing at the same time isn't much of an issue I guess, as we distribute the renewal timers over time anyway. 06:43:18
@ctheune:matrix.flyingcircus.io Christian Theune So. I guess two units would suffice: 1st unit (acme-${cert}) is what clients depend on via wants/after; it guarantees a syntactically valid certificate is there and updates the certificate parameters when the config changes. Interestingly, the last part isn't really needed for the assurance itself. 2nd unit to issue ACME renewals. 06:46:40
@ctheune:matrix.flyingcircus.io Christian Theune I wonder whether the "update the parameters" part (which requires an active unit to trigger selectively) could/should move elsewhere. It can't be merged with the 2nd unit because that conflicts with the timer requirement. 06:47:22
@ctheune:matrix.flyingcircus.io Christian Theune The renewal itself does depend on the order being current/successful, though, as hexa noted. 06:48:05
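On the consumer side, the wants/after relationship from the two-unit proposal (06:46:40) could be wired up as in this small fragment. It is only a sketch: nginx stands in for any certificate consumer, and the acme-example.org.service name is a placeholder for the acme-${cert} unit.

```nix
# Hypothetical sketch; nginx and "example.org" are placeholders.
{ ... }:
{
  systemd.services.nginx = {
    # Wants= pulls the certificate unit in without failing nginx
    # outright if that unit fails.
    wants = [ "acme-example.org.service" ];
    # After= delays the nginx start until the certificate unit has run.
    after = [ "acme-example.org.service" ];
  };
}
```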

Room Version: 6