| 3 Jul 2025 |
Christian Theune | I'm working on a bit of a refactoring with Arian supervising. I had a question this morning which I've managed to solve with a draft so far. I'm still working on it, but the current state is here: https://github.com/NixOS/nixpkgs/pull/422076. The second commit is currently in draft and needs a further refactoring (and also has a race condition and is likely incomplete), but I have to stop working for today. If you want to take a look, feel free to give feedback. I'm also happy to explain/discuss things face to face if that helps understanding. It's quite a complicated situation and I'm trying to make it cleaner ... | 14:48:16 |
hexa | an acme-renew unit cannot work, when the initial run did not succeed 🤔 but since the failure of the run might be transient having a combined unit that makes the run vs renew decision makes sense | 14:57:35 |
hexa | * an acme-renew unit on a timer cannot work, when the initial run did not succeed 🤔 but since the failure of the run might be transient having a combined unit that makes the run vs renew decision makes sense | 14:57:46 |
| alina arielle amelie🏳️⚧️🐾 joined the room. | 16:15:22 |
| Alyssa Ross joined the room. | 16:34:34 |
Arian | We could have the .timer have a Requires=acme-order-XX.service, then it won't start the timer if the initial run did not succeed | 16:56:39 |
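
A minimal sketch of the Requires= suggestion above as a NixOS module fragment. The certificate name `example.org`, the timer name `acme-renew-…` and the `daily` schedule are assumptions for illustration, not taken from the PR. Requires= alone does not order units, so After= is added to get the "don't start if the order unit failed" behaviour.

```nix
let
  # Hypothetical certificate name, standing in for the "XX" above.
  cert = "example.org";
in
{
  # Hypothetical timer name; the PR may name its units differently.
  systemd.timers."acme-renew-${cert}" = {
    wantedBy = [ "timers.target" ];
    # Starting the timer pulls in the order unit. Together with After=,
    # a failed order unit keeps the timer from being started.
    requires = [ "acme-order-${cert}.service" ];
    after = [ "acme-order-${cert}.service" ];
    timerConfig = {
      OnCalendar = "daily"; # assumed schedule
      Persistent = true;
    };
  };
}
```

With this shape a failed order run surfaces as a dependency failure of the timer rather than as a failed renewal attempt.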
m1cr0man | If this ends up working, it will probably solve the long standing issue of s-t-c in containers nuking the startup if the network isn't online | 22:00:48 |
m1cr0man | * If this ends up working, it will probably solve the long standing issue of boot in containers nuking the startup if the network isn't online | 22:00:54 |
| 4 Jul 2025 |
Christian Theune | hexa: the combined unit is cause for a lot of complexity with drawbacks, so i'm trying to get it working with separate units. what's the concern that the renew unit won't work? if the order unit fails then that is something that needs to be handled in the order unit... | 05:05:55 |
Christian Theune | m1cr0man: yeah i noticed that the container path shouldn't be special any longer with this change. | 05:06:26 |
Christian Theune | but i don't have a test / environment that uses this, so happy for feedback. | 05:06:40 |
Christian Theune | Arian: yeah, i could upgrade the wants/after to requires, so a failed order unit won't trigger a subsequent renewal failure | 06:17:16 |
Christian Theune | (or well maybe it does, not sure but then it would fail due to a dependency and not an internal failure) | 06:17:38 |
Christian Theune | ah but then the "inversion of control" pattern makes it ugly again. | 06:18:45 |
Christian Theune | Reminder to self: overall i'm trying to get complexity and the relationships and maybe even the number of units down. | 06:31:00 |
Christian Theune | One aspect: we basically need one assurance and one signal to interact with certificate consumer units (nginx, postfix, ...): | 06:31:33 |
Christian Theune |
- the assurance: the files referenced in your config file are now available and are valid ssl certificates. Go forth and start!
| 06:31:58 |
Christian Theune |
- the signal: the content of the files has changed and you likely want to reload/restart to pick up the new content.
| 06:32:22 |
Christian Theune | The assurance doesn't really even have to be a valid/current/... acme certificate, but basically something that allows the service to start (e.g. the self signed certificates or maybe even an outdated acme cert). | 06:32:53 |
Christian Theune | Which is already used for things like allowing bootstrapping the infrastructure to answer HTTP-01. | 06:33:46 |
Christian Theune | Now, we do need some unit that stays active (we used the -finished targets for this previously) so that s-t-c can trigger config updates. This was very indirect previously - moving this to a unit that is oneshot/remainafterexit (the "order" unit in my patch) makes config changes trigger more precisely. | 06:35:11 |
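
A sketch of what a oneshot/RemainAfterExit "order" unit could look like, assuming a hypothetical unit name `acme-order-${cert}` and a placeholder script. Because the unit stays active after its script exits, a change to its definition is something s-t-c can act on by restarting it (`restartIfChanged` defaults to true for NixOS services).

```nix
let
  # Hypothetical certificate name used only for this example.
  cert = "example.org";
in
{
  systemd.services."acme-order-${cert}" = {
    serviceConfig = {
      Type = "oneshot";
      # Keep the unit in the "active" state after the script has finished,
      # so it can be restarted when its configuration changes.
      RemainAfterExit = true;
    };
    # Placeholder: the real unit would order or update the certificate here.
    script = ''
      echo "ordering certificate for ${cert}"
    '';
  };
}
```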
Christian Theune | However, active units can't be triggered by timers, hence a "renew" unit that can be triggered by a timer. Initially I started out with the renew unit just being a "systemctl restart acme-${cert}", having all the tasks in a single bash script. I was somewhat wary of duplicating too much (bash) code (i did duplicate the nix code for the units) so I chose an inversion of control pattern where the renew unit then triggers the order unit again, to make sure that permission settings, relocating the updated certificates etc., as well as triggering the reloads for consumers, happen in only a single place. | 06:37:21 |
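
The initial "renew just restarts acme-${cert}" variant described above could be sketched roughly like this. The unit name `acme-renew-${cert}`, the certificate name and the `daily` schedule are assumptions. The renew unit is a plain oneshot without RemainAfterExit, so it returns to inactive after each run and the timer can start it again.

```nix
{ pkgs, ... }:
let
  # Hypothetical certificate name used only for this example.
  cert = "example.org";
in
{
  systemd.services."acme-renew-${cert}" = {
    serviceConfig.Type = "oneshot";
    # Delegate all actual work to the long-lived unit by restarting it.
    script = ''
      ${pkgs.systemd}/bin/systemctl restart "acme-${cert}.service"
    '';
    # startAt makes NixOS generate a matching .timer unit (assumed schedule).
    startAt = "daily";
  };
}
```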
Christian Theune | (I'm using the chat to talk out my thought processes. Basically just🎈🦆ing here ...) | 06:39:22 |
Christian Theune | *
- the assurance: the files referenced in your config file are now available and are (syntactically) valid ssl certificates. Go forth and start!
| 06:39:57 |
Christian Theune | That inversion makes the dependencies muddied again. I could split it up in more units, moving the post-processing code in a separate unit, or just use a shared (execstartpost) script (or partial). | 06:42:16 |
Christian Theune | Hmm. Consolidating multiple certificates renewing at the same time isn't much of an issue I guess as we distribute the renewal timers over time anyway. | 06:43:05 |
Christian Theune | * Hmm. Consolidating client reload signals for multiple certificates renewing at the same time isn't much of an issue I guess as we distribute the renewal timers over time anyway. | 06:43:18 |
Christian Theune | So. I guess two units would suffice: 1st unit (acme-${cert}) is what clients depend on via wants/after, which guarantees a syntactically valid certificate is there - and which updates the certificate parameters when the config changes. Interestingly, the last part isn't really needed for the assurance itself. 2nd unit to issue ACME renewals. | 06:46:40 |
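
From the consumer side, the "assurance" dependency on the first unit might look like the fragment below. nginx is one of the consumers mentioned earlier; the certificate name is an assumption. Wants= plus After= means nginx waits for the unit but is not torn down if it fails.

```nix
let
  # Hypothetical certificate name used only for this example.
  cert = "example.org";
in
{
  systemd.services.nginx = {
    # Pull in the certificate unit and start only after it has run, so the
    # files referenced in the nginx config exist by the time nginx starts.
    wants = [ "acme-${cert}.service" ];
    after = [ "acme-${cert}.service" ];
  };
}
```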
Christian Theune | I wonder whether the "update the parameters" (which requires an active unit to trigger selectively) could/should move elsewhere. It can't be merged with the 2nd unit because that conflicts with the timer requirement. | 06:47:22 |
Christian Theune | The renewal itself does depend on the order being current/successful though, as hexa noted. | 06:48:05 |