NixOS ACME / LetsEncrypt | 107 Members | |
| Another day, another cert renewal | 45 Servers |
| Sender | Message | Time |
|---|---|---|
| 17 Dec 2024 | ||
| Could someone please review PR #362271? It fixes the cert ownership error message, which currently causes an unrelated exception. It's a tiny diff :) Users are getting misleading errors because this is throwing ATM | 23:57:19 | |
| 19 Dec 2024 | ||
K900 I looked at the log for this failure. httpd only started after the ACME validation happened: "Starting Apache HTTPD" vs "Attempting to validate w/ HTTP". I think this is a switch-to-configuration-ng regression 😕 The Perl script starts all services in a single systemctl call, so a single systemd transaction. That means httpd's Before relationship with the certs is enforced. Whereas -ng uses the systemd D-Bus API to start services one by one, meaning multiple transactions, so Before is not enforced. I guess we can try disabling -ng for the ACME tests, see how it goes for a week or so, and then potentially raise an issue with -ng. | 01:31:18 | |
| BTW thanks for the review + merge on the PR from above! | 01:39:02 | |
In reply to @thinkchaos:matrix.org: Uhh | 06:55:59 | |
| Can you please report this in #NixOS systemd | 06:56:24 | |
| There is no API for starting multiple services in a single transaction. This has always been a lie | 10:46:30 | |
| I think systemctl start is also just a for loop around starting single units through D-Bus, afaicr | 10:46:51 | |
| Yeah I need to dig a bit more before I make too much noise, I'll look at systemctl's code, thanks for the hint | 13:38:17 | |
Either way I think we'll need to make the link between the certs and web server stronger to fix this: I'm thinking certs using HTTP validation can Require the relevant web server | 13:45:07 | |
| 21 Dec 2024 | ||
In reply to @arianvp:matrix.org: Really? This completely blows my understanding of service relation chains | 22:43:00 | |
| Yeh pretty sure | 22:43:42 | |
| There is a mutable list of jobs and "dependencies" are some rules that cause some jobs to cancel others out | 22:44:36 | |
| The whole dependency model is kind of a lie | 22:44:45 | |
| https://blog.darknedgy.net/technology/2020/05/02/0/ is a nice read | 22:44:57 | |
| 22 Dec 2024 | ||
| How are we feeling about the acme-setup.service refactor now? https://github.com/NixOS/nixpkgs/pull/355087 I still want to get this merged, it really simplifies the systemd side of things a bit. | 12:31:30 | |
In reply to @thinkchaos:matrix.org: I totally forgot that we had a discussion about this a while ago 😅 tl;dr we could add a target for http01 renewal specifically. The web servers can be configured to want + before on it, and the renewals can require + after. This gives us a generic mechanism of linking whatever web server is running on port 80 to the certs using HTTP01. | 12:36:53 | |
| We do have to be careful about circular dependencies, but that's expected. HTTP01 server startup is complicated regardless. | 12:37:36 | |
In reply to @thinkchaos:matrix.org: * I totally forgot that we had a discussion about this a while ago 😅 tl;dr we could add a target for http01 renewal specifically. The web servers can be configured to requiredBy + before on it, and the renewals can require + after. This gives us a generic mechanism of linking whatever web server is running on port 80 to the certs using HTTP01. | 12:41:42 | |
| 31 Dec 2024 | ||
| I don't know what's up with that | 07:24:05 | |
| If there was a change or it's just unlucky | 07:24:12 | |
| But it feels like the tests are flakier now again | 07:24:20 | |
| 19 Jan 2025 | ||
| OK we need to do something | 08:50:49 | |
| The tests are flaking horribly again | 08:50:53 | |
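
For reference, the HTTP01 target idea from 22 Dec (web servers requiredBy + before the target, renewals require + after it) could look roughly like the NixOS fragment below. This is only a sketch under assumptions: the target name `acme-http01` is hypothetical and does not exist in nixpkgs, `nginx` stands in for whatever web server listens on port 80, and `acme-example.org` is an illustrative renewal unit name. Only the generic `systemd.targets` and `systemd.services` options (`requiredBy`, `before`, `requires`, `after`) are real.

```nix
{ ... }:
{
  # Hypothetical synchronisation point: "a web server is up and can
  # answer HTTP-01 challenges". The name is an assumption.
  systemd.targets."acme-http01" = {
    description = "Web server ready for ACME HTTP-01 validation";
  };

  # Web server side: requiredBy + before on the target, so pulling in
  # the target also pulls in the web server and orders it first.
  systemd.services.nginx = {
    requiredBy = [ "acme-http01.target" ];
    before = [ "acme-http01.target" ];
  };

  # Renewal side: require the target and order after it. The unit name
  # is illustrative; substitute the renewal unit for the cert at hand.
  systemd.services."acme-example.org" = {
    requires = [ "acme-http01.target" ];
    after = [ "acme-http01.target" ];
  };
}
```

As noted in the discussion, this can produce an ordering cycle if the web server unit is itself ordered after the certificate units, so any existing ordering between the web server and the certs would need to be checked first.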