!MthpOIxqJhTgrMNxDS:nixos.org

NixOS ACME / LetsEncrypt

86 Members
Another day, another cert renewal
39 Servers

Sender | Message | Time
9 Jun 2023
@emilazy:matrix.org emily | despite all the headaches lego has caused I would like to spend a moment of thanks for the fact that we did not move to anything shell-based: https://github.com/acmesh-official/acme.sh/issues/4659 | 10:24:43
@emilazy:matrix.org emily | ah I see this already came up in the security channel | 11:19:04
13 Jun 2023
@m1cr0man:m1cr0man.com m1cr0man | Hello again :) Busy few weeks... looking into https://github.com/NixOS/nixpkgs/issues/232505 again. I just had a notion - could we chain all the certs together with an After= condition? We would still need to avoid auto-starting the services for each cert (otherwise config switch would take a REALLY long time) but that might be easy to solve with a target. | 19:44:52
@m1cr0man:m1cr0man.com m1cr0man
In reply to @emilazy:matrix.org
despite all the headaches lego has caused I would like to spend a moment of thanks for the fact that we did not move to anything shell-based: https://github.com/acmesh-official/acme.sh/issues/4659
Oh wow.. that's spooky. At least if we were using that our systemd services for renewal are hardened like steel
19:45:33
@emilazy:matrix.org emily | only so much hardening you can do when the process has access to private keys :( | 20:03:36
@emilazy:matrix.org emily | (ideally you have privilege separation so that the process that talks to the ACME server doesn't have access to the keys but I don't think even lego does that) | 20:05:44
@emilazy:matrix.org emily
In reply to @m1cr0man:m1cr0man.com
Hello again :) Busy few weeks... looking into https://github.com/NixOS/nixpkgs/issues/232505 again. I just had a notion - could we chain all the certs together with an After= condition? We would still need to avoid auto-starting the services for each cert (otherwise config switch would take a REALLY long time) but that might be easy to solve with a target.
honestly I don't know if there's a one-size-fits-all solution to this. we can randomize renewal time because it fundamentally doesn't matter when renewal happens as long as it's sufficiently far in advance. some users will want their sites accessible as soon as possible after setting up a new box or activating a new configuration; some will be worried about load and rate limits. i don't see how we can satisfy both out of the box
20:06:49
@emilazy:matrix.org emily | the "This will cause the timer to start; and after 1 second start all the services with a randomised delay." idea sounds nice enough - but then we're talking about, your sites have broken SSL for up to an entire day? | 20:07:23
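The trade-off in the message above can be made concrete with a small sketch. It is only an illustration of "start each cert after a random delay inside a window", not how the NixOS module or systemd implements it; the function name, the cert names and the 24-hour window are assumptions chosen for the example.

```python
import random

def randomized_delays(cert_names, window_seconds, seed=None):
    """Pick one uniformly random start delay (in seconds) per certificate.

    Hypothetical helper: the real delay would be chosen by the timer
    machinery, not by user code.
    """
    rng = random.Random(seed)
    return {name: rng.uniform(0, window_seconds) for name in cert_names}

DAY = 24 * 60 * 60  # assumed window of one day
delays = randomized_delays(
    ["a.example.com", "b.example.com", "c.example.com"], DAY, seed=1
)
worst = max(delays.values())
# The unluckiest cert can wait for almost the whole window.
print(f"last cert starts after {worst / 3600:.1f} h (bound: {DAY / 3600:.0f} h)")
```

The only guarantee is the upper bound: no cert waits longer than the window, but any single cert may wait nearly that long, which is the "broken SSL for up to an entire day" concern.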
@emilazy:matrix.org emily | I'm curious how Caddy/certmagic handles this since it has pretty sophisticated logic for cert issue timing | 20:08:08
@m1cr0man:m1cr0man.com m1cr0man | Could you let me know what you find from that? But to your point about one size fits all, it seems like we will need to introduce an option for users to decide what they want. We can default to the current situation, but provide an option like renewOnActivate for other situations? | 20:09:44
@emilazy:matrix.org emily | I'm tempted to say that people can just poke at the systemd.* options themselves if they really want rate limiting, but I'm biased :p | 20:10:27
@emilazy:matrix.org emily | I would consider it acceptable to do something out of the box if we found a solution that leads to large numbers of certs being activated in minutes rather than hours/days though | 20:10:48
@emilazy:matrix.org emily | if you have dozens/hundreds of certs then you're probably expecting initial setup to take about that long | 20:11:28
@emilazy:matrix.org emily | I don't want to significantly penalize the common case of just a few domains for that though, or stretch it out to "without manual intervention migrating your NixOS box will result in your sites being offline for the next day" | 20:11:54
@emilazy:matrix.org emily | fundamentally if you want your sites running with TLS you have to spend a certain amount of compute, memory and network to get there | 20:12:15
@m1cr0man:m1cr0man.com m1cr0man | yep, I'm in full agreement with all of that. I might explore the chained services option to see how it performs and if there's a way to work around the activation delay, with the thought that this solution would be an optional (default off) feature of the module | 20:14:49
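A rough sketch of what "chaining the certs with After=" could look like, written as a small Python model rather than module code. The unit naming `acme-<cert>.service`, the helper names and the example domains are assumptions for illustration. Note that `After=` only orders units that are started anyway; it does not start them, which matches the point that a separate target (or similar) would still be needed.

```python
def chain_after(cert_names):
    """Return {unit: [units it is ordered After=]} forming one linear chain.

    Certs are sorted so the chain is stable across rebuilds. The
    'acme-<cert>.service' naming is an assumption for this sketch.
    """
    units = [f"acme-{name}.service" for name in sorted(cert_names)]
    ordering = {}
    previous = None
    for unit in units:
        ordering[unit] = [previous] if previous else []
        previous = unit
    return ordering

def render_unit_section(after_units):
    """Render a [Unit] fragment that carries ordering only (no Wants=/Requires=)."""
    lines = ["[Unit]"] + [f"After={u}" for u in after_units]
    return "\n".join(lines)

chain = chain_after(["example.org", "a.example.com", "b.example.com"])
for unit, after in chain.items():
    print(unit, "->", after)
```

With a linear chain, issuance is fully serialized: total activation time grows with the number of certs, which is the cost m1cr0man wants to measure.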
@emilazy:matrix.org emily | FWIW, relevant LE rate limits: "The main limit is Certificates per Registered Domain (50 per week)." "You can create a maximum of 300 New Orders per account per 3 hours." "You can have a maximum of 300 Pending Authorizations on your account." | 20:17:11
@emilazy:matrix.org emily | for #1, probably people with tons of certs mostly have them on different domains | 20:17:31
@emilazy:matrix.org emily | #2 means that someone with >300 domains would currently run into rate limits with our existing setup | 20:17:52
@emilazy:matrix.org emily | #3 could theoretically happen if the system chugs enough that the ACME client starts issuing a bunch of certs but doesn't run to completion before more spawn up | 20:18:17
@emilazy:matrix.org emily | of course people with this many certs should probably apply for an exemption anyway, but I think it's good to note the magnitude/timeframe of the upstream limits | 20:18:43
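To put the quoted numbers in context, here is a minimal check of limits #1 and #2 against a set of certs that would all be issued at once. The two constants come from the limits quoted above; everything else is an assumption of the sketch: one ACME account, one new order per cert, and a caller-supplied mapping from cert name to registered domain (no public-suffix parsing is attempted).

```python
from collections import Counter

# Values as quoted in the chat above.
CERTS_PER_REGISTERED_DOMAIN_PER_WEEK = 50
NEW_ORDERS_PER_ACCOUNT_PER_3_HOURS = 300

def check_limits(cert_to_registered_domain):
    """Return warnings if issuing every cert at once would exceed a quoted limit.

    Assumes one order per cert and a single account (simplification).
    """
    warnings = []
    total = len(cert_to_registered_domain)
    if total > NEW_ORDERS_PER_ACCOUNT_PER_3_HOURS:
        warnings.append(
            f"{total} orders exceed {NEW_ORDERS_PER_ACCOUNT_PER_3_HOURS} per 3 hours"
        )
    per_domain = Counter(cert_to_registered_domain.values())
    for domain, count in sorted(per_domain.items()):
        if count > CERTS_PER_REGISTERED_DOMAIN_PER_WEEK:
            warnings.append(
                f"{count} certs on {domain} exceed "
                f"{CERTS_PER_REGISTERED_DOMAIN_PER_WEEK} per week"
            )
    return warnings

# Hypothetical fleet: 301 certs, each on its own registered domain.
fleet = {f"www.site{i}.example": f"site{i}.example" for i in range(301)}
print(check_limits(fleet))
```

The example fleet trips only the new-orders limit, which is the ">300 domains" case from the discussion.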
@m1cr0man:m1cr0man.com m1cr0man

okay yeah, so these are pretty lenient for most people. I think I was only concerned about the concurrent one that the ticket opener mentioned:

the “new-nonce”, “new-account”, “new-order”, and “revoke-cert” endpoints on the API have an Overall Requests limit of 20 per second.

Right now this one is very easy to do

20:19:53
@emilazy:matrix.org emily | ah I missed that one. never skim read! | 20:20:30
@emilazy:matrix.org emily | so yeah my inclination is that it would be good to have something default that ensures we're not issuing certificates at a rate that would surpass that. but preferably not full serialization since that's quite a lot further than that | 20:21:15
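As a back-of-the-envelope illustration of "rate limited but not fully serialized", the sketch below assigns each cert a whole-second start offset so that the starts in any one second stay under the 20 requests per second limit quoted earlier. The figure of 3 requests per cert against the limited endpoints is a made-up assumption, and the model pretends all of a run's requests land in the second it starts, so treat the result as an order-of-magnitude estimate only.

```python
def start_offsets(cert_count, requests_per_cert=3, limit_per_second=20):
    """Return a start offset in whole seconds for each cert.

    requests_per_cert is an assumed figure, not measured from lego.
    """
    if requests_per_cert > limit_per_second:
        raise ValueError("a single cert run would already exceed the limit")
    certs_per_second = limit_per_second // requests_per_cert
    return [index // certs_per_second for index in range(cert_count)]

offsets = start_offsets(100)
# 6 certs may start per second, so 100 certs span offsets 0..16.
print(f"100 certs start within {offsets[-1] + 1} seconds")
```

Under these assumptions, even a hundred certs are spread over seconds rather than hours, which fits the "minutes rather than hours/days" goal stated earlier in the conversation.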
@emilazy:matrix.org emily | I feel like there should be a good way to rate limit these services starting without fussing with CPU quotas or whatever. | 20:21:44
@emilazy:matrix.org emily | okay there is | 20:22:08
@emilazy:matrix.org emily | we have StartLimitIntervalSec/StartLimitBurst/StartLimitAction which look perfect. however, I'm guessing that we would need to switch over to @ units to use it - because otherwise all our services are entirely separate | 20:22:45
@emilazy:matrix.org emily | unless it counts the bit after the @ as part of the unit for rate limiting and it's just for making restarts not spam :/ | 20:23:03
@emilazy:matrix.org emily | we need a systemd expert :) | 20:23:22
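On the open question above: as far as I understand the systemd.unit documentation, start rate limiting is tracked per unit, and each template instance is its own unit, so the second guess (the limit exists to stop one unit from restart-spamming) appears to be the right reading. The toy model below only illustrates that per-unit behaviour. It uses a simplified sliding window, is not systemd's implementation, and the unit names are hypothetical.

```python
from collections import defaultdict, deque

class StartLimiter:
    """Toy model of StartLimitIntervalSec=/StartLimitBurst= with per-unit counters."""

    def __init__(self, interval_sec, burst):
        self.interval_sec = interval_sec
        self.burst = burst
        self._starts = defaultdict(deque)  # unit name -> recent start times

    def try_start(self, unit, now):
        """Record a start attempt at time `now`; return False if the unit is limited."""
        starts = self._starts[unit]
        # Forget starts that have fallen out of the interval.
        while starts and now - starts[0] >= self.interval_sec:
            starts.popleft()
        if len(starts) >= self.burst:
            return False
        starts.append(now)
        return True

limiter = StartLimiter(interval_sec=10, burst=2)
results = [
    limiter.try_start("acme@a.service", 0),  # first start of instance a
    limiter.try_start("acme@a.service", 1),  # second start, still within burst
    limiter.try_start("acme@a.service", 2),  # third start within 10 s is refused
    limiter.try_start("acme@b.service", 2),  # instance b has its own counter
]
print(results)
```

If that reading is correct, these options cannot throttle many different cert units as a group; they only cap how often a single unit is started.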