Sender | Message | Time |
---|---|---|
13 Jun 2023 | ||
emily | I don't want to significantly penalize the common case of just a few domains for that though, or stretch it out to "without manual intervention migrating your NixOS box will result in your sites being offline for the next day" | 20:11:54 |
emily | fundamentally if you want your sites running with TLS you have to spend a certain amount of compute, memory and network to get there | 20:12:15 |
m1cr0man | yep, I'm in full agreement with all of that. I might explore the chained services option to see how it performs and if there's a way to work around the activation delay, with the thought that this solution would be an optional (default off) feature of the module | 20:14:49 |
emily | FWIW, relevant LE rate limits: "The main limit is Certificates per Registered Domain (50 per week)." "You can create a maximum of 300 New Orders per account per 3 hours." "You can have a maximum of 300 Pending Authorizations on your account." | 20:17:11 |
emily | for #1, probably people with tons of certs mostly have them on different domains | 20:17:31 |
emily | #2 means that someone with >300 domains would currently run into rate limits with our existing setup | 20:17:52 |
emily | #3 could theoretically happen if the system chugs enough that the ACME client starts issuing a bunch of certs but doesn't run to completion before more spawn up | 20:18:17 |
emily | of course people with this many certs should probably apply for an exemption anyway, but I think it's good to note the magnitude/timeframe of the upstream limits | 20:18:43 |
m1cr0man | okay yeah, so these are pretty lenient for most people. I think I was only concerned about the concurrent one that the ticket opener mentioned:
Right now this one is very easy to do | 20:20:03 |
emily | ah I missed that one. never skim read! | 20:20:30 |
emily | so yeah my inclination is that it would be good to have something default that ensures we're not issuing certificates at a rate that would surpass that. but preferably not full serialization, since that goes quite a lot further than necessary | 20:21:15 |
emily | I feel like there should be a good way to rate limit these services starting without fussing with CPU quotas or whatever. | 20:21:44 |
emily | okay there is | 20:22:08 |
emily | we have StartLimitIntervalSec/StartLimitBurst/StartLimitAction which look perfect. however, I'm guessing that we would need to switch over to @ units to use it - because otherwise all our services are entirely separate | 20:22:45 |
emily | unless it counts the bit after the @ as part of the unit for rate limiting and it's just for making restarts not spam :/ | 20:23:03 |
emily | we need a systemd expert :) | 20:23:22 |
m1cr0man | afaik StartLimit* only applies to services which would enter the failed state? I did consider suggesting that :) however the docs imply it's only for failure. You would need to pair it with Condition/Assert* directives in the unit section, which would be evaluated en masse and actually wouldn't stop concurrency at activation at all | 20:23:50 |
emily | it does say "Configure unit start rate limiting. Units which are started more than burst times within an interval time span are not permitted to start any more." but yeah I'm not sure if it would work | 20:24:32 |
m1cr0man | I was thinking we could use unit retry logic + ConditionPathExists for really easy locking and semaphores | 20:24:44 |
emily | maybe I'm missing some verbiage that implies it's restart-specific but it seems to mostly note that as a side thing? | 20:25:27 |
m1cr0man | afaik "Units which are started" means "for each unit started" rather than "for all units started", so dynamic services would all be individual services and have their own startlimits | 20:25:31 |
emily | but I have a suspicion that it may treat all @ unit instantiations as separate in which case it wouldn't help us anyway. sigh, ACME issuance should really be handled as a daemon | 20:25:52 |
m1cr0man | yarp | 20:26:02 |
m1cr0man | at what point do I just write NixCerts-rs | 20:26:19 |
emily | we are constantly trying to piece together what would be pretty simple logic for a long-running daemon out of paperclips and tape | 20:26:31 |
emily | heh, I don't envy anyone trying to implement ACME from scratch | 20:26:51 |
m1cr0man | ... maybe we need an RFC, to propose a new solution for acme | 20:27:00 |
emily | something with certmagic would probably be pretty easy to do | 20:27:11 |
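
The "long-running daemon" idea raised above (cap how many issuances run at once, without fully serializing them) could be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not how the NixOS ACME module or any ACME client works: `issue_all`, `FakeIssuer`, `demo`, and the limit of 5 are all hypothetical, and the fake issuer only sleeps instead of talking to a CA.

```python
import asyncio


async def issue_all(domains, issue, max_concurrent=5):
    """Run issue(domain) for every domain, at most max_concurrent at a time.

    max_concurrent=5 is an arbitrary illustrative value, not an upstream limit.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(domain):
        # Each issuance waits here until one of the slots is free.
        async with semaphore:
            return await issue(domain)

    return await asyncio.gather(*(guarded(d) for d in domains))


class FakeIssuer:
    """Hypothetical stand-in for an ACME client; records peak concurrency."""

    def __init__(self):
        self.running = 0
        self.peak = 0

    async def __call__(self, domain):
        self.running += 1
        self.peak = max(self.peak, self.running)
        await asyncio.sleep(0.01)  # pretend to do the ACME exchange
        self.running -= 1
        return f"cert for {domain}"


def demo(n_domains=20, max_concurrent=5):
    """Issue n_domains fake certs and report (count, peak concurrency)."""
    issuer = FakeIssuer()
    domains = [f"site{i}.example.org" for i in range(n_domains)]
    certs = asyncio.run(issue_all(domains, issuer, max_concurrent))
    return len(certs), issuer.peak
```

Calling `demo()` issues all 20 fake certificates while never running more than `max_concurrent` of them at once, which is the middle ground between "everything at activation" and "one at a time" that the discussion is looking for.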