!MthpOIxqJhTgrMNxDS:nixos.org

NixOS ACME / LetsEncrypt

107 Members
Another day, another cert renewal
43 Servers

27 Nov 2024
@hexa:lossy.network hexa
root      130934  0.0  0.0   6620  3328 ?        Ss   Nov25   0:00 /nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash /nix/store/kia1z8g0zv7w2ndbr6bf88ybgacjldi1-acme-postrun
root      130936  0.0  0.1  18628  7424 ?        S    Nov25   0:00  \_ systemctl reload nginx
root      131100  0.0  0.0   6620  3328 ?        Ss   Nov25   0:00 /nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash /nix/store/x8pmg6g602b92rrbapxpcmb695n811lb-acme-postrun
root      131103  0.0  0.1  18628  7552 ?        S    Nov25   0:00  \_ systemctl reload nginx
root      143742  0.0  0.0   6620  3328 ?        Ss   Nov26   0:00 /nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash /nix/store/kr92xf7yisa0236ls1gmacqwapc3zqxz-acme-postrun
root      143745  0.0  0.1  18628  7424 ?        S    Nov26   0:00  \_ systemctl reload nginx
01:53:50
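The listing above shows `systemctl reload nginx` children that have been running since Nov25 and Nov26. A minimal sketch for spotting such long-lived processes, assuming procps `ps` (whose `etimes` field is elapsed time in seconds); `stuck_reloads` and its one-hour default are hypothetical names and values chosen only for illustration:

```shell
# Hypothetical helper (not part of NixOS or systemd): reads
# "PID ELAPSED_SECONDS COMMAND..." lines on stdin and prints the PIDs of
# `systemctl reload` invocations older than the given threshold in seconds.
stuck_reloads() {
  threshold="${1:-3600}"
  awk -v t="$threshold" '
    $2 + 0 > t + 0 && $3 ~ /(^|\/)systemctl$/ && $4 == "reload" { print $1 }
  '
}

# Possible use on a live host:
# ps -eo pid=,etimes=,args= | stuck_reloads 3600
```

Matching on the command name rather than on the `acme-postrun` store path keeps the filter independent of the per-certificate script hashes.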
@hexa:lossy.network hexa
#!/nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash
cd /var/lib/acme/lossy.network
if [ -e renewed ]; then
  rm renewed
  systemctl reload nginx

  
fi
01:54:14
@hexa:lossy.network hexa
renewed does not exist anymore
01:54:21
@hexa:lossy.network hexa
rebooted the machine and reloading works again
02:02:59
@hexa:lossy.network hexa
just wrote this down so maybe if someone else hits it we'll know it is a recurring thing? 😄
02:03:15
28 Nov 2024
@hexa:lossy.network hexa
ok, just saw it on the next host 🫠
01:40:56
@m1cr0man:m1cr0man.com m1cr0man
that's... weird
22:01:25
@m1cr0man:m1cr0man.com m1cr0man
does the nginx service show one of the ExecReload processes as running?
22:01:49
@m1cr0man:m1cr0man.com m1cr0man
I got the --replace-cert-domains --overwrite-domains --force-cert-domains PR merged to lego 😄 once it has been published in a released version, and the setup process refactor has merged, I have a patch set ready to remove the domain hash entirely (test suite remains unchanged + passes, and this won't trigger mass renewal).
22:12:03
@sandro:supersandro.de Sandro 🐧
don't forget to drop it from the test
22:13:51
@sandro:supersandro.de Sandro 🐧
does that mean we renew all certs again ? 😅
22:14:09
@m1cr0man:m1cr0man.com m1cr0man
No, this does not affect the certDir hash, and thus will not trigger mass renewals. I'm not sure what you're referring to in the test suite? It does not make reference to the domain hash, and the existing tests are still valid + important.
22:15:36
29 Nov 2024
@hexa:lossy.network hexa
m1cr0man: https://gist.githubusercontent.com/mweinelt/b27e2353eedc99242a1074a5d2a4e85f/raw/2475ed5c4863f763f8e0e3938dcc2567a633d67d/gistfile1.txt
02:12:12
@hexa:lossy.network hexa
on release-24.11
02:12:14
@hexa:lossy.network hexa *
on release-24.11 on hydra
02:12:19
@hexa:lossy.network hexa
In reply to @m1cr0man:m1cr0man.com
does the nginx service show one of the ExecReload processes as running?

no

● nginx.service - Nginx Web Server
     Loaded: loaded (/etc/systemd/system/nginx.service; enabled; preset: ignored)
     Active: active (running) since Mon 2024-11-18 02:14:41 UTC; 1 week 4 days ago
 Invocation: ae7b61c9c68d4f04bfe52186b992b0b2
    Process: 1293 ExecStartPre=/nix/store/0myprdwjj6jjkpd72r1k8qv7fxqnivkp-unit-script-nginx-pre-start/bin/nginx-pre-start (code=exited, status=0/SUCCESS)
    Process: 166807 ExecReload=/nix/store/m489ix7hrxznh7a5fmdsijdlq4x6p5nn-nginx-1.26.2/bin/nginx -c /nix/store/8nw18skbb0ic2wznfa652nh0xpdn4s5a-nginx.conf -t (code=exited, status=0/SUCCESS)
    Process: 166808 ExecReload=/nix/store/k48bha2fjqzarg52picsdfwlqx75aqbb-coreutils-9.5/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 1297 (nginx)
         IP: 18.4M in, 80.2M out
         IO: 21.3M read, 12.7M written
      Tasks: 2 (limit: 4553)
     Memory: 9.4M (peak: 13.6M)
        CPU: 32.430s
     CGroup: /system.slice/nginx.service
             ├─  1297 "nginx: master process /nix/store/m489ix7hrxznh7a5fmdsijdlq4x6p5nn-nginx-1.26.2/bin/nginx -c /nix/store/8nw18skbb0ic2wznfa652nh0xpdn4s5a-nginx.conf"
             └─166814 "nginx: worker process"
02:17:04
@hexa:lossy.network hexa
it is also not a race, i think
02:17:19
@hexa:lossy.network hexa
because I see it on a third host with just a single unit stuck right now 😩
02:17:33
@hexa:lossy.network hexa
● acme-spam.lossy.network.service - Renew ACME certificate for spam.lossy.network
     Loaded: loaded (/etc/systemd/system/acme-spam.lossy.network.service; enabled; preset: ignored)
     Active: activating (start-post) since Mon 2024-11-25 12:40:50 UTC; 3 days ago
 Invocation: de04e9b6744344cf86391a79578da590
TriggeredBy: ● acme-spam.lossy.network.timer
    Process: 180885 ExecStart=/nix/store/h9h8x07dqvzp9652p05jq8zs2205q66v-unit-script-acme-spam.lossy.network-start/bin/acme-spam.lossy.network-start (code=exited, status=0/SUCCESS)
   Main PID: 180885 (code=exited, status=0/SUCCESS); Control PID: 180907 (vw98jrrshrz371a)
         IP: 23.3K in, 12.4K out
         IO: 52.1M read, 8K written
      Tasks: 2 (limit: 4553)
     Memory: 1.8M (peak: 105M)
        CPU: 577ms
     CGroup: /system.slice/acme-spam.lossy.network.service
             ├─180907 /nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash /nix/store/vw98jrrshrz371az32ssbcwrr3bz2fqs-acme-postrun
             └─180909 systemctl reload nginx

Nov 25 12:41:01 helios acme-spam.lossy.network-start[180900]: 'certificates/spam.lossy.network.crt' -> 'out/fullchain.pem'
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + cp -vp certificates/spam.lossy.network.key out/key.pem
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180901]: 'certificates/spam.lossy.network.key' -> 'out/key.pem'
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + cp -vp certificates/spam.lossy.network.issuer.crt out/chain.pem
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180902]: 'certificates/spam.lossy.network.issuer.crt' -> 'out/chain.pem'
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + ln -sf fullchain.pem out/cert.pem
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + cat out/key.pem out/fullchain.pem
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + chmod 640 out/cert.pem out/chain.pem out/fullchain.pem out/full.pem out/key.pem out/renewed
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: + echo 'Releasing lock /run/acme/2.lock'
Nov 25 12:41:01 helios acme-spam.lossy.network-start[180885]: Releasing lock /run/acme/2.lock
02:17:49
@hexa:lossy.network hexa *
(yes, the domain is spam.lossy.network, it is where I send all the spam from host my rspam dashboard at)
02:18:24
@hexa:lossy.network hexa
I'm leaving it in this state for now, so we can check this out tomorrow or on the weekend
02:19:16
@hexa:lossy.network hexa
the cert expires on december 26th, so we have some time 😄
02:19:32
@hexa:lossy.network hexa
In reply to @hexa:lossy.network
m1cr0man: https://gist.githubusercontent.com/mweinelt/b27e2353eedc99242a1074a5d2a4e85f/raw/2475ed5c4863f763f8e0e3938dcc2567a633d67d/gistfile1.txt
builds on my machine(TM)
02:28:01
@hexa:lossy.network hexa
worked on the third try on hydra
02:34:51
@thinkchaos:matrix.org ThinkChaos

It's not stuck on "Releasing lock", that process has exited: Main PID: 180885 (code=exited, status=0/SUCCESS)
The code actually does nothing after printing that, the script just exits which automatically frees the lock (source)

Based on the CGroup content, I think it's stuck on reloading Nginx, though I don't understand how that would block or why it's doing that, as Nginx is supposed to reload itself through nginx-config-reload.service.
What's the content of /nix/store/vw98jrrshrz371az32ssbcwrr3bz2fqs-acme-postrun?
What's the value of services.nginx.enableReload? Did you add nginx to the cert's reloadServices?

03:23:40
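The reading above can be checked mechanically: in the pasted status, the main process has exited while a control process is still alive, and the `start-post` state suggests it is the post-start script. A minimal sketch, assuming `systemctl status` text on stdin; `control_pid` is a hypothetical helper name, not something shipped by systemd or NixOS:

```shell
# Hypothetical helper (not part of systemd or NixOS): reads `systemctl status`
# text on stdin and prints the Control PID, if the unit reports one.
control_pid() {
  sed -n 's/.*Control PID: \([0-9][0-9]*\).*/\1/p'
}

# Possible use on a live host:
# systemctl status acme-spam.lossy.network.service | control_pid
# systemctl list-jobs   # lists jobs that are still queued or running
```

A unit with no control process, such as the nginx status pasted earlier, prints nothing, so an empty result means no auxiliary command is currently running for that unit.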
@hexa:lossy.network hexa
#!/nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash
cd /var/lib/acme/spam.lossy.network
if [ -e renewed ]; then
  rm renewed
  systemctl reload nginx

  
fi
12:38:15
@hexa:lossy.network hexa *
#!/nix/store/0irlcqx2n3qm6b1pc9rsd2i8qpvcccaj-bash-5.2p37/bin/bash
cd /var/lib/acme/spam.lossy.network
if [ -e renewed ]; then
  rm renewed
  systemctl reload nginx # <-- stuck here
fi
12:38:32
@hexa:lossy.network hexa
I have not set enableReload
12:39:10
