21 Jan 2024 |
ma27 | In reply to @dandellion:dodsorf.as
Configures dependencies similar to Wants=, but as long as this unit is up, all units listed in Upholds= are started whenever found to be inactive or failed
So I just played around with that and I don't think I like it that much anymore:
- when you have a worker that fails to start (e.g. because of a configuration error), it isn't kept in failed state, but systemd regularly attempts to restart it without a timeout (as is the case with Restart/StartLimitBurst/StartLimitInterval). This means that it's regularly brought back to the activating state, which will probably give you intermittent firing/resolved messages from your monitoring, depending on e.g. the sample rate of prometheus and the timing of when the service fails / gets restarted.
- I managed to get one worker to ignore a config change on a deploy (as in I added an ExecStartPre=exit 1 for testing purposes and the service restarted, but the worker was running fine). I can't really explain what exactly was up there (systemctl cat confirmed my config change and journalctl -t systemd logged a restart of the worker in question), but that's a red flag to me.
I assume BindsTo= + RestartMode=direct may work (but BindsTo requires the latter AFAIU), but I'm also hesitant to do that because I'm not sure if this will have more weird implications.
Also, the more I think about it: synapse should either be able to wait on its remote dependencies on its own or systemd should be able to model dependencies properly here, because converging by letting things fail and restart is kinda ugly IMHO and not even reliable: if the router daemon takes a little longer to converge, synapse will reach the restart timeout and then you have the exact same issue despite Upholds!
Is it perhaps possible to let synapse require the routing daemon which will only become active when converged (i.e. Type=notify)?
Also: I think we don't even need (or want?) to restart workers every time synapse itself gets restarted, do we? | 12:44:46 |
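For context on the Type=notify idea above: with Type=notify, systemd treats a unit as started only once the service sends `READY=1` as a datagram to the socket named in `$NOTIFY_SOCKET`, so a unit ordered after it and requiring it would wait for that message. A minimal sketch of that readiness protocol, assuming a routing daemon that can tell when it has converged; `wait_until_converged`, `is_converged` and `poll` are hypothetical names for illustration, not part of any real daemon:

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send a state datagram to systemd's notify socket.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract socket (NUL-prefixed name).
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(state.encode())
    return True


def wait_until_converged(is_converged, poll, attempts: int = 60) -> bool:
    """Hypothetical helper: report readiness only once routing has converged.

    `is_converged` and `poll` are caller-supplied callables (assumptions for
    this sketch); `poll` is called between checks, e.g. to sleep.
    """
    for _ in range(attempts):
        if is_converged():
            sd_notify("READY=1")  # no-op outside systemd
            return True
        poll()
    return False
```

Whether this fits depends on the routing daemon exposing a usable convergence check; the unit would also need a start timeout long enough to cover slow convergence.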
26 Jan 2024 |
hexa | hrm, the synapse module also prevents me from using unix domain socket listeners | 16:34:01 |
hexa | supported since 1.89 | 16:34:11 |
hexa | kinda proves my point that nobody reads the release notes 😛 | 16:34:21 |
Dandellion | h7x4: has been working on that in https://github.com/dali99/nixos-matrix-modules/pull/7, the unix socket part is complete afaik | 16:36:53 |
hexa | I kinda want it for the nixos.org homeserver | 16:37:43 |
hexa | would be cool if that could land in the next week or so | 16:37:58 |
Dandellion | I'll ask if the sockets are ready for merge on their own | 16:41:40 |
hexa | thank you! | 16:43:39 |
hexa | 👋 | 16:44:29 |
h7x4 | Hello! | 16:44:34 |
h7x4 | I'll have another look at it :) | 16:44:41 |
hexa | awesome | 16:46:23 |
hexa | feel free to keep me in the loop and request my review where needed | 16:46:51 |
ma27 | are we talking about nixos-matrix-modules or the nixos upstream module?
in case of the latter: anything you want me to do?
(and regarding reviews, feel free to ping me as well :)) | 16:47:12 |
hexa | extending the listenerType in the nixpkgs module | 16:47:51 |
hexa | * extending the listenerType in the nixpkgs module is what is needed | 16:47:56 |
Dandellion | In this case we were talking about nixos-matrix-modules, but we can probably upstream it fairly easily. The type is similar | 16:48:46 |
hexa | https://github.com/NixOS/nixos-org-configurations/pull/336 fwiw | 16:49:55 |
ma27 | anybody interested in doing that? otherwise I can. | 16:49:55 |
hexa | additional eyes welcome | 16:50:10 |
hexa | I'd rather not do it 🙂 | 16:51:17 |
Dandellion | I can do it | 16:51:27 |
1 Feb 2024 |
hexa | In reply to @dandellion:dodsorf.as
I can do it
hey, did you have time to look into that? | 16:54:57 |
3 Feb 2024 |
hexa | Dandellion? | 21:52:23 |
h7x4 | I think he's a bit busy with FOSDEM atm, but I'll give him an irl ping | 22:19:01 |
Dandellion | ah, sorry, no. I didn't have a chance. I was hoping to do it Thursday but other things came up! | 22:27:04 |