OfBorg | 167 Members | |
| Number of builds and evals in queue: <TBD> | 63 Servers |
| Sender | Message | Time |
|---|---|---|
| 29 Sep 2023 | ||
In reply to @asymmetric:matrix.dapp.org.uk* An interesting idea, but I think the problem with that is that 1) the timeout is set for the nix-build command (so there's no way to change it once the build has started running); and 2) the machine that processes comment commands is separate from the machines that run those commands (builds, evals, etc)... | 16:03:38 | |
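Point 1 above can be illustrated with a minimal sketch. It is not ofborg's code (ofborg is written in Rust); it only assumes, as the message says, that the time limit is handed to the command when it is launched. The helper name `run_with_fixed_timeout` and the stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_with_fixed_timeout(cmd, timeout_seconds):
    """Run cmd with a time limit that is chosen before launch.

    Once subprocess.run() has been called, nothing in this function can
    extend the limit: a later request would have to restart the command.
    """
    try:
        completed = subprocess.run(
            cmd, timeout=timeout_seconds, capture_output=True, text=True
        )
    except subprocess.TimeoutExpired:
        # The child is killed when the limit expires.
        return "timed out"
    return "finished" if completed.returncode == 0 else "failed"


# Stand-in for a long build: a child process that sleeps for 5 seconds.
slow_build = [sys.executable, "-c", "import time; time.sleep(5)"]
quick_build = [sys.executable, "-c", "pass"]

slow_result = run_with_fixed_timeout(slow_build, 0.5)
quick_result = run_with_fixed_timeout(quick_build, 30)
```

Raising the limit for a job that is already running would need some extra control channel to the process, which is what point 2 argues is missing.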
In reply to @cole-h:matrix.org yeah i thought that we could either:
i'm not sure i understand the implications of your point 2) though -- couldn't the comments-listening machine direct the building-machine? | 16:06:12 | |
| My point with #2 is that (as far as I know), the coordinator doesn't know which machine took the job, so how should it determine which build machine to inform about the change? (I think the simplest depiction of how it all works, from my knowledge, is GitHub webhook -> ofborg-core (coordinator) -> AMQP server <-> ofborg-eval-X (evaluator and builder)) But the "comment in PR body" could work, if ofborg is notified about that information (it must be, in some roundabout way, because we have access to the PR number)...... | 16:10:11 | |
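The routing problem described above can be sketched with a toy work queue. This is a simplified model under stated assumptions, not ofborg's implementation: `WorkQueue`, `Builder`, the builder names, and the job fields are all hypothetical stand-ins for the AMQP server and the build machines.

```python
from collections import deque


class WorkQueue:
    """Stand-in for an AMQP work queue shared by several consumers."""

    def __init__(self):
        self._jobs = deque()

    def publish(self, job):
        # The publisher gets nothing back: it never learns who takes the job.
        self._jobs.append(job)

    def consume(self):
        # Whichever consumer asks first receives the next job.
        return self._jobs.popleft() if self._jobs else None


class Builder:
    """Stand-in for a build machine that pulls work from the queue."""

    def __init__(self, name):
        self.name = name
        self.running = []

    def poll(self, queue):
        job = queue.consume()
        if job is not None:
            self.running.append(job)
        return job


queue = WorkQueue()
builders = [Builder("builder-a"), Builder("builder-b")]

# Coordinator side: publish a job (placeholder PR number and attribute).
job = {"pr": 1, "attr": "hello"}
publish_result = queue.publish(job)

# Builder side: one of the machines happens to pick the job up.
builders[1].poll(queue)

# The only way for the coordinator to find the job is to ask every builder.
owners = [b.name for b in builders if job in b.running]
```

In this model, telling the right machine about a changed timeout would require either a reply from the consumer that accepted the job or a broadcast to all builders.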
In reply to @cole-h:matrix.org* ofborg-core being the php code under ./php? as that seems to be the thing that sits between hooks and rabbitmq | 19:05:17 | |
Not solely that. The core machine also runs most of these binaries (that aren't build or eval related): https://github.com/NixOS/ofborg/tree/released/ofborg/src/bin | 19:06:39 | |
| 1 Oct 2023 | ||
| disk full https://github.com/NixOS/nixpkgs/pull/258395/checks?check_run_id=17292775692 | 14:39:14 | |
| https://ofborg.org/prometheus/alerts hmm | 14:41:55 | |
| don't think we monitor the darwin machines in prometheus | 14:43:41 | |
In reply to @hexa:lossy.network We do, but I bet they don't push that metric | 14:45:15 | |
| https://ofborg.org/prometheus/graph?g0.expr=node_os_info&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h | 14:45:36 | |
| I don't think we do | 14:46:07 | |
| Yeah looks like they are only sending ofborg metrics | 14:46:19 | |
| No system metrics | 14:46:23 | |
| oh | 14:46:30 | |
| where did you find a darwin metric? | 14:46:51 | |
| Actually the metrics are only sent from the central queue manager. So yeah they aren't sending anything | 14:47:18 | |
| (for darwin build info) | 14:47:25 | |
| You're right | 14:47:29 | |
In reply to @cafkafk:gitter.im Should be fixed now. | 14:56:40 | |
| Any chance we could run the prometheus agent on darwin, cole-h? 🥺 | 14:57:27 | |
| We do, I guess they're just not scraped | 14:58:43 | |
| OK, they're scraped now. | 15:14:17 | |
| the node_filesystem_free metric that is used in the alerting rule does not exist | 15:31:28 | |
| (image attachment: image.png) | 15:31:52 | |
| this one | 15:31:53 | |
| 🔥🐶🔥 | 15:32:09 | |
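A rule that references a metric the server never receives simply never fires, so a check like the following can catch it early. This is a minimal sketch with illustrative data, not the actual ofborg rule file or metric list; `check_rule_metrics` is a hypothetical helper. In practice the list of available names could be taken from Prometheus's `/api/v1/label/__name__/values` endpoint. One possible cause, not confirmed in the chat: node_exporter 0.16.0 and later expose this value as `node_filesystem_free_bytes`.

```python
import difflib


def check_rule_metrics(rule_metrics, available_metrics):
    """Report metrics a rule needs that the server does not have.

    Returns a dict mapping each missing name to the closest available
    name, or to None when nothing similar exists.
    """
    available = sorted(set(available_metrics))
    report = {}
    for name in rule_metrics:
        if name in available:
            continue
        close = difflib.get_close_matches(name, available, n=1)
        report[name] = close[0] if close else None
    return report


# Illustrative metric names only; the real server's list may differ.
available = [
    "node_filesystem_avail_bytes",
    "node_filesystem_free_bytes",
    "node_os_info",
]

report = check_rule_metrics(["node_filesystem_free"], available)
```

An empty report means every metric named by the rule is present; a non-empty one lists each missing name with a suggested replacement.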