| 29 Sep 2023 |
@asymmetric:matrix.dapp.org.uk | [In reply to @cole-h:matrix.org: My point with #2 is that (as far as I know), the coordinator doesn't know which machine took the job, so how should it determine which build machine to inform about the change? (I think the simplest depiction of how it all works, from my knowledge, is GitHub webhook -> ofborg-core (coordinator) -> AMQP server <-> ofborg-eval-X (evaluator and builder)) But the "comment in PR body" could work, if ofborg is notified about that information (it must be, in some roundabout way, because we have access to the PR number)......] * ofborg-core being the php code under ./php? as that seems to be the thing that sits between hooks and rabbitmq | 19:05:17 |
cole-h | Not solely that. The core machine also runs most of these binaries (that aren't build or eval related): https://github.com/NixOS/ofborg/tree/released/ofborg/src/bin | 19:06:39 |
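The flow described above (GitHub webhook -> ofborg-core -> AMQP server <-> builders) can be illustrated with a toy in-memory work queue. This is only a sketch of the general work-queue pattern, not ofborg's code: `ToyQueue`, `publish`, `consume`, the builder names, and the PR number are all hypothetical, and a real AMQP broker does far more. It shows why a publisher cannot tell which worker took a job: publishing returns nothing about the eventual consumer.

```python
from collections import deque


class ToyQueue:
    """Hypothetical stand-in for a broker queue; not a real AMQP client."""

    def __init__(self):
        self._jobs = deque()

    def publish(self, job):
        # The publisher only hands the job to the queue. Nothing comes back
        # that identifies which worker will eventually pick it up.
        self._jobs.append(job)

    def consume(self, worker_name):
        # A worker pulls the next job, if any. Only the worker and the queue
        # see this pairing; the publisher is not involved.
        if not self._jobs:
            return None
        return (worker_name, self._jobs.popleft())


queue = ToyQueue()
publish_result = queue.publish({"pr": 1, "action": "build"})  # made-up job

# Two hypothetical builders pull from the same queue.
taken = queue.consume("builder-a")
nothing_left = queue.consume("builder-b")
```

Under this model, reaching a specific builder afterwards would need either the worker reporting back which job it took, or a follow-up message that every worker receives.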
| 1 Oct 2023 |
| cafkafk joined the room. | 14:39:06 |
cafkafk | disk full https://github.com/NixOS/nixpkgs/pull/258395/checks?check_run_id=17292775692 | 14:39:14 |
hexa | https://ofborg.org/prometheus/alerts hmm | 14:41:55 |
hexa | don't think we monitor the darwin machines in prometheus | 14:43:41 |
Lily Foster | [In reply to @hexa:lossy.network: don't think we monitor the darwin machines in prometheus] We do, but I bet they don't push that metric | 14:45:15 |
hexa | https://ofborg.org/prometheus/graph?g0.expr=node_os_info&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h | 14:45:36 |
hexa | I don't think we do | 14:46:07 |
Lily Foster | Yeah looks like they are only sending ofborg metrics | 14:46:19 |
Lily Foster | No system metrics | 14:46:23 |
hexa | oh | 14:46:30 |
hexa | where did you find a darwin metric? | 14:46:51 |
Lily Foster | Actually the metrics are only sent from the central queue manager. So yeah they aren't sending anything | 14:47:18 |
Lily Foster | (for darwin build info) | 14:47:25 |
Lily Foster | You're right | 14:47:29 |
| pbsds joined the room. | 14:49:17 |
cole-h | [In reply to @cafkafk:gitter.im: disk full https://github.com/NixOS/nixpkgs/pull/258395/checks?check_run_id=17292775692] Should be fixed now. | 14:56:40 |
Lily Foster | Any chance we could run the prometheus agent on darwin cole-h? 🥺 | 14:57:27 |
cole-h | We do, I guess they're just not scraped | 14:58:43 |
cole-h | curl -ss http://...:9100/metrics | head
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.9458e-05
go_gc_duration_seconds{quantile="0.25"} 5.4583e-05
go_gc_duration_seconds{quantile="0.5"} 6.6707e-05
go_gc_duration_seconds{quantile="0.75"} 9.1292e-05
go_gc_duration_seconds{quantile="1"} 0.000423
go_gc_duration_seconds_sum 251.971410887
go_gc_duration_seconds_count 115795
# HELP go_goroutines Number of goroutines that currently exist.
| 15:00:39 |
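A quick way to check whether an exporter exposes a given metric, such as one an alerting rule depends on, is to collect the metric names from output like the above. The sketch below is a simplified reader for the Prometheus text format, not a full parser; `metric_names` is a hypothetical helper, and the sample text is copied from the `curl` output.

```python
def metric_names(exposition_text):
    """Collect metric names from Prometheus text-format output (simplified)."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{labels} value` or `name value`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


sample = """\
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.9458e-05
go_gc_duration_seconds{quantile="1"} 0.000423
go_gc_duration_seconds_sum 251.971410887
go_gc_duration_seconds_count 115795
"""

names = metric_names(sample)
has_gc_sum = "go_gc_duration_seconds_sum" in names
has_fs_free = "node_filesystem_free" in names
```

Against a live exporter, the same function could be fed the full body of the `/metrics` response instead of the truncated sample.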
cole-h | OK, they're scraped now. | 15:14:17 |
hexa | the node_filesystem_free metric that is used in the alerting rule does not exist | 15:31:28 |
hexa | [image: image.png] | 15:31:52 |
hexa | this one | 15:31:53 |
cole-h | 🔥🐶🔥 | 15:32:09 |
hexa | should probably use avail_bytes | 15:32:17 |
hexa | or free_bytes | 15:32:34 |
hexa | heck, what is the difference? | 15:32:40 |
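On the difference: in node_exporter, `node_filesystem_free_bytes` is the total free space, including blocks reserved for root, while `node_filesystem_avail_bytes` is the space available to non-root users. The same distinction shows up in `statvfs` as `f_bfree` versus `f_bavail`. A minimal sketch, assuming a Unix system and using `/` purely as an example path (`free_and_avail_bytes` is a hypothetical helper):

```python
import os


def free_and_avail_bytes(path="/"):
    """Return (free, avail) in bytes for the filesystem containing `path`."""
    st = os.statvfs(path)
    # f_bfree: free blocks, including those reserved for the superuser.
    free_bytes = st.f_bfree * st.f_frsize
    # f_bavail: free blocks available to unprivileged users.
    avail_bytes = st.f_bavail * st.f_frsize
    return free_bytes, avail_bytes


free_bytes, avail_bytes = free_and_avail_bytes("/")
print("free:", free_bytes, "avail:", avail_bytes)
```

For a disk-full alert, the avail figure is usually the more relevant one, since ordinary build processes typically cannot use the reserved blocks.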