OfBorg | 156 Members | |
| Number of builds and evals in queue: <TBD> | 57 Servers |
| Sender | Message | Time |
|---|---|---|
| 29 Sep 2023 | ||
In reply to @7c6f434c:nitro.chat lib.isPureEval | 14:33:37 | |
| * `lib.inPureEvalMode` | 14:34:49 | |
| Fortunately, the existence of stable branches would make tying to the details of ofBorg deployments just too painful and impractical… | 14:38:34 | |
| Stable branches or not, it's a bad and frail idea | 14:41:41 | |
| Please don't seriously consider this in any capacity.... | 14:42:19 | |
| My problem with increasing the ofborg timeout on darwin is that the darwin builders are pretty slow as it is. I worry that that would cause the darwin queue to blow up (as has happened in the past even without a longer timeout). It would be interesting to explore an "ofborgWillTimeoutOnTheseSystems" predicate, though. | 15:40:09 | |
| cole-h: What about a dynamic approach: When the queue is too long, time out the longest-running job until it's short enough again | 15:58:15 | |
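A minimal sketch of the dynamic approach suggested above, assuming the queue length is observable at all and that cancelling one running job frees a builder for exactly one queued job. `RunningJob`, `jobs_to_time_out`, and `max_queued` are hypothetical names for illustration, not part of ofborg.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunningJob:
    """A build currently occupying a builder (hypothetical model)."""
    name: str
    elapsed_seconds: int


def jobs_to_time_out(running: List[RunningJob], queued: int, max_queued: int) -> List[RunningJob]:
    """Pick the running jobs to cancel, longest-running first.

    Assumption: each cancelled job frees one builder, which then takes
    one job off the queue, so we cancel one job per excess queue entry.
    """
    excess = max(0, queued - max_queued)
    by_age = sorted(running, key=lambda job: job.elapsed_seconds, reverse=True)
    return by_age[:excess]
```

With a short queue nothing is cancelled, so long builds only lose out when other jobs are actually waiting.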
* could we add another command to ofborg, so that one could do @ofborg set timeout 2h polkadot aarch64-darwin or something, on a case by case basis, as a github comment on the pr of a specific package? this would at least be a stopgap, and would mean we don't have to "pollute" meta with ofborg-specific attributes | 16:00:31 | |
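To make the proposal concrete, here is a rough sketch of how such a comment could be parsed. The `set timeout` command does not exist in ofborg; the grammar (duration, then attribute, then system), the `s`/`m`/`h` unit suffixes, and the names `TimeoutRequest` and `parse_timeout_comment` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeoutRequest:
    """Parsed form of the hypothetical `@ofborg set timeout` command."""
    seconds: int
    attr: str
    system: str


_UNITS = {"s": 1, "m": 60, "h": 3600}

# Hypothetical grammar: @ofborg set timeout <N><s|m|h> <attr> <system>
_PATTERN = re.compile(
    r"@ofborg\s+set\s+timeout\s+(\d+)([smh])\s+(\S+)\s+(\S+)",
    re.IGNORECASE,
)


def parse_timeout_comment(comment: str) -> Optional[TimeoutRequest]:
    """Return the request found in a comment body, or None if there is none."""
    match = _PATTERN.search(comment)
    if match is None:
        return None
    amount, unit, attr, system = match.groups()
    return TimeoutRequest(int(amount) * _UNITS[unit.lower()], attr, system)
```

For the example from the message, `parse_timeout_comment("@ofborg set timeout 2h polkadot aarch64-darwin")` yields a request for 7200 seconds on `polkadot` for `aarch64-darwin`.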
In reply to @infinisil:matrix.org As far as I know, RabbitMQ (what ofborg uses) is a "dumb" queue system. I don't know if we get information about the queue aside from the fact that there's a job we can take, and then communicating that we succeeded a job... (I'm not all that familiar with RabbitMQ, however) | 16:01:32 | |
In reply to @asymmetric:matrix.dapp.org.uk * An interesting idea, but I think the problem with that is that 1) the timeout is set for the nix-build command (so there's no way to change it once the build has started running); and 2) the machine that processes comment commands is separate from the machines that run those commands (builds, evals, etc)... | 16:03:38 | |
In reply to @cole-h:matrix.org yeah i thought that we could either:
i'm not sure i understand the implications of your point 2) though -- couldn't the comments-listening machine direct the building-machine? | 16:06:12 | |
| My point with #2 is that (as far as I know), the coordinator doesn't know which machine took the job, so how should it determine which build machine to inform about the change? (I think the simplest depiction of how it all works, from my knowledge, is GitHub webhook -> ofborg-core (coordinator) -> AMQP server <-> ofborg-eval-X (evaluator and builder)) But the "comment in PR body" could work, if ofborg is notified about that information (it must be, in some roundabout way, because we have access to the PR number)...... | 16:10:11 | |
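The point about the coordinator can be illustrated with a toy work-queue model. This is not ofborg or RabbitMQ code: `WorkQueue`, `Coordinator`, and `Builder` are hypothetical stand-ins for the AMQP server, ofborg-core, and the ofborg-eval-X machines, and it only mirrors the flow as described in the message above.

```python
from collections import deque
from typing import Deque, Optional


class WorkQueue:
    """Stand-in for the AMQP server: jobs go in, any consumer may take one."""

    def __init__(self) -> None:
        self._jobs: Deque[dict] = deque()

    def publish(self, job: dict) -> None:
        self._jobs.append(job)

    def take(self) -> Optional[dict]:
        return self._jobs.popleft() if self._jobs else None


class Coordinator:
    """Stand-in for ofborg-core: turns a webhook into a queued job."""

    def __init__(self, queue: WorkQueue) -> None:
        self.queue = queue

    def on_webhook(self, pr_number: int, command: str) -> dict:
        job = {"pr": pr_number, "command": command}
        self.queue.publish(job)
        # Nothing comes back here: the coordinator never learns who takes the job.
        return job


class Builder:
    """Stand-in for a build machine: pulls whatever job is next."""

    def __init__(self, name: str, queue: WorkQueue) -> None:
        self.name = name
        self.queue = queue
        self.current: Optional[dict] = None

    def poll(self) -> Optional[dict]:
        job = self.queue.take()
        if job is not None:
            self.current = job
        return job
```

In this model the job record carries the PR number but no builder identity, so a later "change the timeout" message has no address to go to unless builders report back or the setting travels with the job itself.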
In reply to @cole-h:matrix.org * ofborg-core being the php code under ./php? as that seems to be the thing that sits between hooks and rabbitmq | 19:05:17 | |
Not solely that. The core machine also runs most of these binaries (that aren't build or eval related): https://github.com/NixOS/ofborg/tree/released/ofborg/src/bin | 19:06:39 | |
| 1 Oct 2023 | ||
| disk full https://github.com/NixOS/nixpkgs/pull/258395/checks?check_run_id=17292775692 | 14:39:14 | |
| https://ofborg.org/prometheus/alerts hmm | 14:41:55 | |
| don't think we monitor the darwin machines in prometheus | 14:43:41 | |
In reply to @hexa:lossy.network We do, but I bet they don't push that metric | 14:45:15 | |
| https://ofborg.org/prometheus/graph?g0.expr=node_os_info&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h | 14:45:36 | |
| I don't think we do 😄 | 14:46:07 | |
| Yeah looks like they are only sending ofborg metrics | 14:46:19 | |
| No system metrics | 14:46:23 | |
| oh | 14:46:30 | |
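The check done by hand above could be sketched as follows, assuming two responses in the JSON shape returned by Prometheus's `/api/v1/query` endpoint: one for any metric the machines are known to send, and one for `node_os_info`. The function names are hypothetical, and the sketch deliberately does no network access.

```python
from typing import List, Set


def instances(response: dict) -> Set[str]:
    """Collect `instance` label values from an instant-query response."""
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    return {
        sample["metric"]["instance"]
        for sample in response["data"]["result"]
        if "instance" in sample["metric"]
    }


def missing_system_metrics(known_response: dict, node_response: dict) -> List[str]:
    """Instances that report the known metric but not the node metric."""
    return sorted(instances(known_response) - instances(node_response))
```

Any instance in the returned list is visible to Prometheus but sends no system metrics, so a disk-full alert could never fire for it.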