| 25 Jan 2025 |
zowoq | No, the buildbot badge plugin doesn't work with buildbot-nix. | 01:26:37 |
Roberto Abdelkader Martínez Pérez | Ah, got it, thanks for the info! | 01:34:11 |
Roberto Abdelkader Martínez Pérez | Sorry to ask again, but I’d like to enable Mergify for my repository and I'm not sure if it's already enabled organization-wide. Do I just need to add a .mergify.yml file to the root of my repo, or do the admins need to perform any additional setup? | 02:19:32 |
zowoq | Only needs to be approved by an admin, it doesn't require any additional setup. We can't enable it org wide as it adds status checks to every repo. | 02:28:06 |
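[Editor's note: for readers unfamiliar with Mergify, the per-repo setup mentioned above is driven by a `.mergify.yml` at the repository root. The sketch below is a hypothetical minimal configuration; the queue name and the `check-success` status are illustrative placeholders, not the actual checks used by autofirma-nix.]

```yaml
# Hypothetical minimal .mergify.yml (placeholder check names)
queue_rules:
  - name: default
    merge_conditions:
      - check-success=buildbot/nix-build   # assumed CI status name

pull_request_rules:
  - name: merge approved PRs via the queue
    conditions:
      - "#approved-reviews-by>=1"
    actions:
      queue:
        name: default
```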
Roberto Abdelkader Martínez Pérez | zowoq: Could you please enable Mergify for my repository autofirma-nix? Thanks! | 02:36:01 |
zowoq | Done. | 02:37:40 |
Roberto Abdelkader Martínez Pérez | (attachment: image.png) | 03:04:08 |
Roberto Abdelkader Martínez Pérez | zowoq: Thanks for enabling Mergify. However, Merge Queue and Workflow Automation are still disabled (see attached screenshot). Could you please enable both? | 03:04:33 |
zowoq | Enabled. Didn't know those were options and that they aren't the default. | 04:12:01 |
Roberto Abdelkader Martínez Pérez | Thank you! | 10:50:57 |
SomeoneSerge (back on matrix) | Yes, although I'm now reflecting on whether runtime tests (which we need the GPU instance for) should be the first priority if the goal is to ensure stability of GPU-accelerated software. Now that there's a community-funded hydra we can be sure to observe build failures retrospectively, and the public substituter helps with iteration times - it's a big difference compared to the previous state of affairs. Adding the GPU instance, while it sounds very cool, would further increase visibility but only for discovering breakages that already happened. Arguably, we'd get a much bigger impact by focusing on integration with the forge now. There are two ways we could integrate with the forge: channel-blocking and OfBorg-like. The former is mostly about working hours: we maintain our own channel, or we enhance the logic that advances the channels so that instead of being triggered by one jobset in the official hydra it could also get a report from an external source, namely the community hydra. The latter, like runtime tests, is about resources: we'd need a CI that can react to on-push events in Nixpkgs to evaluate and build stuff with {cuda,rocm}Support, and we'd need a github action to fetch the report. We don't even need to block PRs, we just need a linting feature that would inform authors that their change also affects the GPU variants of nixpkgs and that they could maybe ping the responsible team. | 14:53:46 |
SomeoneSerge (back on matrix) | I imagine that on-push jobs would be a lot more pressure, but as long as we can reasonably argue that this is the right platform to build this kind of CI I think there are a few more sources we can attract to the opencollective | 14:59:23 |
| connor (burnt/out) (UTC-8) joined the room. | 15:32:46 |
| @luxzi:matrix.org changed their display name from luxzi (they/she) to luxzi (she/they). | 20:01:16 |
| 26 Jan 2025 |
emily | Gaétan Lepage: load average on the 10-core Darwin builder is 21.58, trying to fix stdenv | 21:17:55 |
emily | Ihar Hrachyshka: are you running any builds? | 21:21:48 |
Ihar Hrachyshka | emily: I'm investigating the llama-cpp-python issue, yes. Any issue with that? | 21:23:01 |
emily | the box's load average is ~2× the number of cores so it's pretty overloaded right now. I was going to fire off builds to test fixes for Darwin stdenv for the next staging-next cycle (due in ~2 days), and to try and reproduce the treewide Rust FOD hash replacement and check for any Darwin-specific issues, but it's struggling even as it is | 21:24:09 |
emily | might be good to lower cores/max-jobs, though I don't know if it's already overloaded from other builds | 21:24:55 |
Ihar Hrachyshka | emily: ok sorry, I'm obtuse :) you want me to hold off for now? that's fine, I can do something else. | 21:24:59 |
emily | could you just try with lower cores/max-jobs settings maybe? | 21:25:21 |
emily | the default is 10 cores + 10 jobs, which can mean up to ~100 threads on a 10-core processor | 21:25:39 |
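[Editor's note: the worst-case arithmetic in the message above can be sketched as follows, using the figures emily quotes.]

```python
# Worst-case thread estimate when Nix runs `max-jobs` builds concurrently,
# each allowed up to `cores` threads (figures from the message above).
cores = 10      # Nix `cores` setting
max_jobs = 10   # Nix `max-jobs` setting
worst_case_threads = cores * max_jobs
print(worst_case_threads)  # ~100 threads on a 10-core machine
```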
emily | the stdenv build will take hours anyway so it's not urgent, but I don't want to throw more jobs at an already overloaded machine | 21:26:08 |
Ihar Hrachyshka | it's 1 job for me. how do I limit the cores? is there a universal recipe, or do I patch cmake files? | 21:26:25 |
emily | it's --option cores and --option max-jobs in Nix (-j for short on the latter) | 21:28:41 |
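[Editor's note: the two knobs emily names can also be set persistently in a per-user nix.conf. This is a minimal sketch; the values are illustrative for a shared 10-core box, not a recommendation.]

```ini
# ~/.config/nix/nix.conf -- per-user override (illustrative values)
cores = 4      # threads handed to each individual build
max-jobs = 1   # derivations built concurrently
```

The equivalent one-off invocation would be e.g. `nix-build --option cores 4 -j1`, where `-j` abbreviates `--max-jobs`.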
emily | that said there are like 6 Python processes using up a huge amount of CPU so if you're only building one package I'm not sure it's the problem :) | 21:29:21 |
Ihar Hrachyshka | I think I know which package it is lol | 21:31:43 |
Ihar Hrachyshka | jax pretty sure | 21:31:48 |
Ihar Hrachyshka | you should probably ask Gaétan Lepage; I believe he was looking at this derivation lately, and we had some convos around the memory hogging / test timeouts there | 21:32:27 |
emily | ideally the builder would not be unusably overloaded every couple days :( | 21:33:28 |