Hydra
| Message | Time |
|---|---|
| 23 Sep 2025 | |
| not sure how to provide better debugging things | 07:25:43 |
| Hi, I have a short question where I did not find a good solution yet, and maybe someone here had a similar "issue". I am running my own Hydra instance to CI-check my system configs (defined as a flake) and update the flake lock file. I achieve that so far by defining a releaseTools.aggregate and a runCommand for the aggregate job. That works like a charm. As the flake is multi-arch and I have another (more powerful) machine, I have defined some remote builders (which also work fine). But sometimes the aggregate job is scheduled onto one of the remote builders, which then receives the closures of all hosts (as far as I can tell). Is there a way to pin the aggregate to a specific builder (best case the Hydra instance itself)? I tried preferLocal, which seems to have no effect, and I tried to pass custom system features (so that only the Hydra instance would be capable of running the job), but that does not work because aggregate does not accept requiredSystemFeatures. Does someone maybe have an idea how to pin it? | 08:42:51 |
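(A minimal way to check whether those attributes actually made it into the aggregate derivation, sketched below; the flake attribute path is an assumed example, not taken from the chat.)

```bash
# Sketch: inspect the aggregate's .drv and see whether preferLocalBuild or
# requiredSystemFeatures ended up in its environment. ".#hydraJobs.aggregate"
# is a hypothetical attribute path; on older Nix use `nix show-derivation`.
nix derivation show .#hydraJobs.aggregate \
  | jq '.[].env | {preferLocalBuild, requiredSystemFeatures}'
```

If `preferLocalBuild` never shows up there, the attribute is probably being dropped by the aggregate wrapper before it reaches the derivation.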
| Just tried the following combination, which I think worked earlier but no longer does… | 13:29:52 |
| and still the same error | 13:31:26 | |
| @sandro:supersandro.de: ah ok thanks that's very interesting | 22:26:10 | |
| I'll try to upgrade nix and nix eval jobs separately also | 22:26:22 | |
| 24 Sep 2025 | |
| 26 Sep 2025 | |
| Hmm, weird. This command stopped working recently | 07:03:25 |
| But without the Accept header it still fetches the HTML, which is just harder to work with. | 07:03:56 |
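(The failing request was presumably something along these lines; the host and job path below are placeholders, not the actual command from the chat.)

```bash
# Illustrative only: ask a Hydra endpoint for JSON instead of HTML by sending
# an Accept header. Host and job path are made-up examples.
curl -sL -H "Accept: application/json" \
  "https://hydra.example.org/job/myproject/main/aggregate/latest" | jq .
```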
| Seems to work for me 🤔 | 17:40:00 | |
| 27 Sep 2025 | |
| Now it works for me as well. | 06:55:53 | |
| 28 Sep 2025 | |
| @joerg:thalheim.io: so I think I know how I want to debug it: | 12:17:19 | |
| 1. Separate debug output, we just have that on Nix and keep it working in CI | 12:17:42 | |
| 2. Make the builder sleep for a long time | 12:17:55 | |
| 3. Attach debugger to nix while builder is sleeping, then trace to the part where it kills the builder | 12:18:56 |
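(A rough sketch of what step 3 could look like; the process chosen and the gdb commands are assumptions, not details from the chat.)

```bash
# While the builder sleeps, attach gdb to the Nix process that supervises it
# and stop at the point where it sends a signal to the child.
pid=$(pgrep -xn nix-daemon)   # or the PID of the local `nix build` process
sudo gdb -p "$pid" \
  -ex 'catch syscall kill' \
  -ex 'continue'
# once the catchpoint triggers, `bt` shows the code path that kills the builder
```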
| Is nix a separate process in this case? | 16:08:22 | |
| There are also tricks where a function spawns gdb to itself. Also I would use gdbstub here | 16:09:16 |
| At a quick glance, maybe you'd consider using rr to record an execution and then inspect that recording (which happens in a gdb interface again). | 17:16:52 |
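(A minimal sketch of the rr suggestion; the nix invocation is a placeholder.)

```bash
# Record one failing run, then replay it; `rr replay` opens the usual gdb
# interface on the recording, so breakpoints and reverse-continue both work.
# Note: with a multi-user install the build itself runs inside nix-daemon,
# so the daemon may be the process that actually needs recording.
rr record nix build .#hydraJobs.aggregate
rr replay   # replays the latest recording under gdb
```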
| In reply to @joerg:thalheim.io: With the ssh://localhost yes | 23:11:04 |
| Could use ?remote-program=... to run GDB hah | 23:11:49 | |
| In reply to @vcunat:matrix.org: Yeah that might work well with the remote-program thing | 23:12:15 |
| 29 Sep 2025 | |
| John Ericson: we need to make the call soon. I would prefer to get rid of nix 2.29 in nixpkgs soon to not have to maintain it for another release. | 06:27:25 | |
| 30 Sep 2025 | |
| Apparently it's flaky. Sometimes I get {"status":"unknown"}, sometimes a proper reply. | 09:13:26 |
| that happens when we cannot execute the status command | 10:58:18 | |
| should be fixed with the new queue runner when we expose the metrics directly rather than going through catalyst. that would require a specific proxyPass in nginx but should simplify everything and make it more stable | 10:59:35 | |
| 🤔 so now I tried to run the command manually on that particular machine, repeatedly. Sometimes, after the correct-looking JSON, it added … and exited with status 1. | 11:06:41 |
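(For reference, a guess at how to see that failure: the assumption below is that "the status command" is `hydra-queue-runner --status`, presumably what the Catalyst frontend invokes for its queue-runner status page; that name is not confirmed in the chat.)

```bash
# Run the assumed status command repeatedly and print its exit code, to make
# the intermittent trailing error and exit status 1 visible.
for i in $(seq 10); do
  hydra-queue-runner --status > /tmp/qr-status.json
  echo "run $i: exit=$? bytes=$(wc -c < /tmp/qr-status.json)"
done
```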