Hydra
| Message | Time |
|---|---|
| 23 Sep 2025 | |
| not sure how to provide better debugging things | 07:25:43 |
| Hi, I have a short question where I did not find a good solution yet, and maybe someone here had a similar "issue". I am running my own Hydra instance to CI-check my system configs (defined as a flake) and update the flake lock file. I achieve that so far by defining a releaseTools.aggregate and a runCommand for the aggregate job. That works like a charm. As the flake is multi-arch and I have another (more powerful) machine, I have defined some remote builders (which also work fine). But sometimes the aggregate job is scheduled onto one of the remote builders, which then receives the closures of all hosts (as far as I can tell). Is there a way to pin the aggregate to a specific builder (best case the Hydra instance itself)? I tried preferLocal, which seems to have no effect, and I tried to pass custom system features (so that only the Hydra instance would be capable of running the job), but that does not work because aggregate does not accept requiredSystemFeatures. Does someone maybe have an idea how to pin it? | 08:42:51 |
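(A minimal way to check whether those attributes actually made it into the aggregate derivation, sketched below; the flake attribute path is an assumed example, not taken from the chat.)

```bash
# Sketch: inspect the aggregate's .drv and see whether preferLocalBuild or
# requiredSystemFeatures ended up in its environment. ".#hydraJobs.aggregate"
# is a hypothetical attribute path; on older Nix use `nix show-derivation`.
nix derivation show .#hydraJobs.aggregate \
  | jq '.[].env | {preferLocalBuild, requiredSystemFeatures}'
```

If `preferLocalBuild` never shows up there, the attribute is probably being dropped by the aggregate wrapper before it reaches the derivation.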
| Just tried the following combination, which I think worked earlier but no longer does… | 13:29:52 |
| and still the same error | 13:31:26 | |
| @sandro:supersandro.de: ah ok thanks that's very interesting | 22:26:10 | |
| I'll try to upgrade nix and nix eval jobs separately also | 22:26:22 | |
| 24 Sep 2025 | |
| 26 Sep 2025 | |
| Hmm, weird. This command stopped working recently | 07:03:25 |
| But without the Accept header it still fetches the HTML, which is just harder to work with. | 07:03:56 |
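(The failing request was presumably something along these lines; the host and job path below are placeholders, not the actual command from the chat.)

```bash
# Illustrative only: ask a Hydra endpoint for JSON instead of HTML by sending
# an Accept header. Host and job path are made-up examples.
curl -sL -H "Accept: application/json" \
  "https://hydra.example.org/job/myproject/main/aggregate/latest" | jq .
```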
| Seems to work for me 🤔 | 17:40:00 | |
| 27 Sep 2025 | |
| Now it works for me as well. | 06:55:53 | |
| 28 Sep 2025 | |
| @joerg:thalheim.io: so I think I know how I want to debug it: | 12:17:19 | |
| 1. Separate debug output, we just have that on Nix and keep it working in CI | 12:17:42 | |
| 2. Make the builder sleep for a long time | 12:17:55 | |
| 3. Attach debugger to nix while builder is sleeping, then trace to the part where it kills the builder | 12:18:56 |
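(A rough sketch of what step 3 could look like; the process chosen and the gdb commands are assumptions, not details from the chat.)

```bash
# While the builder sleeps, attach gdb to the Nix process that supervises it
# and stop at the point where it sends a signal to the child.
pid=$(pgrep -xn nix-daemon)   # or the PID of the local `nix build` process
sudo gdb -p "$pid" \
  -ex 'catch syscall kill' \
  -ex 'continue'
# once the catchpoint triggers, `bt` shows the code path that kills the builder
```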
| Is nix a separate process in this case? | 16:08:22 | |
| There are also tricks where a function spawns gdb to itself. Also I would use gdbstub here | 16:09:16 |
| At a quick glance, maybe you'd consider using rr to record an execution and then inspect that recording (which happens in a gdb interface again). | 17:16:52 |
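(A minimal sketch of the rr suggestion; the nix invocation is a placeholder.)

```bash
# Record one failing run, then replay it; `rr replay` opens the usual gdb
# interface on the recording, so breakpoints and reverse-continue both work.
# Note: with a multi-user install the build itself runs inside nix-daemon,
# so the daemon may be the process that actually needs recording.
rr record nix build .#hydraJobs.aggregate
rr replay   # replays the latest recording under gdb
```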
| In reply to @joerg:thalheim.io: With the ssh://localhost yes | 23:11:04 |
| Could use ?remote-program=... to run GDB hah | 23:11:49 | |
| In reply to @vcunat:matrix.org: Yeah that might work well with the remote-program thing | 23:12:15 |
| 29 Sep 2025 | |
| John Ericson: we need to make the call soon. I would prefer to get rid of nix 2.29 in nixpkgs soon to not have to maintain it for another release. | 06:27:25 | |
| 30 Sep 2025 | |
| Apparently it's flaky. Sometimes I get {"status":"unknown"}, sometimes a proper reply. | 09:13:26 |
| that happens when we cannot execute the status command | 10:58:18 | |
| should be fixed with the new queue runner when we expose the metrics directly rather than going through catalyst. that would require a specific proxyPass in nginx but should simplify everything and make it more stable | 10:59:35 | |
| 🤔 so now I tried to run the command manually on that particular machine, repeatedly. Sometimes, after the correct-looking JSON, it added … and exited with status 1. | 11:06:41 |
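(For reference, a guess at how to see that failure: the assumption below is that "the status command" is `hydra-queue-runner --status`, presumably what the Catalyst frontend invokes for its queue-runner status page; that name is not confirmed in the chat.)

```bash
# Run the assumed status command repeatedly and print its exit code, to make
# the intermittent trailing error and exit status 1 visible.
for i in $(seq 10); do
  hydra-queue-runner --status > /tmp/qr-status.json
  echo "run $i: exit=$? bytes=$(wc -c < /tmp/qr-status.json)"
done
```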