!zghijEASpYQWYFzriI:nixos.org

Hydra

371 Members
109 Servers



23 Feb 2025
@Ericson2314:matrix.org (John Ericson) @hexa:lossy.network: thanks Hexa! That's just the one I thought it would be 07:07:02
24 Feb 2025
@hacker1024:matrix.org (hacker1024)

We have an x86_64 machine running Hydra and an aarch64 builder. On recent versions of Nix/Hydra (I've tried the one with Nix 2.25 pre-LegacySSHStore and Nix 2.26 post-LegacySSHStore), it looks like x86_64 jobs that depend on outputs built on the aarch64 machine (e.g. deployment scripts that use aarch64 system closures) are getting stuck indefinitely on "Sending inputs", and cannot even be cancelled.

Nothing significant seems to be getting logged on either machine.

Pure x86_64 and aarch64 jobs are still fine.

Has anyone had this too?

Edit: Made issue

05:21:48
@shawn8901:matrix.org (shawn8901)

Hi, for some time now I have been getting the following error on my Hydra instance.

Every time it runs an evaluation, it aborts with the following error in the log:

Feb 24 17:00:38 tank nix-daemon[1579]: accepted connection from pid 6217, user hydra
Feb 24 17:00:38 tank hydra-evaluator[2106]: (config:.jobsets) Evaluating...
Feb 24 17:00:38 tank hydra-evaluator[2106]: error: stoi
Feb 24 17:00:38 tank hydra-evaluator[2106]: {UNKNOWN}: process ended prematurely at /nix/store/kvnp4qdk6bcg9j0pc8d87dgz6z5qklhl-hydra-0-unstable-2025-02-18/bin/.hydra-eval-jobset-wrapped line 404. at /nix/store/i85ni9bphygj6d31v68x24ncvhbc2vn6-hydra-perl-deps/lib/perl5/site_perl/5.40.0/Catalyst/Model/DBIC/Schema.pm line 526
Feb 24 17:00:38 tank hydra-evaluator[2078]: evaluation of jobset ‘config:.jobsets (jobset#1)’ failed with exit code 1

I'm kind of out of ideas; the web server runs fine.
I did play around a bit with the init system (switched from scripted to systemd), which, if I remember correctly, was around the same time frame.
I noticed a wrongly mapped uid for another service (which I fixed by chowning to the new id), but I did not find anything similar for Hydra.
I also tried reinstalling Hydra (nuking /var/lib/hydra and dropping the database).
None of that helped.

I found an old issue relating that error to memory, though that confuses me, as the machine has plenty of memory left unused and was able to run my Hydra builds before.
Does anyone have an idea how I can continue analyzing this issue?

edit: I found https://github.com/NixOS/hydra/issues/1437, which could be a similar thing; at least the time range fits, though I am seeing a different error text.

16:37:27
@shawn8901:matrix.org (shawn8901) Okay, I've traced it to the latest Hydra bump; when I revert it, everything works fine again.
Should I create an issue in nixpkgs, or rather on Hydra's GitHub?
20:18:34
25 Feb 2025
@hacker1024:matrix.org (hacker1024) There are only two places where nix-eval-jobs uses stoi. Have you set evaluator_workers or evaluator_max_memory_size in your Hydra configuration? 02:14:18
@shawn8901:matrix.org (shawn8901)
In reply to @hacker1024:matrix.org
There are only two places where nix-eval-jobs uses stoi. Have you set evaluator_workers or evaluator_max_memory_size in your Hydra configuration?
Yeah, I am limiting that. So stoi = out of memory?
06:25:39
@hacker1024:matrix.org (hacker1024) No, stoi is a function that parses an integer. What are the exact contents of that part of your config? 06:26:39
@shawn8901:matrix.org (shawn8901) evaluator_max_memory_size = ${toString (4 * 1024 * 1024 * 1024)}, which was totally fine previously 06:27:35
@shawn8901:matrix.org (shawn8901) And I am setting evaluator_workers = 2 06:28:15
@hacker1024:matrix.org (hacker1024) Hmm, yeah, that does seem fine 06:28:30
@hacker1024:matrix.org (hacker1024) Still, maybe try without it for a bit and see if that helps? 06:28:42
@hacker1024:matrix.org (hacker1024) Wait, that size is in MB though? You're allocating 4 PB 06:29:39
@hacker1024:matrix.org (hacker1024) That's also 2x the signed integer limit 06:30:53
@shawn8901:matrix.org (shawn8901) Is it? It should be 4 GB; at least that is where it limited before 06:31:04
@shawn8901:matrix.org (shawn8901) Yeah, maybe that is where it's choking 06:31:31
@shawn8901:matrix.org (shawn8901) I'll try it out when I am back at home; I just did not expect it to break in such a way from a bump of less than 2 weeks 😅 06:32:14
@hacker1024:matrix.org (hacker1024) The unit is definitely MB now, and I'm pretty sure it always has been 👀 https://github.com/nix-community/nix-eval-jobs/blob/4b392b284877d203ae262e16af269f702df036bc/src/eval-args.cc#L59 06:32:31
@hacker1024:matrix.org (hacker1024) So you'd just want 4096 06:32:41
@shawn8901:matrix.org (shawn8901) Hmm, the old Hydra did not complain about it 06:32:56
@shawn8901:matrix.org (shawn8901) The value has been unchanged for 8 months in my config 06:33:27
@shawn8901:matrix.org (shawn8901) I'll check that out and report back later, thank you very much for your help! 06:33:49
@hacker1024:matrix.org (hacker1024) No problem 06:33:56
@shawn8901:matrix.org (shawn8901) Okay, I guess I found where my initial confusion comes from. There are old log outputs where those values were printed in bytes, and a few years ago the evaluator max heap size in Hydra was also set in bytes (the environment variable that was passed understood that), so I just blindly assumed it was still bytes 🫣 The old Hydra then possibly just used the default (which is also 4 GB), and that's likely why it matched my observations 06:54:43
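[Editor's note] For reference, the corrected hydra.conf fragment implied by the discussion above — the unit of evaluator_max_memory_size is MB, so a 4 GiB per-worker limit is written as 4096:

```
evaluator_workers = 2
evaluator_max_memory_size = 4096
```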
26 Feb 2025
@shawn8901:matrix.org (shawn8901) It was that issue in the end. The new version works fine now. 🎉 04:49:57


