18 Jun 2024 |
shawn8901 | * Is there some way with hydra to handle part-time build machines? I want to add my desktop PC as an additional remote builder since it's much more powerful, though it's not running all the time. I tried that in the past, but hydra was failing jobs because the remote builder wasn't alive. I tried to look at the docs but did not find anything in that direction. | 12:14:55 |
cransom | I had a script that would discover alive machines and then add them to the build machines file. | 12:26:38 |
shawn8901 | So the buildMachines file is read before enqueuing? | 13:14:51 |
shawn8901 | Ah okay, it seems that there is a watcher. Hmm, maybe that's an idea: I could add it on boot and remove it on shutdown or so | 13:18:22 |
cransom | yeah, it's re-read before job distribution iirc. I used it to autoscale builders by checking for machines in an autoscaling group in AWS. | 14:23:49 |
vcunat | Yes, modifying the file takes effect immediately. (for any newly built steps) | 14:40:07 |
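The discover-and-rewrite approach cransom describes could be sketched like this (a hypothetical Python script, not cransom's actual one; the host names, SSH user, and max-jobs values are made up, and the real machines file supports more optional fields than shown here, see the Nix manual on remote builds):

```python
#!/usr/bin/env python3
# Sketch of cransom's idea: probe candidate builders and write only the
# reachable ones into the Nix machines file. Hosts, user, and job counts
# below are hypothetical placeholders.
import subprocess

# (host, ssh user, system type, max jobs) -- all assumptions for the sketch
CANDIDATES = [
    ("build1.example.org", "builder", "x86_64-linux", 8),
    ("desktop.example.org", "builder", "x86_64-linux", 16),  # the part-time machine
]

def is_alive(host, timeout_s=2):
    """True if the host answers one ICMP ping within the timeout."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ) == 0

def machines_lines(candidates, alive=is_alive):
    """One machines-file line per reachable builder.

    Field order here is URI, system, SSH key ('-' = default), max jobs;
    the real format has further optional fields that we omit.
    """
    return [
        f"ssh://{user}@{host} {system} - {jobs}"
        for host, user, system, jobs in candidates
        if alive(host)
    ]

if __name__ == "__main__":
    # Demonstrate with a stubbed reachability check so the sketch runs anywhere;
    # a real deployment would write the result to /etc/nix/machines instead.
    print("\n".join(machines_lines(
        CANDIDATES, alive=lambda host: host == "build1.example.org"
    )))
```

Run from cron or a systemd timer (or, as shawn8901 suggests, on boot/shutdown); since the file is re-read before job distribution, changes take effect without restarting hydra.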
20 Jun 2024 |
hexa | I have an issue with hydra where it tells me builders have possibly transient build failures | 11:59:39 |
hexa | [screenshot: image.png] | 11:59:41 |
hexa | that leads to these 93 jobs not being built | 11:59:53 |
hexa | I can push changes to the tracked branch and evaluate them, which results in the changed jobs being built, but these 93 jobs simply won't | 12:00:18 |
hexa | hydra is not really very verbose about what the issue at hand is | 12:00:35 |
hexa | hydra-queue-runner[301701]: possibly transient failure building ‘/nix/store/ksrvihi2zlm20qaqvxq0ryc4kkhyxjf8-grpcio-tools-1.64.1.tar.gz.drv’ on ‘hexa@build1.darmstadt.ccc.de’:
hydra-queue-runner[301701]: will retry ‘/nix/store/ksrvihi2zlm20qaqvxq0ryc4kkhyxjf8-grpcio-tools-1.64.1.tar.gz.drv’ after 66s
| 12:00:52 |
hexa | as an example | 12:00:54 |
hexa | hm, these are all FODs which have broken URLs | 12:04:24 |
vcunat | FODs are hard. Normally hydra won't create a new job unless the output hash changes. | 12:18:00 |
vcunat | But just changing URL won't change that. | 12:18:12 |
vcunat | So you're stuck with the old job tied to the incorrect .drv | 12:19:03 |
vcunat | As an ugly workaround, you could evaluate in a _different_ jobset, for example. That will get a new .drv (if you evaluate such commits right away), eventually getting you binaries; then you can just restart the failed jobs to pick up the cached results, and you'll get green. | 12:20:27 |
vcunat | (I know this because sometimes we run into this on hydra.nixos.org) | 12:21:09 |
vcunat | I'm not sure if there's a good solution. Maybe for FODs the equivalence on job creation should be on .drv hash and not output hash. (But we probably want to keep the old behavior for non-FODs.) | 12:23:03 |
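The job-keying behaviour vcunat describes can be illustrated with a toy model (purely illustrative, not Hydra's actual Perl/C++ code; the hashes and names are made up):

```python
# Toy model of vcunat's point: jobs for fixed-output derivations (FODs)
# are deduplicated by their declared output hash, so a .drv that only
# fixes the URL maps onto the same cached (failed) job. Illustrative only.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Drv:
    drv_hash: str               # hash of the derivation itself (changes with the URL)
    output_hash: Optional[str]  # declared fixed output hash for FODs, else None

def job_key(drv):
    """Observed behaviour: FODs are keyed by their fixed output hash."""
    return drv.output_hash if drv.output_hash is not None else drv.drv_hash

def job_key_proposed(drv):
    """vcunat's suggestion: key FODs by .drv hash so a URL fix yields a new job."""
    return drv.drv_hash

# A FOD whose URL was fixed: new .drv, same declared output hash (made-up hashes).
broken = Drv(drv_hash="ksrvihi2...", output_hash="sha256-abc")
fixed  = Drv(drv_hash="aaaaaaaa...", output_hash="sha256-abc")

assert job_key(broken) == job_key(fixed)                    # old failure stays cached
assert job_key_proposed(broken) != job_key_proposed(fixed)  # fix would be retried
```

Under the proposed keying, non-FOD derivations are unaffected since their job key is the .drv hash either way.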
vcunat | However, I don't know the codebase at all and I can't perl. | 12:24:40 |
vcunat | Job creation might not suffice, now that I think of it. There's also a mechanism for failure cached from another job, so same there. | 12:30:12 |
hexa | the old ones cleared up now | 13:02:58 |
hexa | I'm not sure if fixing the FODs in new evals did that, but that would be surprising to me | 13:03:17 |