!zghijEASpYQWYFzriI:nixos.org

Hydra




Sender | Message | Time
12 Dec 2024
@rhelmot:matrix.org (rhelmot) | I have the setting at 3 rn | 02:39:14
@rhelmot:matrix.org (rhelmot) | the question I originally stated maybe should have been "is it valid to have a gcroot which is a normal file and not a symlink" | 02:40:47
13 Dec 2024
@ctheune:matrix.flyingcircus.io (Christian Theune) | hexa: quick update from our end regarding hydra improvements we've planned/queued up (and which mostly ma27 will implement). there's a pr currently waiting to be merged to support globbing in named constituents that also adds logging for memory usage per job. we're currently switching our hydra to zstd compression and will then switch to introducing a hard memory limit. after that i think we'll take a look at moving compression to the workers. I've had one idea to look into making the compression an option on the jobset to allow more gradual changes if this is touched again in the future, but it seems like that might be complicated bordering fragile. and then we'll also take a look at some corner cases where stuck jobs need to be manually cancelled or even killed on the builder itself. as things are currently preparing for the holidays, it looks like most of that will likely happen in the new year. | 06:20:00
@vcunat:matrix.org (vcunat) | Sounds nice. At hydra.nixos.org we seem to have now avoided the compression bottleneck by brute force (48-core EPYC + hyperthreading). | 07:28:47
@ctheune:matrix.flyingcircus.io (Christian Theune) | Yeah, our immediate measurement was a reduction of (disk / channel) image compression time going down from 5 minutes to 7s, so that seems a big win. | 08:03:44
@ctheune:matrix.flyingcircus.io (Christian Theune) | Nevertheless I'm trying to keep an eye on things that happen in a blocking fashion in the queue runner. | 08:04:28
@ctheune:matrix.flyingcircus.io (Christian Theune) | because those will all be bottlenecks for scaling | 08:04:49
@ctheune:matrix.flyingcircus.io (Christian Theune) | i'm guessing s3 uploads being in a similar spot | 08:05:05
@ctheune:matrix.flyingcircus.io (Christian Theune) | if s3 uploads are also in the queue runner blocking things, then I'm wondering whether the uploads could also happen from the workers as long as hydra provides the signature. | 08:06:18
@ctheune:matrix.flyingcircus.io (Christian Theune) | from a security perspective i understand that we want to keep the signing key on the master | 08:06:29
@ctheune:matrix.flyingcircus.io (Christian Theune) | s3 upload credentials then aren't really /that/ sensitive compared to that we have to trust the content that the builders generate anyway. | 08:06:49
@vcunat:matrix.org (vcunat) | Signing itself is cheap, if you provide the hash to sign. The signer doesn't even need the whole NAR. (in principle) | 08:07:15
@ctheune:matrix.flyingcircus.io (Christian Theune) | ah, interesting. | 08:07:28
@ctheune:matrix.flyingcircus.io (Christian Theune) | that would mean we wouldn't even have to transfer the files to the master for that reason. | 08:07:43
@ctheune:matrix.flyingcircus.io (Christian Theune) | and the builder already has the closure and could upload | 08:07:53
@ctheune:matrix.flyingcircus.io (Christian Theune) | i'll keep that in mind when we take a look at moving the compression around | 08:08:04
@vcunat:matrix.org (vcunat) | Yes, that does sound like good architecture. | 08:08:17
@ctheune:matrix.flyingcircus.io (Christian Theune) | not sure whether it's good. it seems better than what it is now. 😉 | 08:08:37
@ctheune:matrix.flyingcircus.io (Christian Theune) | but yeah | 08:08:39
@vcunat:matrix.org (vcunat) | Though hydra.nixos.org is now blocked by loading jobs from DB. Probably the steps that check what's in S3 already. (it's overseas unfortunately so higher latency) | 08:08:54
@ctheune:matrix.flyingcircus.io (Christian Theune) | yeah i've read that. that part of the code/architecture i haven't looked at before and it's two steps further down the road on our map. | 08:10:14
@ctheune:matrix.flyingcircus.io (Christian Theune) | (our s3 is local and we have a much lower number of jobs anyway) | 08:10:40
@ctheune:matrix.flyingcircus.io (Christian Theune) | but yeah, happy to help in general, but need to be careful with my commitments ... | 08:11:01
@vcunat:matrix.org (vcunat) | Sure. I appreciate any kind of progress 🙂 | 08:11:52
@ctheune:matrix.flyingcircus.io (Christian Theune) | the martian is always right. one problem at a time. | 08:13:57
@7c6f434c:nitro.chat (7c6f434c) | If the builder sends just the hash to sign, this is not that far from having the signing key on the builder? | 08:15:32
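
Illustrative note on the signing point above (vcunat, 08:07): a minimal Python sketch of why a signer only needs a short string and not the NAR itself. It assumes the fingerprint layout `1;<store path>;<NAR hash>;<NAR size>;<comma-separated references>`, which is how Nix is generally understood to build the string it signs; treat the exact layout as an assumption. The function names, store paths, and key are hypothetical. Real binary caches use Ed25519 signatures; HMAC-SHA256 from the standard library stands in here only so the sketch runs without third-party packages, and the hex hash encoding is a simplification of Nix's own base32 form.

```python
import hashlib
import hmac


def fingerprint(store_path, nar_hash, nar_size, references):
    """Build the short string that is signed for one store path.

    Assumed layout: 1;<store path>;<NAR hash>;<NAR size>;<sorted references>
    """
    refs = ",".join(sorted(references))
    return f"1;{store_path};{nar_hash};{nar_size};{refs}"


def sign_fingerprint(fp, key):
    """Stand-in signer: HMAC-SHA256 instead of the Ed25519 that Nix uses."""
    return hmac.new(key, fp.encode("utf-8"), hashlib.sha256).hexdigest()


# Builder side: it holds the NAR, so it can hash it locally.
nar = b"\0" * (8 * 1024 * 1024)  # hypothetical 8 MiB NAR payload
nar_hash = "sha256:" + hashlib.sha256(nar).hexdigest()  # hex is a simplification
fp = fingerprint(
    "/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-example-1.0",  # hypothetical path
    nar_hash,
    len(nar),
    ["/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-dep-2.0"],  # hypothetical reference
)

# Signer side: only the fingerprint crosses the wire, never the NAR.
signature = sign_fingerprint(fp, key=b"hypothetical-secret-key")
print(len(nar), len(fp), signature[:16])
```

The fingerprint stays at a few hundred bytes at most however large the NAR is, which is why the signing step itself is cheap. The sketch also makes the concern raised at 08:15 concrete: a signer that never sees the NAR cannot check that the hash it is asked to sign matches what was actually built.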
