| 12 Dec 2024 |
rhelmot | I have the setting at 3 rn | 02:39:14 |
rhelmot | the question I originally stated maybe should have been "is it valid to have a gcroot which is a normal file and not a symlink" | 02:40:34 |
| 13 Dec 2024 |
Christian Theune | hexa: quick update from our end regarding the hydra improvements we've planned/queued up (and which mostly ma27 will implement). there's a PR currently waiting to be merged that supports globbing in named constituents and also adds logging of memory usage per job. we're currently switching our hydra to zstd compression and will then introduce a hard memory limit. after that i think we'll take a look at moving compression to the workers. I've had one idea to look into: making the compression an option on the jobset, to allow more gradual changes if this is touched again in the future, but it seems like that might be complicated, bordering on fragile. and then we'll also take a look at some corner cases where stuck jobs need to be manually cancelled or even killed on the builder itself. as everyone is currently preparing for the holidays, most of that will likely happen in the new year. | 06:20:00 |
vcunat | Sounds nice. At hydra.nixos.org we seem to have now avoided the compression bottleneck by brute force (48-core EPYC + hyperthreading). | 07:28:47 |
Christian Theune | Yeah, our immediate measurement was (disk / channel) image compression time going down from 5 minutes to 7s, so that seems a big win. | 08:03:44 |
Christian Theune | Nevertheless I'm trying to keep an eye on things that happen in a blocking fashion in the queue runner. | 08:04:28 |
Christian Theune | because those will all be bottlenecks for scaling | 08:04:49 |
Christian Theune | i'm guessing s3 uploads are in a similar spot | 08:05:05 |
Christian Theune | if s3 uploads are also in the queue runner blocking things, then I'm wondering whether the uploads could also happen from the workers as long as hydra provides the signature. | 08:06:08 |
Christian Theune | from a security perspective i understand that we want to keep the signing key on the master | 08:06:29 |
Christian Theune | s3 upload credentials then aren't really /that/ sensitive compared to that, since we have to trust the content that the builders generate anyway. | 08:06:49 |
vcunat | Signing itself is cheap, if you provide the hash to sign. The signer doesn't even need the whole NAR. (in principle) | 08:07:09 |
Christian Theune | ah, interesting. | 08:07:28 |
Christian Theune | that would mean we wouldn't even have to transfer the files to the master for that reason. | 08:07:43 |
Christian Theune | and the builder already has the closure and could upload | 08:07:53 |
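The flow discussed above (builder hashes and uploads, master only signs) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how hydra or Nix implement it: the function names, the key, the store path and the layout of the signed string are all hypothetical, and HMAC-SHA256 from the Python standard library stands in for the Ed25519 signatures that Nix binary caches actually use.

```python
import hashlib
import hmac

# Hypothetical stand-in key that lives only on the master.
# Real Nix binary caches sign with Ed25519; HMAC is used here only
# so that the sketch runs with the standard library alone.
SIGNING_KEY = b"demo-key-held-only-by-the-master"


def builder_describe(store_path: str, nar_bytes: bytes, references: list) -> dict:
    """Runs on the builder: hash the NAR locally, return only small metadata."""
    return {
        "store_path": store_path,
        "nar_hash": "sha256:" + hashlib.sha256(nar_bytes).hexdigest(),
        "nar_size": len(nar_bytes),
        "references": sorted(references),
    }


def fingerprint(meta: dict) -> bytes:
    """String that gets signed. The field layout is an assumption of this sketch."""
    fields = [
        "1",
        meta["store_path"],
        meta["nar_hash"],
        str(meta["nar_size"]),
        ",".join(meta["references"]),
    ]
    return ";".join(fields).encode()


def master_sign(meta: dict) -> str:
    """Runs on the master: signs the metadata without ever receiving the NAR."""
    return hmac.new(SIGNING_KEY, fingerprint(meta), hashlib.sha256).hexdigest()


def verify(meta: dict, signature: str) -> bool:
    """Check a signature against the metadata (symmetric here, public-key in reality)."""
    return hmac.compare_digest(master_sign(meta), signature)


if __name__ == "__main__":
    nar = b"x" * 10_000_000  # the large artifact stays on the builder
    meta = builder_describe("/nix/store/example-hello", nar, [])
    sig = master_sign(meta)  # only a few hundred bytes cross the wire
    print(len(fingerprint(meta)), verify(meta, sig))
```

The point of the sketch is the data flow: the 10 MB payload is hashed where it was built, and the master's work is independent of the artifact size.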
Christian Theune | i'll keep that in mind when we take a look at moving the compression around | 08:08:04 |
vcunat | Yes, that does sound like good architecture. | 08:08:17 |
Christian Theune | not sure whether it's good. it seems better than what it is now. 😉 | 08:08:37 |
Christian Theune | but yeah | 08:08:39 |
vcunat | Though hydra.nixos.org is now blocked by loading jobs from DB. Probably the steps that check what's in S3 already. (it's overseas unfortunately so higher latency) | 08:08:54 |
Christian Theune | yeah i've read that. that part of the code/architecture i haven't looked at before and it's two steps further down the road on our map. | 08:10:14 |
Christian Theune | (our s3 is local and we have a much lower number of jobs anyway) | 08:10:40 |
Christian Theune | but yeah, happy to help in general, but need to be careful with my commitments ... | 08:11:01 |
vcunat | Sure. I appreciate any kind of progress 🙂 | 08:11:52 |
Christian Theune | the martian is always right. one problem at a time. | 08:13:57 |
7c6f434c | If the builder sends just the hash to sign, this is not that far from having the signing key on the builder? | 08:15:32 |
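The concern raised here can be made concrete with a small sketch. It assumes a hypothetical master that signs whatever hash a builder reports; the key, store path and function names are made up for illustration, and HMAC-SHA256 again stands in for a real public-key signature, so the "client" below verifies with the same key only to keep the example self-contained.

```python
import hashlib
import hmac

MASTER_KEY = b"hypothetical-master-key"  # stand-in; never leaves the master


def sign_claimed_hash(store_path: str, claimed_hash: str) -> str:
    """Master signs whatever hash the builder reports; it cannot check the content."""
    message = f"{store_path};{claimed_hash}".encode()
    return hmac.new(MASTER_KEY, message, hashlib.sha256).hexdigest()


def client_accepts(store_path: str, content: bytes, signature: str) -> bool:
    """Client hashes what it downloaded and checks the signature over that hash."""
    claimed = hashlib.sha256(content).hexdigest()
    expected = sign_claimed_hash(store_path, claimed)
    return hmac.compare_digest(expected, signature)


honest = b"expected build output"
tampered = b"output modified on a compromised builder"
path = "/nix/store/example-pkg"

# A compromised builder reports the hash of the tampered content ...
sig = sign_claimed_hash(path, hashlib.sha256(tampered).hexdigest())

# ... and the signature then vouches for exactly that content.
print(client_accepts(path, tampered, sig))  # True
print(client_accepts(path, honest, sig))    # False
```

In this sketch the signature only attests to what the builder claimed, which is the sense in which hash-only signing resembles a key on the builder. What still differs is that the key itself cannot be copied off the builder, so signing stops once the master stops answering that builder.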