13 Dec 2024 |
vcunat | Sounds nice. At hydra.nixos.org we seem to have now avoided the compression bottleneck by brute force (48-core EPYC + hyperthreading). | 07:28:47 |
Christian Theune | Yeah, our immediate measurement was (disk / channel) image compression time going down from 5 minutes to 7s, so that seems a big win. | 08:03:44 |
Christian Theune | Nevertheless I'm trying to keep an eye on things that happen in a blocking fashion in the queue runner. | 08:04:28 |
Christian Theune | because those will all be bottlenecks for scaling | 08:04:49 |
Christian Theune | i'm guessing s3 uploads being in a similar spot | 08:05:05 |
Christian Theune | if s3 uploads are also in the queue runner blocking things, then I'm wondering whether the uploads could also happen from the workers as long as hydra provides the signature. | 08:06:08 |
Christian Theune | from a security perspective i understand that we want to keep the signing key on the master | 08:06:29 |
Christian Theune | s3 upload credentials then aren't really /that/ sensitive, given that we have to trust the content that the builders generate anyway. | 08:06:49 |
vcunat | Signing itself is cheap, if you provide the hash to sign. The signer doesn't even need the whole NAR. (in principle) | 08:07:09 |
Christian Theune | ah, interesting. | 08:07:28 |
Christian Theune | that would mean we wouldn't even have to transfer the files to the master for that reason. | 08:07:43 |
Christian Theune | and the builder already has the closure and could upload | 08:07:53 |
Christian Theune | i'll keep that in mind when we take a look at moving the compression around | 08:08:04 |
vcunat | Yes, that does sound like good architecture. | 08:08:17 |
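[Editor's sketch] The split discussed above (builder hashes and uploads, master only signs) could look roughly like the following. This is a minimal illustration under stated assumptions, not how Hydra or Nix implement it: all function names are hypothetical, the store paths are made up, the fingerprint layout is only loosely modelled on the path/hash/size/references string that Nix signs, and HMAC-SHA256 from the Python standard library stands in for the Ed25519 signatures Nix actually uses.

```python
import hashlib
import hmac

# Hypothetical stand-in for the signing key; in this sketch it exists
# only on the master and is never sent to a builder.
MASTER_KEY = b"demo-key-held-only-on-master"


def builder_describe_nar(store_path, nar_bytes, references):
    """Runs on the builder: hash the NAR locally and return metadata only."""
    return {
        "store_path": store_path,
        "nar_hash": "sha256:" + hashlib.sha256(nar_bytes).hexdigest(),
        "nar_size": len(nar_bytes),
        "references": sorted(references),
    }


def fingerprint(meta):
    """Build the short string that gets signed (layout is an assumption)."""
    return "1;{};{};{};{}".format(
        meta["store_path"],
        meta["nar_hash"],
        meta["nar_size"],
        ",".join(meta["references"]),
    ).encode()


def master_sign(meta):
    """Runs on the master: sign the fingerprint, never seeing the NAR itself.

    HMAC is a placeholder here; a real binary cache uses a public-key scheme.
    """
    return hmac.new(MASTER_KEY, fingerprint(meta), hashlib.sha256).hexdigest()


def verify(meta, signature):
    """Check a signature against the metadata (symmetric in this sketch)."""
    return hmac.compare_digest(master_sign(meta), signature)


if __name__ == "__main__":
    nar = b"x" * (1024 * 1024)  # pretend this is a 1 MiB NAR on the builder
    meta = builder_describe_nar(
        "/nix/store/example-hello", nar, ["/nix/store/example-glibc"]
    )
    sig = master_sign(meta)
    print(len(nar), "bytes stay on the builder;",
          len(fingerprint(meta)), "bytes go to the master")
    print("signature valid:", verify(meta, sig))
```

The point of the sketch is only the data flow: the master receives a fingerprint of a few hundred bytes at most, so the NAR never has to travel to it for signing, and the builder can upload the NAR together with the returned signature.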
Christian Theune | not sure whether it's good. it seems better than what it is now. 😉 | 08:08:37 |
Christian Theune | but yeah | 08:08:39 |
vcunat | Though hydra.nixos.org is now blocked by loading jobs from DB. Probably the steps that check what's in S3 already. (it's overseas unfortunately so higher latency) | 08:08:54 |
Christian Theune | yeah i've read that. that part of the code/architecture i haven't looked at before and it's two steps further down the road on our map. | 08:10:14 |
Christian Theune | (our s3 is local and we have a much lower number of jobs anyway) | 08:10:40 |
Christian Theune | but yeah, happy to help in general, but need to be careful with my commitments ... | 08:11:01 |
vcunat | Sure. I appreciate any kind of progress 🙂 | 08:11:52 |
Christian Theune | the martian is always right. one problem at a time. | 08:13:57 |
7c6f434c | If the builder sends just the hash to sign, this is not that far from having the signing key on the builder? | 08:15:32 |
7c6f434c | (A key that has signed something weird will probably be rotated even if it was not disclosed) | 08:16:09 |
vcunat | Corrupting builds chosen by someone else feels somewhat safer than ability to steal the key. | 08:18:49 |
7c6f434c | Ah right the store path still comes from evaluation on master | 08:22:10 |
vcunat | Though I'm not sure if the builder could inject arbitrary runtime dependencies. | 08:22:48 |
7c6f434c | Well, just forcing the deps to be in the store doesn't sound that much more than just including the payload in all the binaries | 08:23:53 |