| 28 Jan 2025 |
K900 | I don't think this is a good idea | 08:47:50 |
K900 | nix-community is very much outside the security boundary for nixpkgs | 08:48:01 |
K900 | So building anything on that infra opens us up to all kinds of trusting-trust fun | 08:48:33 |
hexa | I can see x86_64-linux:big-parallel jobs running, and quite a lot of others, so I think it is indeed working | 12:18:44 |
Vladimír Čunát | Sadly you can't configure a different --cores per machine, but at least it's something. | 12:19:59 |
hexa | Yeah, that's not possible for remote builds at all sadly | 12:21:03 |
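For context on what remote builds do let you tune per machine, here is a hedged sketch of a Nix machines file (the column layout follows the format Nix documents for `/etc/nix/machines`; the hostname and key path are placeholders). Note there is a per-machine max-jobs and speed factor, but no `--cores` column:

```
# /etc/nix/machines — one builder per line:
# store-URI  platforms  SSH-key  max-jobs  speed-factor  supported-features  mandatory-features  host-key
ssh-ng://builder@example.org x86_64-linux /root/.ssh/id_builder 8 1 big-parallel,kvm - -
```

A `-` means "use the default"; the last column would normally carry the builder's base64-encoded host public key.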
Mic92 | I don't see how this is security relevant. We don't use any build results of GitHub Actions for NixOS infra right now, and this wouldn't need to change. | 14:05:24 |
Mic92 | It's just testing. | 14:05:30 |
Mic92 | I think it's a bit expensive having to maintain separate {x86_64,arm64}-{darwin,linux} builders for the nix and NixOS infra repositories just for a few builds every now and then. Not just in terms of server cost but also people's time. | 14:08:07 |
| Sami Liedes joined the room. | 15:21:30 |
Sami Liedes | Hey. I've been toying with an idea. It started from a foray into all kinds of modern processor trusted-enclave features, but ended up at something which I think is much simpler.
Currently I think all the builds happen on centrally managed infrastructure, right? (Does Nix actually have physical servers, or is it some cloud-hosted thing?)
I realized that with only a little cleverness it should be feasible to have people securely donate compute time to builds, although in practice I'd limit it to well-known cloud instances. (Then there is a policy question of what to trust for official builds.) I don't know if this would be an interesting feature, but I can imagine there have been instances where someone has got a PR merged and would like to pay $1/€1 to get it built sooner, with no downsides that I can see if that comes from extra capacity.
This, I believe, could be done by having a blessed VM image for building and using Google's/Azure's/whatever vTPM features to attest that it's a VM with a specific hash, given input with a specific hash (what to build), and that it produced an output with a specific hash. Essentially this would get signed by the cloud vendor. This model trusts the cloud vendor and its security.
Something similar-ish could be done on untrusted hardware using AMD SEV-SNP (modern EPYCs) or Intel TDX (some Xeons), but IIUC serious hardware attacks are still largely outside their threat model. They could be used as an add-on to the cloud attestation if that is desired (though running on such hardware probably has an extra cost, and I don't know how much better in practice it is to trust Intel or AMD, especially with a cloud provider that would certainly have the expertise for serious attacks :-). | 15:35:08 |
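The attestation flow sketched above could be modeled roughly as follows. This is a hypothetical toy, not any real vTPM, SEV-SNP, or TDX API: the cloud vendor's signature is stood in for by an HMAC over a shared key, and every name here (`AttestationStatement`, `verify_attestation`, `VENDOR_KEY`) is invented for illustration.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical stand-in for the cloud vendor's signing key; a real vTPM
# attestation would carry an asymmetric signature chained to vendor roots.
VENDOR_KEY = b"cloud-vendor-secret"

@dataclass(frozen=True)
class AttestationStatement:
    vm_image_hash: str   # hash of the blessed builder VM image
    input_hash: str      # hash of the derivation being built
    output_hash: str     # hash of the produced build output
    signature: str       # vendor "signature" over the three hashes

def _digest(vm_image_hash: str, input_hash: str, output_hash: str) -> str:
    payload = json.dumps([vm_image_hash, input_hash, output_hash]).encode()
    return hmac.new(VENDOR_KEY, payload, hashlib.sha256).hexdigest()

def sign(vm_image_hash: str, input_hash: str, output_hash: str) -> AttestationStatement:
    """What the (trusted) cloud side would emit after a build."""
    return AttestationStatement(
        vm_image_hash, input_hash, output_hash,
        _digest(vm_image_hash, input_hash, output_hash))

def verify_attestation(stmt: AttestationStatement,
                       trusted_image_hashes: set[str]) -> bool:
    """Accept an output only if it was produced by a blessed VM image
    and the signature over (image, input, output) checks out."""
    if stmt.vm_image_hash not in trusted_image_hashes:
        return False
    expected = _digest(stmt.vm_image_hash, stmt.input_hash, stmt.output_hash)
    return hmac.compare_digest(stmt.signature, expected)
```

The point of the model: the verifier never trusts the donated machine itself, only the vendor-signed triple of (blessed image, input, output), so a tampered output or an unblessed image fails verification.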
K900 | CPU time is not the bottleneck | 15:37:37 |
K900 | At least most of the time | 15:37:42 |
Sami Liedes | Ah. What is? | 15:37:46 |
K900 | Hydra | 15:37:56 |
Sami Liedes | And Hydra is not CPU-bound? | 15:38:11 |
K900 | Evaluation is very intensive on RAM, especially for NixOS tests | 15:38:16 |
K900 | And the Hydra coordinator machine is often CPU bound | 15:38:31 |
K900 | But throwing more builders at it won't change that | 15:38:39 |
K900 | In fact it'll make it worse | 15:38:43 |
Sami Liedes | Ah, so tests. | 15:38:44 |
Sami Liedes | Well, I guess there's no reason the same approach couldn't be used for test runners? Unless I still misunderstand gravely where the bottleneck is. | 15:39:04 |
K900 | No, tests are just particularly expensive to evaluate because they contain nested nixpkgs instantiations | 15:39:15 |
K900 | Evaluation needs to happen on a single machine | 15:39:24 |
K900 | Which is the coordinator | 15:39:27 |
K900 | And it has to be the coordinator unless Hydra is majorly rewritten | 15:39:40 |
Sami Liedes | Ah, I see. And it runs those in a single thread? Or, one evaluation at a time? | 15:40:22 |
Sami Liedes | Because it wants to guarantee that whatever gets out there passes. | 15:41:00 |