NixOS Binary Cache Self-Hosting | 159 Members | |
| About how to host a very large-scale binary cache and more | 55 Servers |
| Sender | Message | Time |
|---|---|---|
| 24 Aug 2023 | ||
| well currently attic just stores a single signature for each narinfo (which is a database table entry, not an actual file), and that signature is generated by the server at upload time | 15:36:02 | |
| but it means that you have to trust the server much more than if the client could provide its own signatures | 15:36:24 | |
| makes sense, also what happens with key rotation on the server | 15:37:06 | |
| key rotation is generally a tricky topic with nix https://github.com/NixOS/rfcs/pull/149 | 15:37:58 | |
In reply to @linus:schreibt.jetzt The signature is generated at download time using the per-cache private key managed by the server. It does store client-supplied signatures in the database but they aren't exposed at the moment. I wrote a bit more here: https://github.com/zhaofengli/attic/issues/80#issuecomment-1684347741 (oh right, forgot to respond) | 15:40:33 | |
| So we can easily have client-managed signatures now by changing how narinfo is generated (need to fix serialization for multiple signatures), but I'd prefer to have a complete story UX-wise (the client should be able to automatically sign on upload) | 15:42:27 | |
In reply to @brian:bmcgee.ie You can currently rotate the server-managed key with attic cache configure --regenerate-keypair, but all clients who download need to reconfigure their trusted public keys | 15:44:35 | |
| So the resulting experience can be frustrating | 15:44:56 | |
In reply to @zhaofeng:zhaofeng.li oops, sorry for talking nonsense 🙃 | 15:54:44 | |
| Zhaofeng Li: Am I right in thinking that the custom client for pushing to the cache is where you're calculating the chunks / doing the heavy lifting for the de-duplication? | 17:57:05 | |
| Operating on the uncompressed store paths instead of having to decompress them on the server side | 17:57:43 | |
In reply to @brian:bmcgee.ie Actually deduplication is handled server-side and is transparent to the client. As a result, the upload API is very simple and the client just streams the entire uncompressed NAR to the server | 19:09:49 | |
In reply to @zhaofeng:zhaofeng.li 🤔 | 19:15:05 | |
| So you dedupe the nar archive not the uncompressed form? | 19:15:43 | |
| Naively I would have thought it better to operate on the uncompressed store path | 19:16:25 | |
| Ah wait, does nar offer any compression? Is that why you specify zstd or xz when working with the cache? | 19:18:24 | |
| * Ah wait, does nar offer any compression? Is that why you specify zstd or xz when working with a cache? | 19:18:52 | |
| Don't mind me, I'll go RTFM for a bit | 19:19:35 | |
In reply to @brian:bmcgee.ie Yes, Attic chunks the entire NAR and not the individual files inside it. However, Tvix is doing fine-grained dedup IIRC. The current problem with this approach is that non-free packages like VSCode often have thousands of data files, which can make NAR reconstruction costly | 19:30:48 | |
In reply to @zhaofeng:zhaofeng.li Makes sense | 19:31:28 | |
| By chunking the entire NAR, those small files can be lumped together in a single chunk | 19:31:50 | |
| * By chunking the entire NAR, those small files can be lumped together in just a few chunks | 19:32:14 | |
| flokli: do you have an alert set on tvix as a keyword? 😉 | 19:33:03 | |
| It's a conscious decision to have it that granular in Tvix - the idea is that we're able to heavily cache both individual chunks as well as rendered NARs. And long term have the clients be a bit smarter on that front too. | 19:33:50 | |
| You must have some stats now on the benefits/tradeoffs? | 19:34:53 | |
| If the clients themselves use that protocol as their underlying data model, they can cache these files on their own and take care of nar assembly locally, if needed at all. | 19:34:56 | |
| I had some people run benchmarks on nix-casync, which also chunked whole NAR files in its native format, and the chunks did not line up nicely with file boundaries, often resulting in overly large chunks and shifted / missed dedup possibilities | 19:35:57 | |
| In the long run, I think this is the way to go if more flexibility is provided. The current Nix isn't very "smart" with regards to binary substitution | 19:36:39 | |
| Again, as written earlier today, I'm aiming for efficient storage and transport, and assume you can cache both individual chunks retrieval and assembled nars if needed. | 19:36:41 | |
| * In the long run, I think this is the way to go if more flexibility is provided. The current Nix client isn't very "smart" with regards to binary substitution | 19:36:46 | |
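
To make the whole-NAR chunking trade-off discussed above concrete, here is a toy sketch of content-defined chunking over a stream of many small files. It is not Attic's, Tvix's, or nix-casync's actual chunker: the gear-hash scheme, the size parameters, and the `fake_nar` helper (a simple length-prefixed concatenation, not the real NAR format) are all assumptions made for illustration.

```python
import hashlib

# Gear table: 256 pseudo-random 64-bit values, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data: bytes, min_size: int = 2048, avg_bits: int = 12,
          max_size: int = 16384) -> list:
    """Split data at content-defined boundaries (simplified gear-hash CDC)."""
    cut_mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Rolling hash over the most recent bytes of the current chunk.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        # Cut when the hash hits the pattern (past min_size) or at max_size.
        if size >= max_size or (size >= min_size and (h & cut_mask) == 0):
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder, may be short
    return chunks


def fake_nar(files: list) -> bytes:
    """Hypothetical stand-in for a NAR: length-prefixed file contents."""
    return b"".join(len(f).to_bytes(8, "little") + f for f in files)


# 200 small "files" of 96 bytes each, like a package with many data files.
small_files = [hashlib.sha256(str(n).encode()).digest() * 3 for n in range(200)]
nar = fake_nar(small_files)
chunks = chunk(nar)
print(len(small_files), "files ->", len(chunks), "chunks")
```

Because every chunk except the last is at least `min_size` bytes, the 200 small files end up in a handful of chunks, which is the "lumped together" effect mentioned above. The flip side, also raised in the conversation, is that the cut points depend on byte content rather than file boundaries, so a chunk can straddle several files.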