| 24 Aug 2023 |
Zhaofeng Li | So the resulting experience can be frustrating | 15:44:56 |
@linus:schreibt.jetzt | In reply to @zhaofeng:zhaofeng.li
The signature is generated at download time using the per-cache private key managed by the server. It does store client-supplied signatures in the database but they aren't exposed at the moment. I wrote a bit more here: https://github.com/zhaofengli/attic/issues/80#issuecomment-1684347741
(oh right, forgot to respond)
oops, sorry for talking nonsense 🙃 | 15:54:44 |
BMG | Am I right in thinking, that with a custom client for pushing to the cache, that's where you're calculating the chunks / doing the heavy lifting for the de-duplication? Zhaofeng Li: | 17:57:05 |
BMG | Operating on the uncompressed store paths instead of having to decompress them on the server side | 17:57:43 |
Zhaofeng Li | In reply to @brian:bmcgee.ie
Am I right in thinking, that with a custom client for pushing to the cache, that's where you're calculating the chunks / doing the heavy lifting for the de-duplication? Zhaofeng Li:
Actually deduplication is handled server-side and is transparent to the client. As a result, the upload API is very simple and the client just streams the entire uncompressed NAR to the server | 19:09:49 |
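The server-side flow described above can be sketched roughly as follows: the client hands over one uncompressed NAR byte stream, and the server cuts it into content-defined chunks and stores each chunk once under its hash. This is a minimal sketch under stated assumptions, not Attic's actual implementation: `chunk`, `ChunkStore`, `put_nar`, and `get_nar` are hypothetical names, the gear-style rolling hash is a stand-in for whatever chunker the server really uses, and the tiny chunk sizes are chosen only to make the demo readable.

```python
import hashlib
import random

# Deterministic table of 64-bit values for a gear-style rolling hash
# (illustrative stand-in, not taken from any real cache server).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = 0xFFFFFFFFFFFFFFFF


def chunk(data: bytes, min_size: int = 64, avg_bits: int = 8, max_size: int = 1024):
    """Split data at content-defined boundaries.

    A boundary is cut when the low `avg_bits` bits of the rolling hash are
    zero (once the chunk has reached min_size), or when max_size is hit.
    """
    mask = (1 << avg_bits) - 1
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & MASK64
        length = i - start + 1
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class ChunkStore:
    """Hypothetical server-side store: chunks are keyed by their SHA-256."""

    def __init__(self):
        self.chunks = {}

    def put_nar(self, nar: bytes):
        """Store a NAR; return (ordered chunk digests, number of new chunks)."""
        refs = []
        new = 0
        for c in chunk(nar):
            digest = hashlib.sha256(c).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = c
                new += 1
            refs.append(digest)
        return refs, new

    def get_nar(self, refs):
        """Reassemble a NAR from its ordered chunk digests."""
        return b"".join(self.chunks[d] for d in refs)


if __name__ == "__main__":
    data_rng = random.Random(1)
    nar = bytes(data_rng.getrandbits(8) for _ in range(20000))
    store = ChunkStore()
    refs, new = store.put_nar(nar)
    print("first upload:", len(refs), "chunks,", new, "new")
    # A slightly different NAR only adds the chunks around the edit.
    edited = nar[:10000] + b"patched!" + nar[10000:]
    refs2, new2 = store.put_nar(edited)
    print("edited upload:", len(refs2), "chunks,", new2, "new")
```

Because boundaries depend only on the bytes near them, an insertion in the middle of the stream changes the chunks around the edit while the rest are found in the store again, and the client never needs to know any of this.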
BMG | In reply to @zhaofeng:zhaofeng.li
Actually deduplication is handled server-side and is transparent to the client. As a result, the upload API is very simple and the client just streams the entire uncompressed NAR to the server
🤔 | 19:15:05 |
BMG | So you dedupe the nar archive not the uncompressed form? | 19:15:43 |
BMG | Naively I would have thought it better to operate on the uncompressed store path | 19:16:25 |
BMG | Ah wait, does nar offer any compression? Is that why you specify zstd or xz when working with the cache. | 19:18:24 |
BMG | * Ah wait, does nar offer any compression? Is that why you specify zstd or xz when working with a cache. | 19:18:52 |
BMG | Don't mind me, I'll go RTFM for a bit | 19:19:35 |
Zhaofeng Li | In reply to @brian:bmcgee.ie
So you dedupe the nar archive not the uncompressed form?
Yes, Attic chunks the entire NAR and not the individual files inside it. However, Tvix is doing fine-grained dedup IIRC. The current problem with this approach is non-free packages like VSCode often have thousands of data files that can make NAR reconstruction costly | 19:30:48 |
BMG | In reply to @zhaofeng:zhaofeng.li
Yes, Attic chunks the entire NAR and not the individual files inside it. However, Tvix is doing fine-grained dedup IIRC. The current problem with this approach is non-free packages like VSCode often have thousands of data files that can make NAR reconstruction costly
Makes sense | 19:31:28 |
Zhaofeng Li | By chunking the entire NAR, those small files can be lumped together in a single chunk | 19:31:50 |
Zhaofeng Li | * By chunking the entire NAR, those small files can be lumped together in just a few chunks | 19:32:14 |
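A back-of-the-envelope estimate makes the trade-off concrete: with per-file chunking every file costs at least one stored object, while chunking the whole NAR only depends on the total size. The sketch below is an assumption-laden illustration, not a measurement: the function names, the 2 KiB file size, and the 64 KiB average chunk size are all hypothetical, and NAR framing overhead is ignored.

```python
import math


def estimated_chunks_per_file(file_sizes, avg_chunk):
    """Each file is chunked on its own, so it yields at least one chunk."""
    return sum(max(1, math.ceil(size / avg_chunk)) for size in file_sizes)


def estimated_chunks_whole_nar(file_sizes, avg_chunk):
    """Files are concatenated first, so small files share chunks."""
    return max(1, math.ceil(sum(file_sizes) / avg_chunk))


if __name__ == "__main__":
    # Hypothetical package: 5000 data files of 2 KiB each.
    sizes = [2048] * 5000
    avg = 64 * 1024
    print("per-file chunks: ", estimated_chunks_per_file(sizes, avg))
    print("whole-NAR chunks:", estimated_chunks_whole_nar(sizes, avg))
```

Under these assumed numbers, reconstruction touches thousands of objects in the per-file case and fewer than two hundred in the whole-NAR case, which is the cost difference being described.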
BMG | flokli: do you have an alert set on tvix as a keyword? 😉 | 19:33:03 |
flokli | It's a conscious decision to have it that granular in Tvix - the idea is that we're able to heavily cache both individual chunks as well as rendered NARs. And long term have the clients be a bit smarter on that front too. | 19:33:50 |
BMG | You must have some stats now on the benefits/tradeoffs? | 19:34:53 |
flokli | If the clients themselves use that protocol as their underlying data model, they can cache these files on their own and take care of nar assembly locally, if needed at all. | 19:34:56 |
flokli | I had some people do some benchmarks on nix-casync, which also chunked whole nar files in its native format, and the chunks were not chunking nicely at file boundaries, often resulting in too large chunks, and shifted / missed dedup possibilities | 19:35:57 |
Zhaofeng Li | In the long run, I think this is the way to go if more flexibility is provided. The current Nix isn't very "smart" with regards to binary substitution | 19:36:39 |
flokli | Again, as written earlier today, I'm aiming for efficient storage and transport, and assume you can cache both individual chunks retrieval and assembled nars if needed. | 19:36:41 |
Zhaofeng Li | * In the long run, I think this is the way to go if more flexibility is provided. The current Nix client isn't very "smart" with regards to binary substitution | 19:36:46 |
flokli | Zhaofeng Li: I wanted to start with a http binary cache proxy that speaks tvix-store protocol over the wire and keeps a local cache of chunks and renders nar files just for a local Nix. | 19:37:49 |
flokli | I'm not sure I want to even expose a NAR interface publicly to clients, they can just substitute through some local proxy. | 19:38:26 |
flokli | I don't need to provide cache.nixos.org, so I can be a bit more demanding on what clients need to have locally. And most of the time my crappy internet connection is the limiting factor, so having to download less over that is actually both good for my data usage as well as the server hosting the tvix-store, win win ;-) | 19:40:50 |
raitobezarius | I want fast fast fast | 19:41:25 |
raitobezarius | I have 5Gbps | 19:41:27 |
raitobezarius | please give me 5Gbps saturating cache | 19:41:33 |
flokli | raitobezarius: what about 100G | 19:42:01 |