| 29 Mar 2022 |
Linux Hackerman | aah ok | 12:41:51 |
toonn | atemu12: IIRC Nix uses libarchive now. That covers pretty much every reasonable compression except Brotli. | 12:42:37 |
@rnhmjoj:maxwell.ydns.eu | this is an article by the author of lzip that goes through the problems with xz: http://web.archive.org/web/20220128214314/https://www.nongnu.org/lzip/xz_inadequate.html | 12:43:01 |
atemu12 | toonn: Good to know, thanks! | 12:43:57 |
Linux Hackerman | I've learnt something new today! Thanks all :) | 12:44:29 |
atemu12 | toonn: In that case, an LZMA compressor like lzip or XZ would be best if you want best compression with slow but not unreasonably slow speeds. Otherwise, zstd. | 12:45:07 |
atemu12 | toonn: Though if it's only needed by people who re-build the stdenv, they'd appreciate zstd's orders-of-magnitude speed advantage more than the few MiB saved by LZMA. | 12:47:17 |
toonn | atemu12: The idea is that network bandwidth makes a bigger difference than the decompression speed. | 12:53:19 |
toonn | I think I'll just try both and do some minimal benchmarking. | 12:54:13 |
atemu12 | toonn: At higher LZMA levels, that's actually not necessarily true IIRC. Also, a user hacking on the stdenv likely unpacks the tarball more often than they download it, and likely needs to download so many source files that the difference between LZMA and zstd is a drop in the ocean. | 12:56:22 |
toonn | It's also the bandwidth from the project's perspective, of course. | 12:57:36 |
toonn | And the tarball only needs to be unpacked once. Unless the result is garbage collected. But the tarball itself would probably also be GCed, I suppose. | 12:58:18 |
toonn | I don't think a crazy level of either compressor would be appropriate. | 13:00:22 |
toonn | According to the Lzip author Zstd isn't very suitable for archival either, https://lists.gnu.org/archive/html/lzip-bug/2016-10/msg00005.html (The interest in archival is my own, of course, I like the idea of the Nixpkgs cache being append-only forever : ) I don't know whether the corruption-resistance is an advantage at the scale of Darwin users, but maybe?) | 13:05:06 |
Foxboron | In reply to @toonn:matrix.org: Arch does long-term archiving of all packages, xz previously and zstd currently. Not aware of any corruption issues, but I don't think anyone has checked either. | 13:41:10 |
atemu12 | Also, those integrity concerns don't apply here, since the binary seed will be declared by hash anyway. | 13:43:07 |
tpw_rules | toonn: i thought the bzip choice was motivated by what was available on darwin by default | 13:44:08 |
tpw_rules | i also had always wondered about the lzip author's motivation, the claims seem slightly overblown but eh. i'd probably pick xz because i'm pretty sure it's on darwin too | 13:44:39 |
tpw_rules | but otoh i think xz is non-reproducible in multithreaded mode | 13:48:34 |
toonn | Does Darwin come with xz? It didn't use to. | 14:01:31 |
toonn | Even then I don't think it's a necessary prerequisite. | 14:01:51 |
tpw_rules | how is it not? how does that flow work? | 14:09:14 |
tpw_rules | also maybe it doesn't, it looks like i have it installed via homebrew | 14:09:36 |
atemu12 | xz is not on Darwin system either. | 14:19:01 |
atemu12 | In reply to @tpw_rules:matrix.org ("but otoh i think xz is non-reproducible in multithreaded mode"): I think this is the most important of the compressor upsides and downsides mentioned so far. Nondeterminism due to compression is not something we want, I think. | 14:20:03 |
atemu12 | * xz is not on my Darwin system either. | 14:20:59 |
toonn | tpw_rules: It doesn't really matter because cpio and bzip2 are currently kinda also part of the bootstrap tarball, separately but still. | 14:24:12 |
toonn | tpw_rules: https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/darwin/default.nix#L18-L20= | 14:25:10 |
tpw_rules | ok. i don't know exactly how the bootstrap extraction works. | 16:26:38 |
tpw_rules | zstd and i think lzip are both reproducible. but i think neither guarantees it if the compressor version changes | 16:27:16 |
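
The "minimal benchmarking" and reproducibility questions raised in the thread could be explored with something like the sketch below. It is only an illustration under stated assumptions: it uses Python's standard-library `bz2` and `lzma` modules, so zstd and lzip are not covered; the names `make_sample`, `CODECS`, and `benchmark` are hypothetical; the generated sample data is a stand-in for a real bootstrap tarball; and the compression levels are arbitrary picks, not anything Nixpkgs uses. The determinism check only compares two runs in the same process with the same library version, so it says nothing about multithreaded xz or about output stability across compressor versions.

```python
import bz2
import hashlib
import lzma
import time


def make_sample(size: int = 1 << 20) -> bytes:
    """Build deterministic, compressible sample data (a stand-in for a real tarball)."""
    chunks = []
    total = 0
    counter = 0
    while total < size:
        # Hex digests repeated a few times: redundant enough to compress well.
        block = hashlib.sha256(str(counter).encode()).hexdigest().encode() * 4
        chunks.append(block)
        total += len(block)
        counter += 1
    return b"".join(chunks)[:size]


# Hypothetical codec table: name -> (compress, decompress). Levels are arbitrary.
CODECS = {
    "bzip2": (lambda d: bz2.compress(d, 9), bz2.decompress),
    "xz": (
        lambda d: lzma.compress(d, format=lzma.FORMAT_XZ, preset=6),
        lzma.decompress,
    ),
}


def benchmark(data: bytes) -> dict:
    """Report compressed size, decompression time, and same-process determinism."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        start = time.perf_counter()
        unpacked = decompress(packed)
        elapsed = time.perf_counter() - start
        results[name] = {
            "size": len(packed),
            "decompress_seconds": elapsed,
            "roundtrip_ok": unpacked == data,
            # Compress a second time and compare the bytes exactly.
            "deterministic": compress(data) == packed,
        }
    return results


if __name__ == "__main__":
    for name, result in benchmark(make_sample()).items():
        print(name, result)
```

Since the seed would be pinned by hash, the byte-for-byte comparison is the property that matters for anyone regenerating the tarball; the timing numbers are only meaningful relative to each other on the same machine and with real input data.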