| 29 Mar 2022 |
toonn | I think I'll just try both and do some minimal benchmarking. | 12:54:13 |
atemu12 | toonn: At higher LZMA levels, that's actually not necessarily true IIRC. Also, a user hacking on the stdenv likely unpacks the tarball more often than they download it, and likely needs to download so many source files that the difference between LZMA and zstd is a drop in the ocean. | 12:56:22 |
toonn | It's also the bandwidth from the project's perspective, of course. | 12:57:36 |
toonn | And the tarball only needs to be unpacked once. Unless the result is garbage collected. But the tarball itself would probably also be GCed, I suppose. | 12:58:18 |
toonn | I don't think a crazy level of either compressor would be appropriate. | 13:00:22 |
toonn | According to the Lzip author Zstd isn't very suitable for archival either, https://lists.gnu.org/archive/html/lzip-bug/2016-10/msg00005.html (The interest in archival is my own, of course, I like the idea of the Nixpkgs cache being append-only forever : ) I don't know whether the corruption-resistance is an advantage at the scale of Darwin users, but maybe?) | 13:05:06 |
Foxboron | In reply to @toonn:matrix.org: Arch does long-term archiving of all packages, with xz previously and zstd currently. Not aware of any corruption issues, but I don't think anyone has checked either. | 13:41:10 |
atemu12 | Also, those integrity concerns don't apply here since the binary seed will be declared by hash anyways. | 13:43:07 |
tpw_rules | toonn: i thought the bzip choice was motivated by what was available on darwin by default | 13:44:08 |
tpw_rules | i also had always wondered about the lzip author's motivation, the claims seem slightly overblown but eh. i'd probably pick xz because i'm pretty sure it's on darwin too | 13:44:39 |
tpw_rules | but otoh i think xz is non-reproducible in multithreaded mode | 13:48:34 |
toonn | Does Darwin come with xz? It didn't use to. | 14:01:31 |
toonn | Even then I don't think it's a necessary prerequisite. | 14:01:51 |
tpw_rules | how is it not? how does that flow work? | 14:09:14 |
tpw_rules | also maybe it doesn't, it looks like i have it installed via homebrew | 14:09:36 |
atemu12 | xz is not on Darwin system either. | 14:19:01 |
atemu12 | In reply to @tpw_rules:matrix.org ("but otoh i think xz is non-reproducible in multithreaded mode"): I think this is the most important of all the compressor upsides and downsides mentioned so far. Nondeterminism due to compression is not something we want, I think. | 14:20:03 |
atemu12 | * xz is not on my Darwin system either. | 14:20:59 |
toonn | tpw_rules: It doesn't really matter because cpio and bzip2 are currently kinda also part of the bootstrap tarball, separately but still. | 14:24:12 |
toonn | tpw_rules: https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/darwin/default.nix#L18-L20= | 14:25:10 |
tpw_rules | ok. i don't know exactly how the bootstrap extraction works. | 16:26:38 |
tpw_rules | zstd and i think lzip are both reproducible. but i think neither guarantees it if the compressor version changes | 16:27:16 |
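The size, unpack-time, and reproducibility questions raised above can be checked with a small script. The following is only a minimal sketch under stated assumptions: it uses Python's standard-library `bz2` and `lzma` (xz) bindings, the payload is a made-up stand-in for a bootstrap tarball, the levels are arbitrary, and zstd and lzip are left out because this sketch assumes no bindings for them are available. The `deterministic` field only says that compressing the same input twice with the same library version, single-threaded, gave identical bytes; it says nothing about other versions or multithreaded modes.

```python
import bz2
import lzma
import time


def make_payload(size=200_000):
    # Hypothetical stand-in for a bootstrap tarball: repetitive path-like text.
    line = b"/nix/store/example-path/bin/tool\n"
    return (line * (size // len(line) + 1))[:size]


def bench(name, compress, decompress, data):
    # Compress once, then time a single decompression of the result.
    packed = compress(data)
    start = time.perf_counter()
    unpacked = decompress(packed)
    elapsed = time.perf_counter() - start
    assert unpacked == data  # round trip must be lossless
    return {
        "name": name,
        "size": len(packed),
        "unpack_s": elapsed,
        # Same input, same library version, compressed a second time.
        "deterministic": compress(data) == packed,
    }


def run(data):
    # Levels are illustrative, not a recommendation.
    return [
        bench("bzip2 level 9", lambda d: bz2.compress(d, 9), bz2.decompress, data),
        bench("xz preset 6", lambda d: lzma.compress(d, preset=6), lzma.decompress, data),
    ]


if __name__ == "__main__":
    for row in run(make_payload()):
        print(row)
```

Timings from a single run on synthetic data are noisy; a real comparison would use the actual tarball and repeat each measurement.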
| 30 Mar 2022 |
| 31 Mar 2022 |
toonn | Done some initial "benchmarks", nix-build --check always says it's not reproducible with any format. So I guess the source material isn't reproducible yet. Is it still worth making the change from cpio -> tar then? | 13:14:38 |
tpw_rules | for the macos bootstrap? are you sure it's not cpio including inodes or bzip2 including a timestamp? that would fail reproducibility every time | 17:17:45 |
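One way to separate "the archive format leaks metadata" from "the inputs themselves differ" is to build an archive with all metadata pinned and compare hashes across runs. The sketch below is an illustration under assumptions, not the Nixpkgs bootstrap process: it uses Python's `tarfile` with the ustar format, the file names and contents are invented, and `deterministic_tar` is a hypothetical helper. cpio headers carry an inode field, whereas ustar headers do not, so only the fields set below need normalizing here.

```python
import hashlib
import io
import tarfile


def deterministic_tar(files):
    """Build an uncompressed ustar archive from a dict of path -> bytes.

    Entries are sorted and every metadata field is pinned, so the output
    depends only on the names and contents.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed entry order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0              # no timestamps
            info.uid = info.gid = 0     # no owner ids
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def digest(blob):
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    # Hypothetical contents; insertion order differs on purpose.
    a = {"bin/sh": b"#!stub\n", "lib/libc.dylib": b"\x00" * 16}
    b = {"lib/libc.dylib": b"\x00" * 16, "bin/sh": b"#!stub\n"}
    print(digest(deterministic_tar(a)) == digest(deterministic_tar(b)))
```

If the hashes of such a normalized archive still differ between builds, the difference is in the file contents rather than in the container format or the compressor.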