!CcTBuBritXGywOEGWJ:matrix.org

NixOS Binary Cache Self-Hosting

175 Members · 61 Servers
About how to host a very large-scale binary cache and more



21 Aug 2023
[08:52:57] @linus:schreibt.jetzt:
attic=# select sum(file_size) / (select sum(nar_size) from nar) from chunk;
        ?column?
------------------------
 0.20098691256561418637
(1 row)
[08:53:15] @linus:schreibt.jetzt: so I guess I'm using 20% as much storage for nars as I would be without dedup
[08:53:30] @linus:schreibt.jetzt: * and compression
[08:54:44] @linus:schreibt.jetzt: of course that isn't entirely representative of how much space the cache takes up, including file metadata and the database
[08:55:09] @linus:schreibt.jetzt: but I think nars would be the bulk of my binary cache, so that's already a useful number ^^
[08:55:07] Julien: Given the state of things with AWS hosting, it would be interesting to see how much of that dedup we could benefit from in the cache.nixos.org binary cache without putting disproportionate computational stress on the server
[08:55:57] @linus:schreibt.jetzt: yeah
[08:56:25] @linus:schreibt.jetzt: I know some people were interested in looking into options there, do you know if anyone's actually doing anything on that front?
[08:57:40] Julien: To be honest, I thought that was what people were doing in the group, but I was only following the conversation from afar. I'm interested in doing experiments in that field, though.
[08:58:26] @linus:schreibt.jetzt: I guess raitobezarius is the most likely to know :D
[09:04:45] @elvishjerricco:matrix.org: Linux Hackerman: Any suggestion on special_small_blocks value? I was thinking 8K since anything <=8K is worst case space efficiency on raidz2 (ashift=12)
[09:04:49] Julien, in reply to @linus:schreibt.jetzt:
> attic=# select sum(file_size) / (select sum(nar_size) from nar) from chunk;
>  0.20098691256561418637
On my own attic deployment I get 0.17052771681498935
[09:05:31] @linus:schreibt.jetzt, in reply to @elvishjerricco:matrix.org:
> Linux Hackerman: Any suggestion on special_small_blocks value? I was thinking 8K since anything <=8K is worst case space efficiency on raidz2 (ashift=12)
sounds reasonable, and I don't think many narinfos will be bigger than 8K
[09:05:39] @elvishjerricco:matrix.org: plus I think optane has 512B sectors, so ashift=9 on that vdev šŸ˜Ž
[09:05:39] @linus:schreibt.jetzt: but are you going to use raidz2 on your special too?
[09:05:53] @elvishjerricco:matrix.org: you can't use raidz on special I'm pretty sure
[09:06:41] @elvishjerricco:matrix.org: ashift=9 on the special just means that small things get even smaller
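The setup being discussed could be sketched roughly as follows. This is a hypothetical illustration, not the participants' actual commands: the pool name `tank` and the device paths are placeholders, and the mirrored topology reflects the point above that raidz is not used for special vdevs.

```shell
# Sketch only: pool and device names are placeholders.
# Add the two Optane devices as a mirrored special vdev, with ashift=9
# to match their 512B sectors (per the discussion above).
zpool add -o ashift=9 tank special mirror /dev/nvme0n1 /dev/nvme1n1

# Route metadata plus any data block <= 8K to the special vdev; 8K chosen
# because records that small are worst-case space efficiency on raidz2
# with ashift=12.
zfs set special_small_blocks=8K tank
```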
[09:07:03] @linus:schreibt.jetzt, in reply to @elvishjerricco:matrix.org:
> you can't use raidz on special I'm pretty sure
ah yeah
[09:08:00] @elvishjerricco:matrix.org: I've only got two of these optanes in here, so my redundancy level is sort of different. But it's optane, so it should resilver fast in the event of a failure, and they should be durable as hell. So it's not too big a worry
[09:08:31] @linus:schreibt.jetzt: yeah, and losing a binary cache isn't the end of the world I think?
[09:08:43] @linus:schreibt.jetzt: and you'd have backups of any more important stuff, right? ;)
[09:08:45] @elvishjerricco:matrix.org: right; granted there's other stuff I like on this pool, but none of it matters
[09:08:53] @elvishjerricco:matrix.org: and yes :P
[09:09:28] @elvishjerricco:matrix.org: other machine has a single very big hard drive as a send/recv target
[09:09:47] @linus:schreibt.jetzt, in reply to Julien:
> On my own attic deployment I get 0.17052771681498935
you can also replace file_size with chunk_size to see the ratio for dedup-only (without compression)
[09:09:55] @linus:schreibt.jetzt: which is like 47% for me
[09:10:05] @linus:schreibt.jetzt: * which is like 45% for me
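Both ratios can be computed in one query against attic's Postgres database. This is a sketch assuming the `attic` database with the `chunk` and `nar` tables from the session above; the `chunk_size` column name comes from the message just before this.

```shell
# Dedup-only vs. dedup+compression ratios for an attic deployment.
# Assumes the "attic" database and the chunk/nar schema shown above.
psql attic -c "
select sum(chunk_size) / (select sum(nar_size) from nar) as dedup_only,
       sum(file_size)  / (select sum(nar_size) from nar) as dedup_and_compression
from chunk;
"
```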
[09:11:33] Julien, in reply to @linus:schreibt.jetzt:
> which is like 45% for me
Yeah, 43%
[09:12:15] Julien: So this is still a big storage saving
[11:24:55] @elvishjerricco:matrix.org:
Linux Hackerman: Oh wow, yeah. Tested out copying a big closure from the cache, and the "querying path" stuff was very noticeably stop-and-go and took a good amount of time.

I've now added optane special metadata and migrated the data with a send/recv to and from the same pool. I used a completely different closure to make sure it wasn't just remembering the cache hit, and that querying part is effectively instant now
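The same-pool send/recv migration mentioned above can be sketched like this. The dataset name `tank/cache` is a placeholder; the point of the rewrite is that received data is allocated fresh, so existing metadata and small blocks land on the new special vdev.

```shell
# Sketch only: "tank/cache" is a placeholder dataset name.
zfs snapshot tank/cache@migrate
zfs send tank/cache@migrate | zfs recv tank/cache.new

# Once the copy is verified, swap the datasets and clean up.
zfs rename tank/cache tank/cache.old
zfs rename tank/cache.new tank/cache
zfs destroy -r tank/cache.old
```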


