21 Aug 2023 |
| Christina Sørensen joined the room. | 08:50:44 |
@linus:schreibt.jetzt | In reply to @julienmalka:matrix.org Do we have measures of the total deduplication of an attic cache ? attic doesn't currently have a built-in command for it but it should be pretty easy to get from the db | 08:51:14 |
@linus:schreibt.jetzt | attic=# select sum(file_size) / (select sum(nar_size) from nar) from chunk;
?column?
------------------------
0.20098691256561418637
(1 row)
| 08:52:57 |
@linus:schreibt.jetzt | so I guess I'm using 20% as much storage for nars as I would be without dedup | 08:53:15 |
@linus:schreibt.jetzt | *and compression | 08:53:30 |
@linus:schreibt.jetzt | of course that isn't entirely representative of how much space the cache takes up including file metadata and the database | 08:54:44 |
@linus:schreibt.jetzt | but I think nars would be the bulk of my binary cache, so that's already a useful number ^^ | 08:55:09 |
Julien | Given the state of things with AWS hosting it would be interesting to see how much of that dedup we could benefit from in the cache.nixos.org binary cache without putting excessive computational stress on the server | 08:55:07 |
@linus:schreibt.jetzt | yeah | 08:55:57 |
@linus:schreibt.jetzt | I know some people were interested in looking into options there, do you know if anyone's actually doing anything on that front? | 08:56:25 |
Julien | To be honest I thought that was what people were doing in the group, but I was only following the conversation from afar. I’m interested in doing experiments in that area though. | 08:57:40 |
@linus:schreibt.jetzt | I guess raitobezarius is the most likely to know :D | 08:58:26 |
@elvishjerricco:matrix.org | Linux Hackerman: Any suggestion on special_small_blocks value? I was thinking 8K since anything <=8K is worst case space efficiency on raidz2 (ashift=12) | 09:04:45 |
Julien | In reply to @linus:schreibt.jetzt
attic=# select sum(file_size) / (select sum(nar_size) from nar) from chunk;
?column?
------------------------
0.20098691256561418637
(1 row)
On my own attic deployment I get 0.17052771681498935 | 09:04:49 |
@linus:schreibt.jetzt | In reply to @elvishjerricco:matrix.org Linux Hackerman: Any suggestion on special_small_blocks value? I was thinking 8K since anything <=8K is worst case space efficiency on raidz2 (ashift=12) sounds reasonable, and I don't think many narinfos will be bigger than 8K | 09:05:31 |
@elvishjerricco:matrix.org | plus I think optane has 512B sectors, so ashift=9 on that vdev 😎 | 09:05:39 |
@linus:schreibt.jetzt | but are you going to use raidz2 on your special too? | 09:05:39 |
@elvishjerricco:matrix.org | you can't use raidz on special I'm pretty sure | 09:05:53 |
@elvishjerricco:matrix.org | ashift=9 on the special just means that small things get even smaller | 09:06:41 |
@linus:schreibt.jetzt | In reply to @elvishjerricco:matrix.org you can't use raidz on special I'm pretty sure ah yeah | 09:07:03 |
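Note: the "<=8K is worst case on raidz2 (ashift=12)" reasoning can be checked with a little arithmetic. The sketch below follows the raidz allocation rule as I understand it from OpenZFS (data sectors, plus parity sectors per stripe row, rounded up to a multiple of parity + 1). The 6-disk pool width is an assumption, since the chat never states it, and the function names are made up for illustration. Compression and metadata overhead are ignored.

```python
def raidz_asize(psize, ndisks, nparity, ashift):
    """Approximate bytes allocated on a raidz vdev for a block of psize bytes."""
    sector = 1 << ashift
    data = -(-psize // sector)                  # data sectors, rounded up
    rows = -(-data // (ndisks - nparity))       # stripe rows needed
    total = data + nparity * rows               # add parity sectors per row
    unit = nparity + 1
    total = -(-total // unit) * unit            # pad to a multiple of parity + 1
    return total * sector


def sector_rounded(psize, ashift):
    """Bytes allocated per device on a mirror or single disk: round up to the sector size."""
    sector = 1 << ashift
    return -(-psize // sector) * sector


NDISKS = 6  # hypothetical raidz2 width, not taken from the chat

for kib in (4, 8, 12, 16, 128):
    psize = kib * 1024
    asize = raidz_asize(psize, NDISKS, nparity=2, ashift=12)
    print(f"{kib:>4}K block -> {asize // 1024:>4}K allocated, "
          f"efficiency {psize / asize:.2f}")

# A 1K block on the special vdev: 512B sectors versus 4K sectors.
print(sector_rounded(1024, ashift=9), sector_rounded(1024, ashift=12))
```

Under these assumptions, 4K and 8K blocks both land at one third efficiency, and larger blocks do better, which is consistent with picking 8K for special_small_blocks. The last line illustrates why ashift=9 on the special vdev makes small blocks smaller.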
@elvishjerricco:matrix.org | I've only got two of these optanes in here, so my redundancy level is sort of different. But it's optane, so it should resilver fast in the event of a failure, and they should be durable as hell. So it's not too big a worry | 09:08:00 |
@linus:schreibt.jetzt | yeah, and losing a binary cache isn't the end of the world I think? | 09:08:31 |
@linus:schreibt.jetzt | and you'd have backups of any more important stuff, right? ;) | 09:08:43 |
@elvishjerricco:matrix.org | right; granted there's other stuff I like on this pool but none of it matters | 09:08:45 |
@elvishjerricco:matrix.org | and yes :P | 09:08:53 |
@elvishjerricco:matrix.org | other machine has a single very big hard drive as a send/recv target | 09:09:28 |
@linus:schreibt.jetzt | In reply to @julienmalka:matrix.org On my own attic deployment I get 0.17052771681498935 you can also replace file_size with chunk_size to see the ratio for dedup-only (without compression) | 09:09:47 |
@linus:schreibt.jetzt | which is like 47% for me | 09:09:55 |
@linus:schreibt.jetzt | * which is like 45% for me | 09:10:05 |
Julien | In reply to @linus:schreibt.jetzt which is like 45% for me Yeah 43% | 09:11:33 |
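Note: the two ratios discussed above (file_size for dedup plus compression, chunk_size for dedup only) can be reproduced on a toy database. This is a minimal sketch using an in-memory SQLite database rather than attic's real PostgreSQL database. Only the table and column names (`chunk`, `nar`, `file_size`, `chunk_size`, `nar_size`) come from the chat; the rest of the schema and all numbers are made up. The `1.0 *` factor is there because SQLite would otherwise do integer division.

```python
import sqlite3


def dedup_ratios(conn):
    """Return (stored bytes / NAR bytes, unique chunk bytes / NAR bytes)."""
    row = conn.execute(
        """
        select 1.0 * sum(file_size)  / (select sum(nar_size) from nar),
               1.0 * sum(chunk_size) / (select sum(nar_size) from nar)
        from chunk
        """
    ).fetchone()
    return row


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table nar (nar_size integer);
    create table chunk (chunk_size integer, file_size integer);
    """
)
# Made-up data: three NARs totalling 1000 bytes ...
conn.executemany("insert into nar values (?)", [(500,), (300,), (200,)])
# ... stored as two unique chunks: 450 bytes uncompressed, 200 bytes on disk.
conn.executemany("insert into chunk values (?, ?)", [(250, 120), (200, 80)])

stored, deduped = dedup_ratios(conn)
print(f"dedup + compression: {stored:.2f}, dedup only: {deduped:.2f}")
```

As noted earlier in the conversation, neither ratio accounts for file metadata or the database itself.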