| 4 Dec 2025 |
aloisw | Absolutely, the checkpointer needs to read the entire WAL and integrate it into the database. | 19:53:21 |
aloisw | aloisw@exodus ~> ls -lah /mnt/nix/var/nix/db
total 11G
drwxr-xr-x 2 aloisw users 111 Dec 4 20:49 .
drwxr-xr-x 6 aloisw users 79 Dec 4 20:49 ..
-rw------- 1 aloisw users 0 Dec 4 20:49 big-lock
-rw-r--r-- 1 aloisw users 128M Dec 4 20:53 db.sqlite
-rw-r--r-- 1 aloisw users 21M Dec 4 20:53 db.sqlite-shm
-rw-r--r-- 1 aloisw users 11G Dec 4 20:53 db.sqlite-wal
-rw------- 1 aloisw users 8.0M Dec 4 20:49 reserved
-rw-r--r-- 1 aloisw users 2 Dec 4 20:49 schema
Is the problem "the WAL is growing too fast"? | 19:53:52 |
raitobezarius | would that mean checkpointing more frequently would fix that? | 19:55:01 |
raitobezarius | is the checkpoint freq automatically derived? | 19:55:10 |
aloisw | Maybe. It would definitely reduce the latency, but I think it would only help throughput if the WAL stays in cache. It also adds more fsyncs, which can slow you down again. | 19:56:19 |
aloisw | Yes, when the WAL grows too big, as determined by the wal_autocheckpoint pragma. | 19:57:26 |
aloisw | Which Lix sets to 40000 pages, so with 4 KiB pages it should be about 156 MiB. | 19:59:20 |
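The size estimate above can be checked with a short sketch. It assumes the threshold is simply `wal_autocheckpoint` (a page count) multiplied by `page_size`; the database file name and the helper `autocheckpoint_threshold_bytes` are made up for illustration and are not taken from Lix.

```python
import os
import sqlite3
import tempfile


def autocheckpoint_threshold_bytes(conn):
    """Return wal_autocheckpoint (a page count) times page_size (bytes)."""
    pages = conn.execute("PRAGMA wal_autocheckpoint").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return pages * page_size


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical scratch database, not the real Nix store database.
    conn = sqlite3.connect(os.path.join(tmp, "example.sqlite"))
    conn.execute("PRAGMA journal_mode=WAL").fetchone()
    # 40000 is the value quoted in the chat.
    conn.execute("PRAGMA wal_autocheckpoint=40000").fetchone()
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    threshold = autocheckpoint_threshold_bytes(conn)
    conn.close()

print(f"page size {page_size} B, threshold {threshold} B "
      f"= {threshold / 2**20:.2f} MiB")
```

With a 4096-byte page size, 40000 pages is 163,840,000 bytes, or 156.25 MiB. The WAL file itself is slightly larger at that point, because each frame carries a small header in addition to the page.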
aloisw | Hm, but the checkpointer shouldn't block others if I read the docs correctly? | 20:01:12 |
aloisw | It seems that the writers just slow down massively, so possibly this is only indirectly related to the checkpointer falling behind, in that falling behind creates a huge WAL. | 20:09:25 |
Jassuko | Wtf is that WAL size?! :o | 21:06:37 |
raitobezarius | yeah but read perf deteriorates with the size of the WAL | 22:43:19 |
raitobezarius | have you tried a lower value? | 22:43:25 |
raitobezarius | but i guess it's really pesky | 22:44:19 |
raitobezarius | lix by nature in large substitution/builds scenarios is read-write intensive | 22:44:34 |
raitobezarius | but i feel like the fact that lix is blocked by the potential event that the WAL contains a record relevant to it is a mistake given our usage of flock to mark the future happening of a store path | 22:45:19 |
raitobezarius | whereas for writes, it seems it'd be good if we could have multiple WALs, so that once one is committed, the other can still be filled? | 22:45:42 |
raitobezarius | maybe we can improve things by initiating checkpoints ourselves at key points… | 22:46:15 |
raitobezarius | it would be interesting to know if we cause checkpoint starvation | 22:47:19 |
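One possible way to look for checkpoint starvation is to run a PASSIVE checkpoint and compare the two counters it returns. This is a minimal sketch, not how Lix does it: the helper `checkpoint_backlog`, the table `t`, and the file name are all hypothetical, and a single long-lived reader stands in for whatever keeps the real checkpointer from finishing.

```python
import os
import sqlite3
import tempfile


def checkpoint_backlog(conn):
    """Run a PASSIVE checkpoint; return WAL frames still not checkpointed."""
    _busy, wal_frames, checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(PASSIVE)"
    ).fetchone()
    return wal_frames - checkpointed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.sqlite")  # hypothetical scratch database
    # isolation_level=None: transactions are controlled explicitly below.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL").fetchone()
    # Disable automatic checkpoints so only the explicit ones below run.
    writer.execute("PRAGMA wal_autocheckpoint=0").fetchone()
    writer.execute("CREATE TABLE t (x)")
    writer.execute("INSERT INTO t VALUES (1)")

    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    reader.execute("SELECT count(*) FROM t").fetchall()  # pins a snapshot

    # This write lands in the WAL after the reader's snapshot.
    writer.execute("INSERT INTO t VALUES (2)")
    backlog_with_reader = checkpoint_backlog(writer)

    reader.execute("COMMIT")  # release the snapshot
    backlog_without_reader = checkpoint_backlog(writer)

    reader.close()
    writer.close()

print("backlog with reader:", backlog_with_reader)
print("backlog without reader:", backlog_without_reader)
```

A backlog that stays above zero across many checkpoints would suggest that some reader always holds an older snapshot, which is the starvation scenario being asked about.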
| 5 Dec 2025 |
aloisw | In reply to @raitobezarius:matrix.org ("have you tried a lower value?"): I have not, but I think the problem is not so much that the size is excessive, but that the checkpointer is never alone so the WAL grows without bound. | 06:26:19 |
aloisw | In reply to @raitobezarius:matrix.org ("but i feel like the fact that lix is blocked by the potential event that the WAL contains a record relevant to it is a mistake given our usage of flock to mark the future happening of a store path"): Thanks for the hint, I will investigate whether this has any influence on the situation. | 06:27:54 |
aloisw | In reply to @raitobezarius:matrix.org ("maybe we can improve things by initiating checkpoints ourselves at key points…"): SQLite does that because of autocheckpoint, but the problem is that these are non-blocking (PASSIVE) checkpoints. I will investigate tomorrow whether RESTART makes the situation better; it should prevent unbounded WAL growth at the expense of some concurrency, but maybe it's still a net win if the write transactions aren't slowed down so much. | 06:30:51 |
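For reference, a RESTART checkpoint can be issued through the `wal_checkpoint` pragma. The sketch below only illustrates how the pragma reports being blocked by a reader. The helper `restart_checkpoint`, the table `t`, and the 100 ms busy timeout are illustrative assumptions, and it says nothing about whether RESTART is a net win for Lix.

```python
import os
import sqlite3
import tempfile


def restart_checkpoint(conn):
    """Run a RESTART checkpoint; return (blocked, wal_frames, checkpointed)."""
    busy, wal_frames, checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(RESTART)"
    ).fetchone()
    return bool(busy), wal_frames, checkpointed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.sqlite")  # hypothetical scratch database
    # timeout=0.1: wait at most about 100 ms for readers before giving up.
    writer = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL").fetchone()
    writer.execute("PRAGMA wal_autocheckpoint=0").fetchone()
    writer.execute("CREATE TABLE t (x)")
    writer.execute("INSERT INTO t VALUES (1)")

    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    reader.execute("SELECT count(*) FROM t").fetchall()  # pins a snapshot
    writer.execute("INSERT INTO t VALUES (2)")

    # The reader still holds its snapshot, so RESTART cannot complete.
    blocked_result = restart_checkpoint(writer)

    reader.execute("COMMIT")
    # No readers are left, so the whole WAL can be checkpointed.
    clean_result = restart_checkpoint(writer)

    reader.close()
    writer.close()

print("with reader:", blocked_result)
print("without reader:", clean_result)
```

Unlike PASSIVE, RESTART invokes the busy handler and waits for readers, which is where the concurrency cost mentioned above comes from. If the wait times out, the pragma sets the busy flag instead of raising an error.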
aloisw | In reply to @raitobezarius:matrix.org ("whereas for writes, it seems it'd be good if we could have multiple WAL so that once one is committed, the other can be still filled?"): wal2 when | 06:31:04 |
Jassuko | What exactly are you using the DB for in this case? | 09:15:48 |
raitobezarius | In reply to @jassu:kumma.juttu.asia ("What exactly are you using the DB for in this case?"): It's the metadata layer of the Nix store, the actual source of truth | 13:29:25 |
helle (just a stray cat girl) | we should just store it in xattrs (I am extremely joking) | 13:30:08 |
raitobezarius | In reply to @aloisw:julia0815.de ("wal2 when"): Are you Christian? We need Christians to contribute to SQLite | 13:30:40 |