| 21 Oct 2025 |
niksnut | it's a result of having a large contiguous allocation (so it would also affect std::vector<char>) | 20:56:21 |
Sergei Zimmerman (xokdvium) | Yeah, it's much better to allocate once and reuse that allocation as you did there | 20:56:45 |
niksnut | a data type that consists of a vector of buffers would avoid this problem | 20:57:06 |
Sergei Zimmerman (xokdvium) | I did a similar cleanup in filetransfer.cc at some point | 20:57:07 |
Sergei Zimmerman (xokdvium) | You'd still benefit from allocating it only once though | 20:57:23 |
Sergei Zimmerman (xokdvium) | Going through malloc/free is still a function call through PLT and doing that in a loop is kind of expensive anyway | 20:58:05 |
Sergei Zimmerman (xokdvium) | https://llvm.org/docs/ProgrammersManual.html#vector | 20:58:58 |
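[Editor's sketch] The "allocate once and reuse" advice above can be illustrated roughly as follows. This is a minimal sketch, not code from Nix: `copyInBlocks` is a hypothetical helper, and the `memcpy`/`append` pair merely stands in for whatever read and write calls a real loop would make.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of Nix): copy `input` to `out` in blocks,
// using one buffer that is allocated before the loop and reused each round.
std::size_t copyInBlocks(std::string_view input, std::string & out, std::size_t blockSize)
{
    assert(blockSize > 0);
    std::vector<char> buf(blockSize); // the only buffer allocation; no malloc/free per iteration
    std::size_t iterations = 0;
    while (!input.empty()) {
        std::size_t n = std::min(input.size(), buf.size());
        std::memcpy(buf.data(), input.data(), n); // stands in for a read into the buffer
        out.append(buf.data(), n);                // stands in for a write from the buffer
        input.remove_prefix(n);
        ++iterations;
    }
    return iterations;
}
```

Declaring `buf` inside the loop body instead would allocate and free it on every iteration, which is the pattern being discouraged here.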
niksnut | it's funny though that the optimization that skipped parsing 15 GB of NARs (https://github.com/DeterminateSystems/nix-src/pull/238/commits/1f8d587a0df8f9de366640831dade43d17021c30) had basically no observable effect | 21:01:55 |
niksnut | it's completely dwarfed by the memory allocation / page fault overhead | 21:02:07 |
Sergei Zimmerman (xokdvium) | Yeah that will do it certainly. Also having too large a buffer on the stack is bad:
https://github.com/NixOS/nix/pull/13877 | 21:03:27 |
Sergei Zimmerman (xokdvium) | The stack pointer does get decremented one page at a time. Is that the default behavior or some hardening flag? | 21:04:10 |
niksnut | surprising since stack pages should stay around once paged in | 21:07:22 |
niksnut | though there is some overhead to handle guard pages | 21:07:42 |
niksnut | so it has to touch at least 1 byte every 4096 bytes | 21:07:52 |
Sergei Zimmerman (xokdvium) | Yeah that was the overhead. A loop over all the pages | 21:08:28 |
Sergei Zimmerman (xokdvium) | 1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
| 21:08:42 |
niksnut | right, that's to avoid a segfault if you have guard pages enabled (which I think is the default) | 21:09:13 |
niksnut | I would expect the overhead for that loop to be pretty trivial though | 21:09:31 |
niksnut | in the case where the pages are present | 21:09:52 |
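[Editor's sketch] The "vector of buffers" data type mentioned earlier in this thread might look roughly like the following. It is only an illustration under stated assumptions: `ChunkedBuffer`, its member names, and the 64 KiB default chunk size are all hypothetical and do not come from Nix or LLVM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical type (not part of Nix): stores bytes in fixed-size chunks,
// so growing it never requires one large contiguous allocation.
class ChunkedBuffer
{
public:
    explicit ChunkedBuffer(std::size_t chunkSize = 64 * 1024)
        : chunkSize_(chunkSize)
    {
        assert(chunkSize_ > 0);
    }

    // Append data, filling the last chunk before allocating a new one.
    void append(std::string_view data)
    {
        while (!data.empty()) {
            if (chunks_.empty() || used_ == chunkSize_) {
                // Note: make_unique<char[]> zero-initializes the chunk.
                chunks_.push_back(std::make_unique<char[]>(chunkSize_));
                used_ = 0;
            }
            std::size_t n = std::min(data.size(), chunkSize_ - used_);
            std::memcpy(chunks_.back().get() + used_, data.data(), n);
            used_ += n;
            size_ += n;
            data.remove_prefix(n);
        }
    }

    std::size_t size() const { return size_; }
    std::size_t chunkCount() const { return chunks_.size(); }

    // Flatten into one string. Only for small data or tests, since it
    // reintroduces the contiguous allocation this type is meant to avoid.
    std::string str() const
    {
        std::string out;
        out.reserve(size_);
        for (std::size_t i = 0; i < chunks_.size(); ++i) {
            std::size_t n = (i + 1 == chunks_.size()) ? used_ : chunkSize_;
            out.append(chunks_[i].get(), n);
        }
        return out;
    }

private:
    std::size_t chunkSize_;
    std::size_t used_ = 0; // bytes used in the last chunk
    std::size_t size_ = 0; // total bytes stored
    std::vector<std::unique_ptr<char[]>> chunks_;
};
```

Only the small vector of chunk pointers is ever reallocated as the buffer grows; the chunks holding the data stay where they are.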
| 22 Oct 2025 |
tomberek | @niksnut:matrix.org: builtins.fetchTree cannot take advantage of the "__final" optimization. This means usages of flake-compat will re-fetch inputs unnecessarily. Is there a way to expose `prim_fetchFinalTree`? This can create a large performance regression. | 15:11:06 |
niksnut | I think we should allow fetchTree { final = true; ... } | 15:11:43 |
roberth | I get what it does but I never felt like I had a complete understanding somehow. If we were wrong about final we could always design something better without the pressure and call it fetchSource :) | 15:14:03 |
niksnut | final just means it won't add more attributes | 15:17:39 |
tomberek | @roberthensing:matrix.org: is the concern that it would be abused or ossify some behavior? | 15:26:18 |
roberth | I guess I just expected it to be prettier | 15:26:44 |
roberth | sometimes things just aren't, and that's ok | 15:27:35 |
tomberek | I suspect this can get better with lazy paths/trees, but that seems to be further in the future. | 15:28:00 |
roberth | the basic question seems to be: do we want to trust the lock file, and I think usually the answer is yes, final = true; | 15:29:24 |