| 21 Oct 2025 |
Sergei Zimmerman (xokdvium) | Going through malloc/free is still a function call through PLT and doing that in a loop is kind of expensive anyway | 20:58:05 |
Sergei Zimmerman (xokdvium) | https://llvm.org/docs/ProgrammersManual.html#vector | 20:58:58 |
Eelco | it's funny though that the optimization that skipped parsing 15 GB of NARs (https://github.com/DeterminateSystems/nix-src/pull/238/commits/1f8d587a0df8f9de366640831dade43d17021c30) had basically no observable effect | 21:01:55 |
Eelco | it's completely dwarfed by the memory allocation / page fault overhead | 21:02:07 |
Sergei Zimmerman (xokdvium) | Yeah that will do it certainly. Also having too large a buffer on the stack is bad:
https://github.com/NixOS/nix/pull/13877 | 21:03:27 |
Sergei Zimmerman (xokdvium) | The stack pointer does get decremented one page at a time. Is that the default behavior or some hardening flag? | 21:04:10 |
Eelco | surprising since stack pages should stay around once paged in | 21:07:22 |
Eelco | though there is some overhead to handle guard pages | 21:07:42 |
Eelco | so it has to touch at least 1 byte every 4096 bytes | 21:07:52 |
Sergei Zimmerman (xokdvium) | Yeah that was the overhead. A loop over all the pages | 21:08:28 |
Sergei Zimmerman (xokdvium) | 1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
| 21:08:42 |
Eelco | right, that's to avoid a segfault if you have guard pages enabled (which I think is the default) | 21:09:13 |
Eelco | I would expect the overhead for that loop to be pretty trivial though | 21:09:31 |
Eelco | in the case where the pages are present | 21:09:52 |
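The listing above reserves 0x10000 bytes (64 KiB) of stack in 0x1000-byte (4 KiB) steps and touches each page once, so every entry into the function runs 16 probe iterations. The following is a minimal C++ sketch of that situation, not the code from the linked PR: `probeCount`, `sumViaStackBuffer` and `sumViaScratch` are hypothetical names made up for illustration. It assumes the loop comes from compiler stack probing (for example GCC/Clang's `-fstack-clash-protection`); whether that flag is what produced the listing is not confirmed in the discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Sizes taken from the listing: 0x10000 bytes reserved in 0x1000-byte steps.
constexpr std::size_t kPageSize = 0x1000;
constexpr std::size_t kBufSize = 0x10000;

// Number of probe iterations needed to walk a frame of the given size,
// assuming one touch per page.
constexpr std::size_t probeCount(std::size_t frameBytes)
{
    return (frameBytes + kPageSize - 1) / kPageSize;
}

// Hypothetical hot function with a 64 KiB local buffer. When stack probing
// is enabled, the prologue may touch every page of `buf` on each call.
std::uint64_t sumViaStackBuffer(const unsigned char * data, std::size_t len)
{
    unsigned char buf[kBufSize];
    std::uint64_t total = 0;
    while (len > 0) {
        std::size_t n = std::min(len, sizeof(buf));
        std::memcpy(buf, data, n);
        for (std::size_t i = 0; i < n; ++i)
            total += buf[i];
        data += n;
        len -= n;
    }
    return total;
}

// Same work, but the scratch buffer is allocated once by the caller and
// reused, so the function's own stack frame stays small.
std::uint64_t sumViaScratch(
    const unsigned char * data, std::size_t len, std::vector<unsigned char> & scratch)
{
    assert(!scratch.empty());
    std::uint64_t total = 0;
    while (len > 0) {
        std::size_t n = std::min(len, scratch.size());
        std::memcpy(scratch.data(), data, n);
        for (std::size_t i = 0; i < n; ++i)
            total += scratch[i];
        data += n;
        len -= n;
    }
    return total;
}
```

Both functions compute the same result; the difference is only where the 64 KiB lives. Whether the probe loop matters in practice depends on how often the function is called relative to the work done per call, which is the point being debated above.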
| 22 Oct 2025 |
tomberek | @niksnut:matrix.org: builtins.fetchTree cannot take advantage of the "__final" optimization. This means usages of flake-compat will re-fetch inputs unnecessarily. Is there a way to expose `prim_fetchFinalTree`? This can create a large performance regression. | 15:11:06 |
Eelco | I think we should allow fetchTree { final = true; ... } | 15:11:43 |
Robert Hensing (roberth) | I get what it does but I never felt like I had a complete understanding somehow. If we were wrong about final we could always design something better without the pressure and call it fetchSource :) | 15:14:03 |
Eelco | final just means it won't add more attributes | 15:17:39 |
tomberek | @roberthensing:matrix.org: is the concern that it would be abused or ossify some behavior? | 15:26:18 |
Robert Hensing (roberth) | I guess I just expected it to be prettier | 15:26:44 |
Robert Hensing (roberth) | sometimes things just aren't, and that's ok | 15:27:35 |
tomberek | I suspect this can get better with lazy paths/trees, but that seems to be further in the future. | 15:28:00 |
Robert Hensing (roberth) | the basic question seems to be: do we want to trust the lock file, and I think usually the answer is yes, final = true; | 15:29:24 |
Robert Hensing (roberth) | if you can't trust your lock file, you're either editing it by hand, which you shouldn't do, or letting people you don't trust update, which you know, you'd have bigger problems | 15:29:52 |
Robert Hensing (roberth) | I feel like true should probably be the default in a future version of this primop if we have one | 15:30:55 |
Robert Hensing (roberth) | anyway, making final part of the public interface seems fine to me | 15:31:52 |
Robert Hensing (roberth) | iirc a lazier fetchTree would be a mitigation for the performance loss but not a complete fix | 15:32:17 |
tomberek | Would it make sense for the final to be default if fetchTree has been given enough attributes? (If provided with all the info.) | 15:43:55 |