!VRULIdgoKmKPzJZzjj:nixos.org

Nix Hackers

915 Members
For people hacking on the Nix package manager itself
191 Servers


Sender Message Time
19 Sep 2024
@aleksana:mozilla.org Find me at aleksana:qaq.li If it is, then we can try to get it to print a stack trace with debug info 03:54:58
@aleksana:mozilla.org Find me at aleksana:qaq.li No luck 03:57:52
@aleksana:mozilla.org Find me at aleksana:qaq.li
In reply to @k900:0upti.me
https://hydra.nixos.org/build/273031249/nixlog/6
If there's neither a stable reproduction of the segfault nor a stack trace, we're not even completely sure it's caused by a Nix update and not something else like gcc or glibc from a staging-next cycle?
04:04:15
@aleksana:mozilla.org Find me at aleksana:qaq.li x2 no repro 04:07:38
@aleksana:mozilla.org Find me at aleksana:qaq.li x3 04:10:55
@aleksana:mozilla.org Find me at aleksana:qaq.li x4, well I don't think this is gonna work, maybe some stress tests on builtins.concatLists? 04:16:35
@puck:puck.moe puck got it reproduced locally by pure chance; smells like a GC bug, likely to do with concatLists being passed a list of three lists? 04:33:31
@aleksana:mozilla.org Find me at aleksana:qaq.li Note that the function void EvalState::concatLists(Value & v, size_t nrLists, Value * * lists, const PosIdx pos, std::string_view errorCtx) was changed a bit after Nix 2.22, in commit https://github.com/NixOS/nix/commit/fecff520d7ce6598319862efc50c2dc6e1f6e9d9#diff-f118e4c6f6e02148b887fdf627352311fca5a3a4eadf0b4a9d9f348e0be464ffR1949 04:40:25
@aleksana:mozilla.org Find me at aleksana:qaq.li And a mkList helper function was added 04:41:08
@aleksana:mozilla.org Find me at aleksana:qaq.li
In reply to @puck:puck.moe
got it reproduced locally by pure chance; smells like a GC bug, likely to do with concatLists being passed a list of three lists?
Did you have a more minimal reproducer tho?
04:43:19
@puck:puck.moe puck nope! just my local system config 04:44:07
@lily:lily.flowers Lily Foster For what it's worth, i'd been experiencing list corruption errors in CI as well since 2.23~2.24 (i'm not sure when exactly) almost daily in https://github.com/lilyinstarlight/foosteros before i finally gave up CI'ing against cppnix at all. I've no clue if it's related, but here if anyone wants to see an example of a CI run that failed with nonsensical list mis-evaluation weeks ago that then succeeded on a subsequent rerun (and this was happening constantly so i can provide multiple examples. i forget how varied the examples were): https://github.com/lilyinstarlight/foosteros/actions/runs/10822449565/job/30026445898#step:6:4654 04:48:12
@lily:lily.flowers Lily Foster Actually it looks like that CI run specifically was 2.22.1 from the published release binary tarball (https://releases.nixos.org/nix/nix-2.22.1/nix-2.22.1-x86_64-linux.tar.xz): https://github.com/lilyinstarlight/foosteros/actions/runs/10822449565/job/30026445898#step:3:127 04:50:31
@puck:puck.moe puck oof, if that's the same issue i'm a bit worried about possible silent misevaluations 04:51:56
@aleksana:mozilla.org Find me at aleksana:qaq.li It looks like the memory has crossed the boundary but has not crossed the boundary to the outside of the thread, and just happens to read another string? 04:53:25
@aleksana:mozilla.org Find me at aleksana:qaq.li (I am not particularly familiar with this area) 04:53:58
@lily:lily.flowers Lily Foster (i'd originally tested my config against dev builds via https://github.com/nix-community/nix-unstable-installer to catch these bugs early, but eval was so constantly regressed on HEAD for long enough that i also disabled it too several months ago) 04:53:59
@lily:lily.flowers Lily Foster * (i'd originally tested my config against dev builds via https://github.com/nix-community/nix-unstable-installer to catch these bugs early before release, but eval was so constantly regressed on HEAD for long enough that i also disabled it too several months ago) 04:54:13
@lily:lily.flowers Lily Foster (but i might be able to dig up logs from those runs before that too if possible regression points against unreleased commit hash might be helpful) 04:55:39
@puck:puck.moe puck
In reply to @puck:puck.moe
oof, if that's the same issue i'm a bit worried about possible silent misevaluations
looks like in your case the value pointed to in the list passed to concatLists got replaced with another value; but in my case it seems the entire list's elems was replaced with a tApp Value
04:58:40
@aleksana:mozilla.org Find me at aleksana:qaq.li * It looks like the pointer has crossed the boundary but has not crossed the boundary to the outside of the thread, and just happens to read another string? 05:11:08
@k900:0upti.me K900 So what I'm getting here is that no one fully understands the bug yet 05:52:33
@k900:0upti.me K900 And it involves GC 05:52:38
@k900:0upti.me K900 Which brings me back to my original question 05:52:47
@k900:0upti.me K900 Do we revert again 05:52:51
@aleksana:mozilla.org Find me at aleksana:qaq.li How do we make sure that the bug definitely doesn't happen with Nix 2.18 and newer libraries in tree? 06:02:37
@k900:0upti.me K900 Which libraries? 06:07:03
@aleksana:mozilla.org Find me at aleksana:qaq.li
In reply to @k900:0upti.me
Which libraries?
compiler, libc, boehmgc, other stuff
06:12:22
@k900:0upti.me K900 Compiler and libc have not been touched in a long time 06:14:29
@k900:0upti.me K900 And nothing broke there 06:14:33
