19 Sep 2024 |
K900 | https://hydra.nixos.org/build/273031249/nixlog/6 | 03:14:12 |
K900 | Looks like the same failure mode as reported earlier | 03:14:23 |
K900 | It is a channel blocker so if we don't find a solution quickly I'm tempted to revert | 03:17:18 |
aleksana (force me to bed after 18:00 UTC) | In reply to @mightyiam:matrix.org Robert Hensing (roberth): I don't think I can reproduce. I have no clue as to any specific circumstance. Sorry. Is there anything else I can do? Could you post your flake? | 03:39:03 |
aleksana (force me to bed after 18:00 UTC) | Maybe run it like 100 times, with different memory limits | 03:39:39 |
Shahar "Dawn" Or (mightyiam) | The flake we were working on is private, sorry. Not sure what more I can do. Perhaps it will occur again... | 03:53:33 |
aleksana (force me to bed after 18:00 UTC) | I'm running the VM test to see whether it reproduces reliably, or at least has a high chance of triggering the failure | 03:54:26 |
aleksana (force me to bed after 18:00 UTC) | If it does, then we can try to get it to print a stack trace with debug info | 03:54:58 |
aleksana (force me to bed after 18:00 UTC) | No luck | 03:57:52 |
aleksana (force me to bed after 18:00 UTC) | In reply to @k900:0upti.me https://hydra.nixos.org/build/273031249/nixlog/6 If there's neither a stable reproduction of the segfault nor a stack trace, we can't even be completely sure it was caused by a Nix update and not something else, like gcc or glibc from a staging-next cycle? | 04:04:15 |
aleksana (force me to bed after 18:00 UTC) | x2 no repro | 04:07:38 |
aleksana (force me to bed after 18:00 UTC) | x3 | 04:10:55 |
aleksana (force me to bed after 18:00 UTC) | x4, well I don't think this is gonna work, maybe some stress tests on builtins.concatLists? | 04:16:35 |
puck | got it reproduced locally by pure chance; smells like a GC bug, likely to do with concatLists being passed a list of three lists? | 04:33:31 |
aleksana (force me to bed after 18:00 UTC) | Note that the function void EvalState::concatLists(Value & v, size_t nrLists, Value * * lists, const PosIdx pos, std::string_view errorCtx) was changed a bit after Nix 2.22, in commit https://github.com/NixOS/nix/commit/fecff520d7ce6598319862efc50c2dc6e1f6e9d9#diff-f118e4c6f6e02148b887fdf627352311fca5a3a4eadf0b4a9d9f348e0be464ffR1949 | 04:40:25 |
aleksana (force me to bed after 18:00 UTC) | And a mkList helper function was added | 04:41:08 |
aleksana (force me to bed after 18:00 UTC) | In reply to @puck:puck.moe got it reproduced locally by pure chance; smells like a GC bug, likely to do with concatLists being passed a list of three lists? Do you have a more minimal reproducer tho? | 04:43:19 |
puck | nope! just my local system config | 04:44:07 |
Lily Foster | For what it's worth, i'd been experiencing list corruption errors in CI as well since 2.23~2.24 (i'm not sure when exactly) almost daily in https://github.com/lilyinstarlight/foosteros before i finally gave up CI'ing against cppnix at all. I've no clue if it's related, but here if anyone wants to see an example of a CI run that failed with nonsensical list mis-evaluation weeks ago that then succeeded on a subsequent rerun (and this was happening constantly so i can provide multiple examples. i forget how varied the examples were): https://github.com/lilyinstarlight/foosteros/actions/runs/10822449565/job/30026445898#step:6:4654 | 04:48:12 |
Lily Foster | Actually it looks like that CI run specifically was 2.22.1 from the published release binary tarball (https://releases.nixos.org/nix/nix-2.22.1/nix-2.22.1-x86_64-linux.tar.xz): https://github.com/lilyinstarlight/foosteros/actions/runs/10822449565/job/30026445898#step:3:127 | 04:50:31 |
puck | oof, if that's the same issue i'm a bit worried about possible silent misevaluations | 04:51:56 |
aleksana (force me to bed after 18:00 UTC) | It looks like the memory access crossed the list's boundary but did not cross outside the thread's memory, so it just happened to read another string? | 04:53:25 |
aleksana (force me to bed after 18:00 UTC) | (I am not particularly familiar with this area) | 04:53:58 |
Lily Foster | * (i'd originally tested my config against dev builds via https://github.com/nix-community/nix-unstable-installer to catch these bugs early before release, but eval was so constantly regressed on HEAD for long enough that i also disabled it too several months ago) | 04:54:13 |
Lily Foster | (but i might be able to dig up logs from those earlier runs too, if possible regression points against unreleased commit hashes would be helpful) | 04:55:39 |