| 25 May 2021 |
baloo | thing is | 23:49:35 |
baloo | it's reproducible on the same machine | 23:49:43 |
@grahamc:nixos.org | let's patch the kernel to only ever say it is 1 second past the epoch. always. | 23:49:51 |
baloo | but going from amd zen1 to zen3 breaks it | 23:49:56 |
andi- | Are they counting how long an instruction takes? | 23:50:27 |
andi- | And then try different versions of the same code? | 23:50:41 |
baloo | well ... looks like it, but the thing is: the documentation and the developers say no | 23:51:00 |
baloo | there is a specific build profile to get machine-optimized code | 23:51:25 |
baloo | and this is not the one we had | 23:51:33 |
baloo | (machine-optimized is based off perf) | 23:51:53 |
andi- | What profiles are there and can we maybe just specify one that is empty/trivial to reason about? | 23:52:11 |
andi- | Something like a no-op profile to figure out if that is reproducible | 23:52:43 |
baloo | well, there is the non-optimized (the one we now run, which is reproducible), the profiledbootstrap (the one we used to run), and the autoprofiled (perf based, which is deemed non-reproducible) | 23:54:48 |
andi- | Are we running the PGO part with just one build job at a time? I just came across a random pastebin stating it would get "confused" otherwise.... | 23:55:02 |
baloo | tried that without luck | 23:55:36 |
baloo | (tried to inject -j1 in the training stage (the one capturing the feedback)) | 23:56:16 |
baloo | my plan from this point, is to set up the build on two machines, and run it stage by stage | 23:56:34 |
baloo | and compare the intermediate result from one machine to the other | 23:56:43 |
baloo | and fingers crossed find the issue | 23:56:53 |
baloo | but ... that's a lot of work | 23:57:04 |
andi- | I wonder if the kernel version leaks into the builds. | 23:57:50 |
baloo | same kernel version | 23:58:00 |
baloo | (on the one I tried) | 23:58:11 |
andi- | two machines with the same CPU result in the same output? | 23:58:20 |
andi- | (might be harder for us to test unless we rent some machines temporarily) | 23:58:33 |
baloo | I haven't tried | 23:58:33 |
baloo | yeah I can rent two machines I guess | 23:58:46 |
andi- | That might be something grahamc (he/him) could probably help with. Packet boxes might be really nice for this. | 23:59:15 |
baloo | yeah :D | 23:59:31 |
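
The plan baloo describes above (run the build stage by stage on two machines and compare the intermediate results) could be sketched roughly as follows. This is only an illustration, not anything used in the conversation: the helper names `hash_tree` and `diff_manifests` are hypothetical, and it assumes each stage leaves its outputs in a directory that can be hashed on each machine.

```python
import hashlib
import os


def hash_tree(root):
    """Map each file path (relative to root) to a SHA-256 digest of its content."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Record the link target instead of following the link.
                digest = "symlink:" + os.readlink(path)
            else:
                h = hashlib.sha256()
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
                digest = h.hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def diff_manifests(a, b):
    """Return the sorted paths that are missing on one side or differ in content."""
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Saving the manifest (for example with `json.dump`) after each stage on both machines and running `diff_manifests` on each pair would point at the first stage whose outputs diverge.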