| 25 May 2021 |
baloo | but getting there is slower | 23:33:46 |
baloo | the gcc build measures the compilation of gcc itself to determine hotpaths | 23:34:27 |
baloo | and feeds that to the compilation to reorder the output assembly | 23:34:43 |
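[Editor's note: the self-training build described above is GCC's `profiledbootstrap` make target. A minimal sketch of the invocation, with illustrative paths and configure flags (not the exact nixpkgs wiring):]

```shell
mkdir build && cd build
../gcc/configure --prefix=/opt/gcc --disable-multilib  # flags illustrative
# stage2 is built instrumented, compiles GCC itself to collect profile
# feedback, and stage3 is rebuilt using that feedback
make profiledbootstrap
```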
andi- | is that different from their PGO? | 23:46:19 |
baloo | that's the pgo | 23:47:45 |
baloo | and I can't understand why it's not reproducible | 23:47:58 |
andi- | For the python PGO stuff I spotted a few time.time() calls which might make the profile output depend on the current time. Sometimes you might have more odd seconds or whatever which might slightly affect the formatting code or what not | 23:48:54 |
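[Editor's note: a toy illustration of andi-'s point — hypothetical code, not CPython's actual PGO workload. Any branch executed during the training run whose direction depends on `time.time()` changes the collected branch counts, so two training runs yield different profiles and hence differently-optimized binaries:]

```python
import time

def render(t=None):
    # Hypothetical formatting routine: during PGO training, whether the
    # "odd second" branch is hot depends on the wall clock at build time,
    # so the recorded profile varies from one training run to the next.
    t = time.time() if t is None else t
    if int(t) % 2:
        return f"{t:.3f}s (odd second)"
    return f"{t:.3f}s"
```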
baloo | huum | 23:49:33 |
baloo | thing is | 23:49:35 |
baloo | it's reproducible on the same machine | 23:49:43 |
@grahamc:nixos.org | let's patch the kernel to only ever say it is 1 second past the epoch. always. | 23:49:51 |
baloo | but going from amd zen1 to zen3 breaks it | 23:49:56 |
andi- | Are they counting how long an instruction takes? | 23:50:27 |
andi- | And then try different versions of the same code? | 23:50:41 |
baloo | well ... looks like it, but the thing is: the documentation and the developers say no | 23:51:00 |
baloo | there is a specific build profile to get machine-optimized code | 23:51:25 |
baloo | and this is not the one we had | 23:51:33 |
baloo | (machine-optimized is based off perf) | 23:51:53 |
andi- | What profiles are there and can we maybe just specify one that is empty/trivial to reason about? | 23:52:11 |
andi- | Something like a no-op profile to figure out if that is reproducible | 23:52:43 |
baloo | well, there is the non-optimized (the one we now run, which is reproducible), the profiledbootstrap (the one we used to run), and the autoprofiled (perf based, which is deemed non-reproducible) | 23:54:48 |
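[Editor's note: the three profiles listed above map roughly onto these GCC build modes. A sketch only — the exact nixpkgs invocations, tool flags, and file names here are assumptions:]

```shell
make bootstrap            # non-optimized: plain three-stage build, reproducible
make profiledbootstrap    # self-training PGO: the mode that broke across zen1 -> zen3

# autoprofiled (AutoFDO): train under perf with branch records, convert the
# samples, rebuild; tied to the training machine, hence deemed non-reproducible
perf record -b -- ./xgcc -O2 some-input.c
create_gcov --binary=./xgcc --profile=perf.data --gcov=xgcc.gcov
make BOOT_CFLAGS='-O2 -fauto-profile=xgcc.gcov'
```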
andi- | Are we running the PGO part with just one build job at a time? I just came across a random pastebin stating it would get "confused" otherwise.... | 23:55:02 |
baloo | tried that without luck | 23:55:36 |
baloo | (tried to inject -j1 in the training stage (the one capturing the feedback)) | 23:56:16 |
baloo | my plan from this point, is to set up the build on two machines, and run it stage by stage | 23:56:34 |
baloo | and compare the intermediate result from one machine to the other | 23:56:43 |
baloo | and, fingers crossed, find the issue | 23:56:53 |
baloo | but ... that's a lot of work | 23:57:04 |
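[Editor's note: the stage-by-stage comparison could be sketched like this — fingerprint every object file a stage produced, then diff the lists across the two machines. Directory layout and file names are illustrative, with demo files standing in for real build output:]

```shell
#!/bin/sh
set -eu
stage_dir=demo-stage                    # stand-in for e.g. build/stage2-gcc
mkdir -p "$stage_dir"
printf 'object-a' > "$stage_dir/a.o"    # demo files in place of real .o output
printf 'object-b' > "$stage_dir/b.o"
# hash every object file, sorted by path for a stable, diffable order
find "$stage_dir" -name '*.o' -exec sha256sum {} + | sort -k 2 > stage.sha
# on the real setup: run the same on both machines, then
#   diff stage-zen1.sha stage-zen3.sha
```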
andi- | I wonder if the kernel version leaks into the builds. | 23:57:50 |
baloo | same kernel version | 23:58:00 |