| 4 Apr 2024 |
Steven Keuchel | Here are my estimates
On the pioneer:
Unregisterised release+profiled_libs: >30h
Unregisterised quick+no_profiled_libs: 18h
Registerised release+profiled_libs: 12h
Registerised quick+no_profiled_libs: 9h
Using qemu user-mode
Registerised release+profiled_libs: 8h
Registerised quick+no_profiled_libs: 6h
| 12:24:11 |
Alex | GHC is quite tricky to compile, so I'd be pleasantly surprised if Sulong were capable of handling it.
Historically, using Hugs to run GHC on itself has been an option, but AFAIK Hugs doesn't support 64-bit ISAs and it also has a relatively low limit on program size that makes bootstrapping GHC even on x86 a nightmare. I don't know what it would take to support RV64GC and I haven't explored patching Hugs to raise the program size limitations. | 12:24:58 |
Alex | Also Hugs requires an ancient version of GCC. | 12:25:47 |
Alex | Looking into Sulong, apparently it's not a Haskell compiler/interpreter but an LLVM bitcode interpreter?
That doesn't seem suitable for compiling GHC (Haskell code) from source.
LLVM bitcode isn't the problem here. | 12:29:37 |
Shalok Shalom | Graal and Sulong are able to produce a native image of Haskell code | 12:41:42 |
Shalok Shalom | Graal provides two runtimes: JVM and Truffle. Sulong is the LLVM implementation on Truffle | 12:42:09 |
Shalok Shalom | Hugs is even older than Eta, so I doubt very much it can compile any modern Haskell code at all? | 12:42:35 |
Pratham Patel (you can mention me) | In reply to @skeuchel:matrix.org
Here are my estimates
On the pioneer:
Unregisterised release+profiled_libs: >30h
Unregisterised quick+no_profiled_libs: 18h
Registerised release+profiled_libs: 12h
Registerised quick+no_profiled_libs: 9h
Using qemu user-mode
Registerised release+profiled_libs: 8h
Registerised quick+no_profiled_libs: 6h
Yeah, the multi-core interconnects are only there to connect the cores, not much more, i.e. not like how 64 cores are interconnected on Threadrippers/EPYCs.
So here, qemu emulation on x86 will be faster tbh
| 12:43:10 |
Steven Keuchel | In reply to @thefossguy:matrix.org
Yeah, the multi-core interconnects are only there to connect the cores, not much more, i.e. not like how 64 cores are interconnected on Threadrippers/EPYCs.
So here, qemu emulation on x86 will be faster tbh
Most of the stuff I compile is quicker on the pioneer than under user-mode emulation, so there's still something GHC-specific to it. Compiling w/o libnuma? Larger caches on x86? More "symbolic computations" in comparison to gcc? | 12:57:44 |
Pratham Patel (you can mention me) | There's obviously a lot of moving parts to this :) | 12:58:43 |
Pratham Patel (you can mention me) | What I meant to say was, you're not actually using all 64 cores on the pioneer "efficiently" because the interconnects aren't that good. It's a first-gen product. Impressive that they could even pull it off in a first-gen product, though. | 12:59:42 |
Alex | In reply to @shalokshalom:kde.org
Hugs is even older than Eta, so I doubt very much it can compile any modern Haskell code at all?
It doesn't need to. It only needs to be able to interpret an old version of GHC, then the build can work its way up to a modern GHC. | 13:43:26 |
Shalok Shalom | Yeah, true. | 14:03:45 |
Shalok Shalom | Well then, Eta might be a choice. It has a native Haskell compiler for 7 and even some features of 8, probably better than Hugs 🤷 | 14:05:07 |
| rtunreal joined the room. | 15:23:24 |
| jopejoe1 (4094@epvpn) joined the room. | 16:22:35 |
tau | Redacted or Malformed Event | 18:06:30 |
tau | sorry not what i meant to post | 18:06:52 |
tau | i finally got a build failure that was just tests | 18:07:10 |
tau | this is going to make things difficult,,, | 18:07:19 |
tau | catch2-3.5.2-riscv64-linux | 18:07:20 |
sorear | the C910 has two classes of issues:
- non-standard features for things that eventually became standard, i.e. RVV 0.7.1, MAEE instead of Svpbmt (needed for PCIe because Intel didn't distinguish BARs by memory type, irrelevant otherwise), T-Head performance counters, etc. None of this affects software that doesn't opt in to using the pre-standard feature, and none of it is morally different from any other non-standard extension, which are likely to be ubiquitous
- ordinary bugs - unreasonably slow contended memory access without fences, wrong decoding of noncanonical fences, wrong FP underflow flag - there is no reason to believe that future, more complicated cores will have fewer total bugs, even if they fix the current bugs, and also no evidence that the C910's crop of bugs can cause successful builds with miscompilations
| 18:24:49 |
sorear | I can't imagine a consistent standard which would (a) rule out the use of the C910 for non-test builds (b) not rule out every other piece of hardware which exists in the past and future for the same reason | 18:25:50 |
Pratham Patel (you can mention me) | The SiFive U74 cores in the Unmatched and the VF2 are pretty spec compliant AFAIK | 18:30:07 |
sorear | are those the ones that can't correctly handle a sfence instruction with a nonzero virtual address and get the trap PC wrong when you jump to a negative noncanonical VA? | 18:32:03 |
sorear | sifive is better at communicating errata in english, I'll give them that much | 18:32:36 |
sorear | I specifically want to avoid a policy which requires no public errata, because that will just push us towards vendors that treat all errata as trade secrets | 18:33:04 |
Pratham Patel (you can mention me) | me not know that, I’ll look into that tomorrow | 18:34:07 |
Alex | In reply to @hive:the-apothecary.club
catch2-3.5.2-riscv64-linux
I've already encountered this build failure. Here's a fix (probably suboptimal, but it works). | 19:18:21 |
tau | thanks :fold | 19:19:44 |