| 17 May 2025 |
dramforever | which may be true | 17:08:25 |
@lotte:chir.rs | the first test was 128 bit ID generation (nanosecond system clock, 128 bit multiply, and an atomic memory access), the cm4 did it in 140ns average, the vf2 in 1.114μs average (~8x slower)
1MB of base64 encoding? 4.383ms on the cm4, 12.122ms on the vf2 (~3x slower) | 17:08:38 |
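
For reference, the two slowdown factors quoted above can be rechecked from the raw timings. This is only a small sketch: the dictionary layout and the `slowdown` helper are made up for illustration, and the only inputs are the four timings from the message.

```python
# Timings quoted in the message above, converted to nanoseconds.
timings_ns = {
    "128-bit ID generation": {"cm4": 140.0, "vf2": 1_114.0},
    "1 MB base64 encode": {"cm4": 4_383_000.0, "vf2": 12_122_000.0},
}


def slowdown(cm4_ns, vf2_ns):
    """Return how many times slower the vf2 is than the cm4 (hypothetical helper)."""
    return vf2_ns / cm4_ns


# One slowdown factor per benchmark.
ratios = {name: slowdown(t["cm4"], t["vf2"]) for name, t in timings_ns.items()}

for name, ratio in ratios.items():
    print(f"{name}: vf2 is {ratio:.2f}x slower than cm4")
```

Rounded, these come out to roughly 8x and 3x, which matches the figures in the message.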
dramforever | it also just does not have vector instructions | 17:09:00 |
@lotte:chir.rs | lemme try openssl speed | 17:09:02 |
dramforever | so base64 checks out depending on implementation | 17:09:15 |
@lotte:chir.rs | the rust base64 crate | 17:09:23 |
dramforever | i have no idea what this compiles down to | 17:14:29 |
dramforever | but anyway don't expect to be able to replicate the speed ration on random benchmarks | 17:15:29 |
dramforever | * but anyway don't expect to be able to replicate the speed ratio on random benchmarks | 17:15:40 |
dramforever | not happening | 17:16:16 |
dramforever | as for uuid generation i wonder if it's going through vdso on arm and a syscall on riscv | 17:17:57 |
@lotte:chir.rs |
Doing 4096 bits private rsa sign ops for 10s: 142 4096 bits private RSA sign ops in 9.93s
| 17:22:52 |
@lotte:chir.rs | so fast | 17:22:53 |
dramforever | yeah not having vector instructions really helps | 17:23:48 |
@lotte:chir.rs | results are in ✨ | 17:30:16 |
@lotte:chir.rs | https://docs.google.com/spreadsheets/d/1xvuzBbQaWIGIrmKiYEHSkABQkhmKc8WNKTZWFyVgvqA/edit?usp=sharing | 17:30:17 |
@lotte:chir.rs | in general it seems that the single core performance of the vf2 is about half that of the cm4 | 17:30:45 |
@lotte:chir.rs | presumably both have working hardware aes, but the sha256 implementation seems to be 3x slower than on the cm4 | 17:31:40 |
Alex | Assuming the system is properly configured to use the hardware crypto (on either system). | 17:32:24 |
@lotte:chir.rs | default settings | 17:33:03 |
@lotte:chir.rs | the cm4 is running kde wayland and yes it is not that fast ™️ | 17:33:33 |
dramforever | In reply to @lotte:chir.rs presumably both have working hardware aes, but the sha256 implementation seems to be 3x slower than on the cm4 NOPE | 17:42:31 |
dramforever | no hardware AES on vf2 | 17:42:36 |
@lotte:chir.rs | how the hell is the cm4 slower then | 17:42:55 |
dramforever | i also doubt the crypto engine on vf2 is even supposed to be fast | 17:43:00 |
dramforever | magic | 17:43:17 |
dramforever | In reply to @dramforever:matrix.org no hardware AES on vf2 okay tbc i mean aes instructions | 17:44:21 |
dramforever | maybe they are using the crypto engine on vf2 and i was mistaken about its speed? | 17:46:00 |
@lotte:chir.rs | i’m still sad about the B extension in riscv because it had some Very Interesting instructions | 17:46:33 |
dramforever | zba_zbb is really nice | 17:47:06 |