Evaluate: Benchmarks to compare with/without this proposal #3

Open
alexcrichton opened this issue Aug 14, 2024 · 5 comments

Comments

@alexcrichton
Collaborator

Original development of this proposal benchmarked the blind-sig benchmark in Sightglass as well as the fibonacci benchmarks from the Rust num-bigint repository.

This issue is intended to serve as a place for others to drop interesting benchmark programs so they can be collected to help evaluate this proposal over time. If you've got a benchmark you'd like to see added, it would ideally be written in C or Rust at this time and have a means of self-reporting its execution time. High-level ideas are OK too, but would require some more work to turn into a reproducible benchmark.
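For anyone unsure what "self-reporting its execution time" could look like, here is a minimal Rust sketch. It assumes a target where `std::time::Instant` is available (WASI, for example), and the names `fib_wrapping`, `time_it`, and `report` are hypothetical helpers invented for illustration, not part of Sightglass or num-bigint.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in workload: iterative Fibonacci with wrapping u64 adds.
fn fib_wrapping(n: u32) -> u64 {
    let (mut a, mut b) = (0u64, 1u64);
    for _ in 0..n {
        let next = a.wrapping_add(b);
        a = b;
        b = next;
    }
    a
}

/// Runs `workload` `iters` times and returns the total wall-clock time.
/// `black_box` keeps the optimizer from discarding the unused results.
fn time_it<F: FnMut() -> u64>(iters: u32, mut workload: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(workload());
    }
    start.elapsed()
}

/// Formats the self-reported timing line a benchmark could print.
fn report(name: &str, iters: u32, elapsed: Duration) -> String {
    format!("{name}: {iters} iterations in {elapsed:?}")
}
```

A benchmark's `main` could then print `report("fib", 1000, time_it(1000, || fib_wrapping(black_box(90))))`, which lets the same binary be compared natively and under a wasm runtime.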

@alexcrichton
Collaborator Author

One area where benchmarks would be particularly interesting is programs that require good performance of overflowing/saturating/checked arithmetic that isn't related to 128-bit operations. This would help stress the need for either 128-bit operations or overflow-flag-returning instructions.
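To make the three flavors concrete, here is a rough Rust sketch of the kind of inner loops such a benchmark might exercise. The function names are hypothetical and the workloads are deliberately trivial; a real benchmark would need something far more representative.

```rust
/// Checked: stops and returns `None` as soon as an addition overflows.
fn checked_sum(values: &[i32]) -> Option<i32> {
    values.iter().try_fold(0i32, |acc, &v| acc.checked_add(v))
}

/// Saturating: clamps at `i32::MIN` / `i32::MAX` instead of wrapping.
fn saturating_sum(values: &[i32]) -> i32 {
    values.iter().fold(0i32, |acc, &v| acc.saturating_add(v))
}

/// Overflowing: wraps, but inspects the overflow flag on every step
/// and counts how often it was set.
fn overflowing_sum(values: &[u32]) -> (u32, usize) {
    let mut sum = 0u32;
    let mut overflows = 0usize;
    for &v in values {
        let (next, overflowed) = sum.overflowing_add(v);
        sum = next;
        if overflowed {
            overflows += 1;
        }
    }
    (sum, overflows)
}
```

Each loop consumes an overflow flag per operation, which is the pattern where an overflow-flag-returning instruction could plausibly matter.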

@alexcrichton
Collaborator Author

One suggestion here is that -ftrapv injects checked arithmetic into C code, and UBSan might rely heavily on such checks. A naive benchmark built without this proposal, however, didn't show much performance difference relative to native.

@marcusdarmstrong

Just found out about this proposal today, but I've got a strong benchmark candidate here if you're still looking: XXH3 is critically reliant on a wide u64 mul operation for small input sizes, and thus our existing manual WASM implementation performs worse than the older, non-vectorized XXH64 algorithm.

@CryZe

CryZe commented Nov 18, 2024

Yeah, it's not just XXH3: rustc-hash, foldhash, aHash, wyhash, rapidhash, MUM Hash, umash and many more all rely on 128-bit widening multiplication. That's basically all the fastest non-failing hashing algorithms in the SMHasher benchmarks that don't use AES.

@alexcrichton
Collaborator Author

Thanks @marcusdarmstrong and @CryZe! It'll be a bit easier to test and confirm in a few months once rustc and LLVM both support wide-arithmetic, but i64.mul_wide_u should be perfect for 128-bit widening multiplication. Historical benchmarks have all shown that the wasm instructions are suitable for matching native performance in these situations.
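For reference, the 64x64 to 128-bit widening multiply these hashes lean on can be written in plain Rust roughly as below. This is only a sketch: `mul_wide_u` and `folded_multiply` are hypothetical helper names (the first borrowed from the instruction), the (low, high) tuple order is a choice made here, and the mixing step is a generic folded multiply rather than the exact function of any hash named above. Whether the `u128` multiply actually lowers to i64.mul_wide_u depends on toolchain support.

```rust
/// Full 128-bit product of two u64 values, returned as (low, high) halves.
fn mul_wide_u(a: u64, b: u64) -> (u64, u64) {
    let product = (a as u128) * (b as u128);
    (product as u64, (product >> 64) as u64)
}

/// Generic "folded multiply" mixing step: XOR the two halves of the
/// 128-bit product back into a single 64-bit value.
fn folded_multiply(a: u64, b: u64) -> u64 {
    let (lo, hi) = mul_wide_u(a, b);
    lo ^ hi
}
```

Without a widening multiply instruction, the `u128` product has to be assembled from several 64-bit multiplies and additions, which is why small-input hashing makes a good stress case.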
