Draft: Better performance for larger inputs #70

Open
wants to merge 2 commits into
base: master

Conversation

malvidin
Contributor

Encoding and decoding should be faster for inputs of ~30 bytes and larger. It is much faster at 256 bytes, and it should be able to handle inputs in the MB range.

@keis
Owner

keis commented Dec 1, 2022

This is interesting for sure. But I don't think adding this complexity for the case of bigger payloads really makes sense.

result.append(mod)

- return b'\0' * (origlen - newlen) + bytes(reversed(result))
+ return acc.to_bytes(origlen - newlen + (acc.bit_length() + 7) // 8, "big")
Owner

This change alone seems to have a pretty good impact and even simplifies the code 😍
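
For readers following the diff, here is a minimal sketch of why a single `int.to_bytes` call can stand in for the `divmod` loop. The function names and the `pad` parameter (standing in for `origlen - newlen`) are hypothetical and are not taken from the PR.

```python
def bytes_via_loop(acc: int, pad: int) -> bytes:
    """Peel off one byte at a time with divmod, then reverse (the older style)."""
    result = []
    while acc > 0:
        acc, mod = divmod(acc, 256)
        result.append(mod)
    # pad restores the leading zero bytes that were stripped before decoding
    return b"\0" * pad + bytes(reversed(result))


def bytes_via_to_bytes(acc: int, pad: int) -> bytes:
    """Ask the integer for its big-endian bytes directly."""
    # (bit_length + 7) // 8 is the minimal number of bytes needed to hold acc;
    # adding pad makes to_bytes emit the leading zero bytes as well.
    return acc.to_bytes(pad + (acc.bit_length() + 7) // 8, "big")
```

Both functions should return the same bytes for any non-negative `acc`, including `acc == 0`, where the result is just `pad` zero bytes.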

@malvidin
Contributor Author

malvidin commented Dec 3, 2022

I am decoding base58 inputs that are ~200 characters long (~150 bytes) that I do not control, and want to mitigate the potential for performance issues if larger values are seen in the future.

The nested loop is O(n^2), and it moves closer to O(n^3) for large inputs. gmpy2 is consistently fast, but it is slower for inputs that are about half the size of a Bitcoin address.

When gmpy2 is not used, splitting the inputs approximately in half on pre-computed powers of 2, 45, or 58 moves the cost closer to O(n^2*log(n)). Because the nested loop is kept clear of the Karatsuba cutoff, the large integer multiplications that do use Karatsuba are performed in a divide and conquer manner.
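
To illustrate the splitting idea, here is a rough sketch of a divide and conquer base58-to-integer decode. It is not the code from this PR: the function names and the `cutoff` value are assumptions chosen for illustration, and it recomputes `58 ** len(low)` on each call rather than using pre-computed powers.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_int_loop(s: str) -> int:
    """Digit-by-digit decode; every step multiplies the whole accumulator."""
    acc = 0
    for ch in s:
        acc = acc * 58 + INDEX[ch]
    return acc


def decode_int_split(s: str, cutoff: int = 32) -> int:
    """Decode by splitting the digit string roughly in half.

    cutoff is a hypothetical threshold below which the simple loop is used.
    """
    if len(s) <= cutoff:
        return decode_int_loop(s)
    mid = len(s) // 2
    high, low = s[:mid], s[mid:]
    # value = high * 58^len(low) + low: one multiplication of two large,
    # similarly sized integers instead of many small-by-large ones.
    return decode_int_split(high, cutoff) * 58 ** len(low) + decode_int_split(low, cutoff)
```

Both functions compute the same integer; the split version only changes the shape of the multiplications, so that the operands are of comparable size.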

The change adds complexity, but it provides better performance for inputs up to ~2MB.

Add optional gmpy2.mpz for even faster encode/decode
Add longer random benchmark
Apply Black code style
Quiet mypy errors
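
As a sketch of how an optional `gmpy2.mpz` dependency can be wired in, the pattern below falls back to the built-in `int` when gmpy2 is not installed. The names `big_int` and `decode_int` are hypothetical, and this is not necessarily how the commit structures it.

```python
try:
    # gmpy2 is optional; mpz is its arbitrary-precision integer type
    from gmpy2 import mpz as big_int
except ImportError:
    # Fall back to the built-in int when gmpy2 is not available
    big_int = int

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def decode_int(s: str) -> int:
    """Decode a base58 string to an integer using whichever type is available."""
    acc = big_int(0)
    for ch in s:
        acc = acc * 58 + ALPHABET.index(ch)
    # Convert back so callers always receive a plain Python int
    return int(acc)
```

Callers see the same result either way; only the integer type used for the intermediate arithmetic changes.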