Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up encode() #20

Merged
merged 5 commits into from
Jul 15, 2024
Merged

Speed up encode() #20

merged 5 commits into from
Jul 15, 2024

Conversation

samtregar
Copy link
Contributor

This PR improves the speed of encode() by more than 2x. On a corpus of test data consisting of a mix of text and HTML totaling 476kb I get these results before my change:

Encoding: 1000 runs took 2.4 sec - 414.72 iterations per second

And after my change:

Encoding: 1000 runs took 0.9 sec - 1054.39 iterations per second

The test data is in the data directory here:

https://github.com/samtregar/zoomascii

The way this change works is by scanning ahead for runs of characters that do not need to be encoded and pushing them directly into the output. This is very common in the kinds of text that QP is used on, so it provides a big speedup. Encoding speed for QP encoding is important for applications like mass email sending, for example.

If it seems useful I can also commit my benchmarking binary, or I can include it on this PR if you'd like to play with it but don't think it's worth including in the repo.

@samtregar
Copy link
Contributor Author

samtregar commented Jul 14, 2024

Question - would you consider a bit of unsafe code to speed this up further? I can get another 50% speedup with something like this:

result.push_str(unsafe { &std::str::from_utf8_unchecked(&input[index..index + run_len]) });

That's better than pushing each byte onto the result one by one. It's safe if the index-finding logic is right and if the code ensures that that range contains only valid utf8 bytes. I believe we can rely on both, but I do understand if unsafe code doesn't seem worth it.

After that change, this version is just as fast as zoomascii, which uses a C implementation.

@staktrace
Copy link
Owner

Question - would you consider a bit of unsafe code to speed this up further?

Sorry, but no. I'd like to stay away from adding any unsafe code.

I spent some more time looking at this patch and while I believe it works correctly, it is still somewhat confusing to me. The confusion I have arises from the fact that run_len can be larger than max_run_len in the case I asked about above - when limit - on_line == 3, max_run_len gets set to 0 and run_len is initalized to 1. While I can't come up with an actual case this breaks on, it just seems like a hazard somehow. I'm trying to figure out if there's a way to tighten this up a bit and accomplish the same thing using just the while loop rather than if+while.

@staktrace
Copy link
Owner

In the end I left your structure as I couldn't find a cleaner way to combine the if+while, but I did adjust max_run_len to avoid the hazard.

Thank you for the patch!

@staktrace staktrace merged commit 8a550af into staktrace:master Jul 15, 2024
7 checks passed
@samtregar
Copy link
Contributor Author

In the end I left your structure as I couldn't find a cleaner way to combine the if+while, but I did adjust max_run_len to avoid the hazard.

Thank you for the patch!

Thank you! This was my first Rust project, happy it turned out to be useful!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants