optimise rfc3339 (and rfc2822) #844
a2e61fc to 431307c
Nice results! Is the datetime_from_str regression stable? Why did you move all that code out of the Debug impls into an inherent method?
I was just adding some comments to the code 😅
The regression is a false positive. It's a little flaky on my hardware, but I can't find any conclusive evidence that it's slower on either branch: it hovers between 122ns and 127ns for both 0.4.x and this branch.
Ok, I'm done tinkering 🙏
Looks great, it's nice to specialize the to_rfc* implementations, and that is an impressive performance uplift.
So can you explain what changes caused the majority of the performance improvements? In particular I'm still unclear why it's good for performance to move the
Using the fmt methods needs a Formatter type, and we don't always have access to one in our implementations. Currently in std there's no way to use fmt into a String without using dynamic dispatch. This is what the generic I'm no longer sure whether the byte arrays provide a large speed benefit, but they are the only way to avoid the dyn overhead of integer writing, so they remain for now. Initially the idea was that writing into a static buffer would reduce the number of allocation checks and potential reallocs, although since we specify the capacity up front I believe this is negligible.
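To make the trade-off described in this comment concrete, here is a minimal sketch, not the code from this PR. It contrasts formatting through `std::fmt::Write` (the `write!` path the comment describes as going through dynamic dispatch) with pushing ASCII digits straight into a pre-sized `String`. The helper names `push_two_digits`, `time_via_fmt` and `time_direct` are hypothetical and exist only for illustration.

```rust
use std::fmt::Write;

/// Hypothetical helper: append `n` (assumed 0..=99) as two zero-padded digits.
fn push_two_digits(out: &mut String, n: u8) {
    debug_assert!(n < 100);
    out.push((b'0' + n / 10) as char);
    out.push((b'0' + n % 10) as char);
}

/// Formats HH:MM:SS through the std formatting machinery.
fn time_via_fmt(h: u8, m: u8, s: u8) -> String {
    let mut out = String::with_capacity(8);
    write!(out, "{:02}:{:02}:{:02}", h, m, s).expect("writing to a String cannot fail");
    out
}

/// Formats HH:MM:SS by pushing digits directly, with no Formatter involved.
fn time_direct(h: u8, m: u8, s: u8) -> String {
    // Capacity is reserved up front, so the pushes below never reallocate.
    let mut out = String::with_capacity(8);
    push_two_digits(&mut out, h);
    out.push(':');
    push_two_digits(&mut out, m);
    out.push(':');
    push_two_digits(&mut out, s);
    out
}
```

Both functions produce the same text; whether the direct version is measurably faster depends on the workload and should be checked with a benchmark rather than assumed.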
I've just stumbled upon ufmt, a dyn-free formatting library, and it shows similar gains: https://docs.rs/ufmt/latest/ufmt/#benchmarks Their
Okay, I reviewed this in more detail and I think all the changes here are good, but I would like to have it split up into smaller commits in order to review the changes more carefully. Probably one commit for moving things into write_into(); I also added a bunch of suggestions for things I think should be separate commits (separate PRs are also fine if you prefer).
I also think the docstrings you added need some work. Please write docstrings that describe the invariants/interface the function provides, instead of adding comments-as-docstrings. Preferably the first line of a docstring can stand alone and is no longer than a single line.
I would also like some benchmarks that compare writing into a byte array and then copying into a String with writing directly into the String (which should probably reserve capacity up front). I'm definitely not a fan of the write_utf8_bytes() strategy...
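The comparison requested here could be sketched roughly as follows. This is an illustrative stand-in, not the benchmark from the PR: `date_via_byte_array`, `date_direct` and `time_it` are hypothetical names, the year is assumed to be in 0..=9999, and a real measurement would use a proper benchmark harness instead of the crude `Instant` loop shown.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Strategy A: fill a stack byte array, then copy it into a String.
fn date_via_byte_array(year: u16, month: u8, day: u8) -> String {
    let mut buf = [b'-'; 10]; // separators at indices 4 and 7 stay as '-'
    buf[0] = b'0' + (year / 1000 % 10) as u8;
    buf[1] = b'0' + (year / 100 % 10) as u8;
    buf[2] = b'0' + (year / 10 % 10) as u8;
    buf[3] = b'0' + (year % 10) as u8;
    buf[5] = b'0' + month / 10;
    buf[6] = b'0' + month % 10;
    buf[8] = b'0' + day / 10;
    buf[9] = b'0' + day % 10;
    let mut out = String::with_capacity(10);
    out.push_str(std::str::from_utf8(&buf).expect("ASCII bytes are valid UTF-8"));
    out
}

/// Strategy B: push each character directly into a pre-sized String.
fn date_direct(year: u16, month: u8, day: u8) -> String {
    let mut out = String::with_capacity(10);
    out.push((b'0' + (year / 1000 % 10) as u8) as char);
    out.push((b'0' + (year / 100 % 10) as u8) as char);
    out.push((b'0' + (year / 10 % 10) as u8) as char);
    out.push((b'0' + (year % 10) as u8) as char);
    out.push('-');
    out.push((b'0' + month / 10) as char);
    out.push((b'0' + month % 10) as char);
    out.push('-');
    out.push((b'0' + day / 10) as char);
    out.push((b'0' + day % 10) as char);
    out
}

/// Crude timing loop; `black_box` keeps the optimiser from deleting the work.
fn time_it(f: fn(u16, u8, u8) -> String, iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f(black_box(2023), black_box(1), black_box(5)));
    }
    start.elapsed()
}
```

Calling `time_it(date_via_byte_array, 1_000_000)` and `time_it(date_direct, 1_000_000)` gives a first rough comparison; the numbers are noisy and only meaningful in a release build.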
Thank you for working on this!
c23831b to 8cc9249
8cc9249 to c820718
I did some experimentation around this, and found:
@djc the commits are all there now, in small logical chunks, with the benchmark results in the commit messages. Hope that helps.
This is looking great, thanks!
Thanks for this @conradludgate - I'm happy with this as is, but I've left a few comments to discuss prior to merging.
@djc @conradludgate Any updates? (looking forward to the release #850 👀 )
5145475 to e42fc89
e42fc89 to c6bb0d1
Thanks for sticking with it!
We use to_rfc3339() a lot in our observability libraries at TrueLayer. Upon profiling, I noticed that it took up a large portion of time, so I looked into optimising it.
I was able to do more optimisations, but the requirements of year formatting make the code pretty tricky. I can attempt it in a later PR if this is accepted.
In our observability code I know that the year is in 0..=9999 and the timezone is Utc, so I got a bit more gain from cheating 😅
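As an illustration of the kind of shortcut described above, here is a hedged sketch of a fast path that only handles four-digit years in UTC and otherwise signals that a general formatter is needed. It is not the author's code: `rfc3339_utc_fast` and `write2` are hypothetical names, the fields are passed as plain integers rather than chrono types, and it emits the `Z` designator that RFC 3339 allows for UTC.

```rust
/// Hypothetical helper: write `n` (assumed 0..=99) as two digits at `buf[at..at + 2]`.
fn write2(buf: &mut [u8; 20], at: usize, n: u8) {
    debug_assert!(n < 100);
    buf[at] = b'0' + n / 10;
    buf[at + 1] = b'0' + n % 10;
}

/// Fast path: returns None when the year does not fit in four digits,
/// so the caller can fall back to a general-purpose formatter.
fn rfc3339_utc_fast(year: i32, month: u8, day: u8, hour: u8, min: u8, sec: u8) -> Option<String> {
    if !(0..=9999).contains(&year) {
        return None;
    }
    let y = year as u32;
    // Template with all separators in place; only the digits are overwritten.
    let mut buf = *b"0000-00-00T00:00:00Z";
    buf[0] = b'0' + (y / 1000) as u8;
    buf[1] = b'0' + (y / 100 % 10) as u8;
    buf[2] = b'0' + (y / 10 % 10) as u8;
    buf[3] = b'0' + (y % 10) as u8;
    write2(&mut buf, 5, month);
    write2(&mut buf, 8, day);
    write2(&mut buf, 11, hour);
    write2(&mut buf, 14, min);
    write2(&mut buf, 17, sec);
    // The buffer only ever holds ASCII, so the UTF-8 check succeeds.
    String::from_utf8(buf.to_vec()).ok()
}
```

The output length is fixed at 20 bytes, which is what makes the fixed-size template possible; years outside 0..=9999 need a sign and extra digits, which is where the general case becomes trickier.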