-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal iteration for &mut I
#100173
internal iteration for &mut I
#100173
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @scottmcm (or someone else) soon. Please see the contribution instructions for more information. |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit cb7f7ee with merge 3e685715a7ece536b2ab653e3433c06c00454bdf... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was tried several times before, the last one being #82185
Perhaps the changes on function.rs make a difference this time.
|
||
impl<'a, I: DoubleEndedIterator + Sized> ByRefRFold for &'a mut I { | ||
#[inline] | ||
default fn try_rfold<B, F, R>(&mut self, init: B, f: F) -> R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more specific impl shouldn't have default
☀️ Try build successful - checks-actions |
Queued 3e685715a7ece536b2ab653e3433c06c00454bdf with parent d77da9d, future comparison URL. |
Finished benchmarking commit (3e685715a7ece536b2ab653e3433c06c00454bdf): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Footnotes |
well, these don't look like the most promising results ^^' |
Presumably this makes compilation slower, but the perf tests don't show the effect on the performance of the compiled code, right? |
are there tests that do? |
Other than the std benches (which aren't great) we don't have anything automated to assess runtime performance. In the rustc-perf suite check and doc builds are the closest since they don't codegen but they're probably not diverse enough. You could try paring down the PR by splitting out some of the changes. E.g. some of the inlining in function.rs doesn't look relevant to iterators. You can also run rustc-perf locally and focus on that one benchmark, that should yield results more quickly (assuming you have a machine that can compile a stage1 rustc in a reasonable amount of time). |
rustc-perf seems to take forever on my machine and i can't display the results after it's finished. so that doesn't seem like a good option for me :/ |
It can be set to run a subset of the benchmarks, e.g. the serde ones. https://github.com/rust-lang/rustc-perf/tree/master/collector#benchmarking-options |
thanks for the tips! i managed to get it working thanks to your help. it seems that the biggest culprit was inlining the ops::function wrappers. |
I'm going to send this over to r? @m-ou-se because I think this is going to be as much a policy decision (about compile-vs-runtime) as it is about the code itself. |
Some changes were reverted, let's get new perf results. @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit f6a3462 with merge c20ee6d211784a78b94e26d37cce4e66acea976a... |
☀️ Try build successful - checks-actions |
Queued c20ee6d211784a78b94e26d37cce4e66acea976a with parent aeb5067, future comparison URL. |
Finished benchmarking commit (c20ee6d211784a78b94e26d37cce4e66acea976a): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Footnotes |
r? @the8472 |
The compile-time perf numbers are slightly negative, but less so than the previous attempt to do this. But we need some runtime benchmark numbers to verify that it brings the expected benefits. There are some core::iter benchmarks that I'd expect to show some speedup. @rustbot author |
@sarah-ek any updates on this? |
Closing this as inactive. Feel free to reöpen this pr or create a new pr if you get the time to work on this. Thanks |
this pr implements internal iteration for
&mut I
whenI: Sized
. it additionally inlines some wrapper functions that were not previously inline, which seems to speed things up by a fair amount in some cases.this lead to up to 3x performance gains across the board for
iter::
benches, with only a minor regression foriter::bench_filter_sum