-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safer implementation of RepeatN #130887
base: master
Are you sure you want to change the base?
Safer implementation of RepeatN #130887
Conversation
This comment has been minimized.
This comment has been minimized.
huh, I'm no llvm expert, buuut...
The other functions ( |
Maybe you can ping scottmcm (test author) to see if the change in codegen result is acceptable (then adjust the test.) |
@Soveu any updates on the failed test? thanks |
It is pretty neat, and in fact it's the implementation I originally wrote for this. The biggest issue I had with it then, though, is that it makes the iterator be potentially-undef, but because of the #130141 bug (which reminds me to go push on rust-lang/rfcs#3712 ) we're stuck with that problem anyway, so there's probably a way you could make this work. Looks like the problem is that it's having trouble figuring out that the store of zero and the store of count-1 can be merged, because otherwise there'd be no difference. Basically, what this test is testing is that for Nothing immediately jumps to mind for how to get LLVM to know that, though. Maybe try a separate I'd suggest copying the implementation to godbolt and playing around a bit there until you find an incantation that works. |
I have tried some things when first writing this code, with no effect, but the idea with fn next(&mut self) -> Option<A> {
let inner = self.inner.as_mut()?;
let count = inner.count.get();
if let Some(new_count) = NonZero::<usize>::new(count - 1) {
let tmp = inner.element.clone();
inner.count = new_count;
return Some(tmp);
}
return if core::mem::needs_drop::<A>() {
self.take_element()
} else {
let tmp = inner.element.clone();
self.inner = None;
Some(tmp)
};
} Godbolt: https://rust.godbolt.org/z/Kr198x635 LLVM output:
LLVM output for the current std implementation:
From what I see, |
That is interesting, just switching variables made it work 🤔 fn next(&mut self) -> Option<A> {
let inner = self.inner.as_mut()?;
let count = inner.count.get();
if let Some(new_count) = NonZero::<usize>::new(count - 1) {
// This must be in this order!
let tmp = inner.element.clone();
inner.count = new_count;
return Some(tmp);
}
return self.take_element();
} |
Definitely odd, but that's why we have codegen tests :P |
This comment has been minimized.
This comment has been minimized.
What that means is that it ended up getting a different layout -- originally it was storing the count at address, but apparently it flipped the order. I don't have a strong feeling about which order is better, but if you want you could put |
…epeatn, r=<try> Use `iter::repeat_n` to implement `Vec::extend_with` This replaces the `Vec::extend_with` manual implementation, which is used by `Vec::resize` and `Vec` `SpecFromElem`, with `iter::repeat_n`. I've compared the codegen output between: 1. the current `Vec::resize` impl 2. this branch 3. this branch + rust-lang#130887 3 gives the closest codegen output to 1, with some output improvements. 2 doesn't look good: https://rust.godbolt.org/z/Yrc83EhjY. May also help rust-lang#120050?
@rustbot label +S-waiting-on-review -S-waiting-on-author |
…epeatn, r=<try> Use `iter::repeat_n` to implement `Vec::extend_with` This replaces the `Vec::extend_with` manual implementation, which is used by `Vec::resize` and `Vec` `SpecFromElem`, with `iter::repeat_n`. I've compared the codegen output between: 1. the current `Vec::resize` impl 2. this branch 3. this branch + rust-lang#130887 3 gives the closest codegen output to 1, with some output improvements. 2 doesn't look good: https://rust.godbolt.org/z/Yrc83EhjY. May also help rust-lang#120050? --- WARNING: DO NOT MERGE - in order to run the perf run in rust-lang#133662 (comment) this PR currently also contains commits from rust-lang#130887
FYI we're also playing around with this change at #133662 since Compiler Explorer showed that the two MRs combined made |
Nice, this reminds me of my article on Rust Magazine: VecDeque::resize() optimization |
I've seen the "Use MaybeUninit for RepeatN" commit while reading This Week In Rust and immediately thought about something I've written some time ago - https://github.com/Soveu/repeat_finite/blob/master/src/lib.rs.
Using the fact, that
Option
will find niche in(T, NonZeroUsize)
, we can construct something that has the same size as(T, usize)
while completely getting rid ofMaybeUninit
.This leaves only
unsafe
onTrustedLen
, which is pretty neat.