Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid creating Boxes of uninitalized values in RawVec #61230

Merged
merged 3 commits into from
May 28, 2019

Conversation

matklad
Copy link
Member

@matklad matklad commented May 27, 2019

RawVec<bool>::into_box is definitely instant UB, if not all values are initialized.

See https://gankro.github.io/blah/initialize-me-maybe/

@rust-highfive
Copy link
Collaborator

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 27, 2019
@matklad
Copy link
Member Author

matklad commented May 27, 2019

r? @RalfJung

@rust-highfive rust-highfive assigned RalfJung and unassigned KodrAus May 27, 2019
@matklad
Copy link
Member Author

matklad commented May 27, 2019

Note: I haven't actually audited the uses of this method, but it seems unlikely that they aren't fine, b/c we run tests under miri.

src/liballoc/raw_vec.rs Outdated Show resolved Hide resolved
@RalfJung
Copy link
Member

Note: I haven't actually audited the uses of this method, but it seems unlikely that they aren't fine, b/c we run tests under miri.

I am not sure how great our test coverage is here.^^ Also, "invalid data in a Box" is (like "invalid data behind a reference") an open question, and Miri currently does not check validity behind references and boxes. That would be extremely expensive.

So, it might be worth mentioning that the UB this talks about is a still-under-discussion point, similar to what I did in MaybeUninit:

(Notice that the rules around uninitialized integers are not finalized yet, but until they are, it is advisable to avoid them.)

And it's probably worth auditing the uses. There are not many (at least on in liballoc+libstd).

@matklad
Copy link
Member Author

matklad commented May 27, 2019

And it's probably worth auditing the uses.

Heh, you are 100% right. Here's at least one case where we do copy_from_slice for an slice of uninit values:

rust/src/liballoc/boxed.rs

Lines 549 to 550 in be10e62

let mut boxed = unsafe { RawVec::with_capacity(slice.len()).into_box() };
boxed.copy_from_slice(slice);
.

Should that code be (optimistically) rewritten to use RawVec::ptr and ptr::copy_nonoverlapping, or should we wait until we figure out if materializing an &mut[undef] is ok?

@RalfJung
Copy link
Member

I'd prefer rewriting it -- also to see what the ergonomic impact is.

@matklad matklad changed the title Fix misleading comment on RawVec WIP: avoid creating Boxes of uninitalized values in RawVec May 27, 2019
@matklad matklad changed the title WIP: avoid creating Boxes of uninitalized values in RawVec avoid creating Boxes of uninitalized values in RawVec May 27, 2019
@matklad matklad force-pushed the ub-comment branch 2 times, most recently from 83847db to 83d85e5 Compare May 27, 2019 08:10
@matklad
Copy link
Member Author

matklad commented May 27, 2019

That usage turned out the singe one where we expose uninit memory

Things I've looked at
10:45:55|~/projects/rust/tt-parser/src/liballoc|ub-comment✓
λ rg into_box\( -F
vec.rs
638:            buf.into_box()

raw_vec.rs
696:    pub unsafe fn into_box(self) -> Box<[T]> {

boxed.rs
402:            from_boxed_utf8_unchecked(buf.into_box())
553:            buf.into_box()
876:        return unsafe { new.into_box() };
885:            unsafe fn into_box(self) -> Box<[T]> {
888:                raw.into_box()

11:03:17|~/projects/rust/tt-parser/src/liballoc|ub-comment⚡*
λ cd ../libstd/

11:05:06|~/projects/rust/tt-parser/src/libstd|ub-comment⚡*
λ rg into_box\( -F
ffi/os_str.rs
344:        let rw = Box::into_raw(self.inner.into_box()) as *mut OsStr;
663:        let rw = Box::into_raw(s.inner.into_box()) as *mut OsStr;

sys_common/os_str_bytes.rs
117:    pub fn into_box(self) -> Box<Slice> {
160:    pub fn into_box(&self) -> Box<Slice> {

sys_common/wtf8.rs
358:    pub fn into_box(self) -> Box<Wtf8> {
640:    pub fn into_box(&self) -> Box<Wtf8> {

sys/windows/os_str.rs
112:    pub fn into_box(self) -> Box<Slice> {
113:        unsafe { mem::transmute(self.inner.into_box()) }
153:    pub fn into_box(&self) -> Box<Slice> {
154:        unsafe { mem::transmute(self.inner.into_box()) }

I've rewritten the docs to just make "elements must be initialized" the invariant of the method itself, with a copy-pasted note that the language-level rule is under construction.

Don't know if the patch works: compiling LLVM 😭

@matklad
Copy link
Member Author

matklad commented May 27, 2019

tests pass locally

unsafe {
ptr::copy_nonoverlapping(self.as_ptr(), buf.ptr(), len);
from_boxed_utf8_unchecked(buf.into_box())
from_boxed_utf8_unchecked(self.as_bytes().into())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But where does this even allocate and copy now...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the From<[u8]> for Box<[u8]> impl, which is also adjusted in this PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, the impl I am talking about is the generic one over T

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you call Box<_>::from(...) instead of .into() to make that clearer? And maybe move that out of the unsafe block to make it clear that that is not a weird thing that "steals" self or so.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, fixed

@RalfJung
Copy link
Member

Thanks!

@bors r+

@bors
Copy link
Contributor

bors commented May 27, 2019

📌 Commit fe31ad3 has been approved by RalfJung

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 27, 2019
Centril added a commit to Centril/rust that referenced this pull request May 28, 2019
avoid creating Boxes of uninitalized values in RawVec

`RawVec<bool>::into_box` is definitely instant UB, if not all values are initialized.

See https://gankro.github.io/blah/initialize-me-maybe/
bors added a commit that referenced this pull request May 28, 2019
Rollup of 9 pull requests

Successful merges:

 - #61084 (Clarify docs for unreachable! macro)
 - #61220 (Added error message for E0284)
 - #61227 (Use .await syntax instead of await!)
 - #61230 (avoid creating Boxes of uninitalized values in RawVec)
 - #61237 (Updated the Iterator docs with information about overriding methods.)
 - #61241 (Check place iterative)
 - #61242 (Make dest_needs_borrow iterate instead of recurse)
 - #61247 (Make eval_place iterate instead of recurse)
 - #61248 (Use Place::local)

Failed merges:

r? @ghost
bors added a commit that referenced this pull request May 28, 2019
Rollup of 9 pull requests

Successful merges:

 - #61084 (Clarify docs for unreachable! macro)
 - #61220 (Added error message for E0284)
 - #61227 (Use .await syntax instead of await!)
 - #61230 (avoid creating Boxes of uninitalized values in RawVec)
 - #61237 (Updated the Iterator docs with information about overriding methods.)
 - #61241 (Check place iterative)
 - #61242 (Make dest_needs_borrow iterate instead of recurse)
 - #61247 (Make eval_place iterate instead of recurse)
 - #61248 (Use Place::local)

Failed merges:

r? @ghost
@bors bors merged commit fe31ad3 into rust-lang:master May 28, 2019
@matklad matklad deleted the ub-comment branch May 28, 2019 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants