Disarm mem::uninitialized by having it initialize to an arbitrary valid value for each type #87675
Comments
What if there is no valid value? For example, with `enum Never {}`, the line `let x: Never = unsafe { core::mem::uninitialized() };` currently helpfully panics with
I suspect it would not be a problem for it to continue to panic when used with uninhabited types.
An
Sure, the intrinsic could be exposed via something like
@rustbot claim
@rustbot release-assignment
I am taking another shot at something like this in #99182, with a slightly different approach than #87032:
It's not quite a "valid" value, but that is not entirely possible anyway: we promise to LLVM that references are dereferenceable (whether they are used or not), so many existing uses of
mem::uninitialized: mitigate many incorrect uses of this function

Alternative to rust-lang/rust#98966: fill memory with `0x01` rather than leaving it uninit. This is definitely bitwise valid for all `bool` and nonnull types, and also those `Option<&T>` that we started putting `noundef` on. However it is still invalid for `char` and some enums, and on references the `dereferenceable` attribute is still violated, so the generated LLVM IR still has UB -- but in fewer cases, and `dereferenceable` is hopefully less likely to cause problems than clearly incorrect range annotations.

This can make using `mem::uninitialized` a lot slower, but that function has been deprecated for years and we keep telling everyone to move to `MaybeUninit` because it is basically impossible to use `mem::uninitialized` correctly. For the cases where that hasn't helped (and all the old code out there that nobody will ever update), we can at least mitigate the effect of using this API. Note that this is *not* in any way a stable guarantee -- it is still UB to call `mem::uninitialized::<bool>()`, and Miri will call it out as such.

This is somewhat similar to rust-lang/rust#87032, which proposed to make `uninitialized` return a buffer filled with 0x00. However:
- That PR also proposed to reduce the situations in which we panic, which I don't think we should do at this time.
- The 0x01 bit pattern means that nonnull requirements are satisfied, which (due to references) is the most common validity invariant.

`@5225225` I hope I am using `cfg(sanitize)` the right way; I was not sure for which ones to test here.

Cc rust-lang/rust#66151
Fixes rust-lang/rust#87675
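To make the claims about the `0x01` fill pattern concrete, here is a small sketch that checks them using only safe code. The helper names (`filled_u32`, `fill_is_valid_bool`, and so on) are made up for illustration and are not part of the PR or the standard library; the sketch only inspects the bit pattern and does not call `mem::uninitialized` itself.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: a `u32` whose four bytes are all `0x01`,
/// mimicking the fill pattern described in the commit message.
pub fn filled_u32() -> u32 {
    u32::from_ne_bytes([0x01; 4])
}

/// `true` is represented by the byte `0x01`, so a `0x01`-filled
/// `bool` holds a valid value.
pub fn fill_is_valid_bool() -> bool {
    (true as u8) == 0x01
}

/// Every byte of the pattern is non-zero, so non-zero (and by the
/// same reasoning non-null) requirements are satisfied.
pub fn fill_is_valid_nonzero() -> bool {
    NonZeroU32::new(filled_u32()).is_some()
}

/// `0x0101_0101` is larger than `char::MAX` (`0x10FFFF`), so the
/// pattern is not a valid `char`.
pub fn fill_is_valid_char() -> bool {
    char::from_u32(filled_u32()).is_some()
}
```

Calling these helpers should report the pattern as valid for `bool` and `NonZeroU32` but not for `char`, which matches the "fewer cases, but not zero" framing above. It says nothing about `dereferenceable`, which cannot be checked from safe code like this.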
For a while it has been understood that the `mem::uninitialized` API is broken. Originally the intuitive understanding of this API was that it produced a fixed, arbitrary value. However (as extensively discussed elsewhere), uninitialized memory is not a “fixed, arbitrary value”, and for nearly all types in Rust it is instantaneous undefined behavior for a value to be uninitialized.

What’s worse, even initialized values can be insta-UB. Rust uses its understanding of valid bit patterns to perform layout optimizations whereby invalid values can be repurposed as enum tags, which is how `Option<&T>` can be only a single word. Thus even `mem::uninitialized`’s sibling `mem::zeroed` is insta-UB when used with types like `&T`.

As a result, `mem::uninitialized` was deprecated and replaced with `mem::MaybeUninit`, which avoids the problems of the former. In addition, both `mem::zeroed` and `mem::uninitialized` were altered such that they will attempt to detect (and panic) when used on certain types: the former on types that must not be zero, and the latter on any types with invalid (defined) values.

However, implementing these panic checks caused a great deal of breakage (which arguably is desirable for safety, although still extremely disruptive), and to reduce disruption the check is conservative instead of exhaustive (#66151). Unfortunately, while improving the coverage of these checks will still leave `mem::zeroed` perfectly usable, `mem::uninitialized` will be rendered all but unusable, as essentially no type can ever be in an uninitialized state.

This is a problem for legacy crates that were never migrated away from `mem::uninitialized`. However, there is a solution that both allows these legacy crates to compile and avoids the problem of invalid uninitialized values: `mem::uninitialized` can initialize with a valid value. This may seem contrary to the original intent of the API, but consider that the only reason to avoid initialization is performance, and that the choice is now between “my code doesn’t compile”, “my code contains undefined behavior”, and “my code is slower”; the last is the most desirable outcome of the three.

This raises the question: what value to initialize with? PR #87032 proposed the simplest option, which was to replace the innards of `mem::uninitialized` with `mem::zeroed`; however, zero is the value that is most often used for niche optimizations, so this would still reject a lot of code.

But there is a more desirable alternative. Because Rust understands what values are invalid for a type (it must, in order to perform niche optimizations), it should also understand which values are valid for a type. An intrinsic could be added to the compiler which, given a type, produces an arbitrary valid value of that type. This intrinsic could be used within `mem::uninitialized`, and the existing panic check could be removed. This would allow all code in the wild still using `mem::uninitialized` to compile, and would also avoid all insta-UB related to validity invariants.
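As a rough illustration of the "arbitrary valid value" idea, here is a library-level sketch. It is only an approximation: the proposal is for a compiler intrinsic that works for every type, whereas this uses a trait that has to be implemented by hand. The names `ArbitraryValid` and `disarmed_uninitialized` are hypothetical and do not exist in the compiler or standard library.

```rust
use std::num::NonZeroU8;

/// Hypothetical stand-in for the proposed intrinsic: produce some
/// valid value of `Self`. Which value is returned is unspecified.
pub trait ArbitraryValid: Sized {
    fn arbitrary_valid() -> Self;
}

impl ArbitraryValid for bool {
    fn arbitrary_valid() -> Self {
        false
    }
}

impl ArbitraryValid for u32 {
    fn arbitrary_valid() -> Self {
        0
    }
}

// Zero is the niche for `NonZeroU8`, so a zeroed value would be
// invalid here; any non-zero value works.
impl ArbitraryValid for NonZeroU8 {
    fn arbitrary_valid() -> Self {
        NonZeroU8::new(1).expect("1 is non-zero")
    }
}

// References must be non-null and dereferenceable; the empty string
// literal is one valid choice for `&'static str`.
impl ArbitraryValid for &'static str {
    fn arbitrary_valid() -> Self {
        ""
    }
}

// `None` is always a valid `Option<T>`, whatever `T` is.
impl<T> ArbitraryValid for Option<T> {
    fn arbitrary_valid() -> Self {
        None
    }
}

// Arrays are valid when every element is valid.
impl<T: ArbitraryValid, const N: usize> ArbitraryValid for [T; N] {
    fn arbitrary_valid() -> Self {
        std::array::from_fn(|_| T::arbitrary_valid())
    }
}

/// Sketch of what a "disarmed" `mem::uninitialized` could look like:
/// slower than leaving memory untouched, but never an invalid value.
pub fn disarmed_uninitialized<T: ArbitraryValid>() -> T {
    T::arbitrary_valid()
}
```

In this sketch an uninhabited type such as `enum Never {}` simply has no implementation, so the call fails to compile. That is the trait-based analogue of the panic-on-uninhabited behavior discussed in the comments, not a claim about how the real intrinsic would behave.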