Padding considered uninitialized data when type is transmuted in const fn #131569
Comments
Interestingly, if I replace the slice cast and slice access with a pointer cast and pointer write, I get this error instead:
#![feature(const_ptr_write)]
struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}
impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }
    const fn push_frame_slice(&mut self, slice: &[Frame]) {
        // Capacity is counted in bytes, so compare against the byte size of the frames.
        if self.bytes.len() - self.cursor < slice.len() * std::mem::size_of::<Frame>() {
            panic!("exceeded capacity");
        }
        let mut i = 0;
        while i < slice.len() {
            let frame = slice[i];
            let ptr = unsafe { self.bytes.as_mut_ptr().add(self.cursor) as *mut Frame };
            unsafe { ptr.write(frame) };
            self.cursor += std::mem::size_of::<Frame>();
            i += 1;
        }
    }
}
#[derive(Copy, Clone)]
#[repr(u8)]
enum Frame {
    First(u16),
    Second,
}
const FRAMES: &[Frame] = &[Frame::First(8), Frame::Second];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    buf.push_frame_slice(FRAMES);
    buf.bytes
};
Output
So in other words, the zeroed bytes that are not explicitly part of the enum are considered uninitialized, even if I use |
The diagnostic in your second comment is about the padding between the
Is byte 0, for the Your Is there official documentation that led you to conclude that |
There are two major crates for doing these kinds of reinterpreting reference transmutes, |
No, that's not what I'm saying. I'm saying the padding is interpreted as uninitialized, as you can see in the second example. I don't understand why padding would be considered uninitialized. Moreover, this only occurs at compile time. I'll update my example for clarity. |
I have updated the example. |
@saethlin Using For instance
/// Cast `A` into `B`
///
/// ## Panics
///
/// * This is like [`try_cast`](try_cast), but will panic on a size mismatch.
#[inline]
pub(crate) unsafe fn cast<A: Copy, B: Copy>(a: A) -> B {
    if size_of::<A>() == size_of::<B>() {
        unsafe { transmute!(a) }
    } else {
        something_went_wrong("cast", PodCastError::SizeMismatch)
    }
}
Aka transmute, so there's no way around it. |
Padding should be zeroes, not |
The issue isn't that there is padding, it's that the padding is considered |
Here's the same example with an
Code
use static_assertions::assert_eq_size;
struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}
impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }
    const fn push_u32_slice(&mut self, slice: &[u32]) {
        // Capacity is counted in bytes; each u32 occupies 4 of them.
        if self.bytes.len() - self.cursor < slice.len() * 4 {
            panic!("exceeded capacity");
        }
        let mut i = 0;
        while i < slice.len() {
            let array = slice[i].to_le_bytes();
            let mut j = 0;
            // Copy all 4 bytes of the current word to its own 4-byte slot.
            while j < array.len() {
                self.bytes[self.cursor + i * 4 + j] = array[j];
                j += 1;
            }
            i += 1;
        }
        self.cursor += slice.len() * 4;
    }
}
#[repr(C)]
struct Frame {
    kind: FrameKind,
    data: FrameData,
}
#[repr(u8)]
enum FrameKind {
    First,
    Second,
}
#[repr(C)]
union FrameData {
    a: u16,
    b: (),
}
assert_eq_size!(Frame, u32);
impl Frame {
    const fn cast_slice(slice: &[Frame]) -> &[u32] {
        // SAFETY: We know the slice is valid and casting to bytes should
        // always be valid, even if repr(rust) isn't stable yet.
        unsafe { std::mem::transmute(slice) }
    }
}
const FRAMES: &[Frame] = &[
    Frame {
        kind: FrameKind::First,
        data: FrameData {
            a: 8,
        },
    },
    Frame {
        kind: FrameKind::Second,
        data: FrameData {
            b: (),
        },
    },
];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    let bytes = Frame::cast_slice(FRAMES);
    buf.push_u32_slice(&bytes);
    buf.bytes
};
Output
Same issue, where the padding in the union is considered uninitialized. Edit1: Added collapsible code to reduce clutter |
And here's the same example using a struct with padding:
Code
use static_assertions::assert_eq_size;
struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}
impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }
    const fn push_u32_slice(&mut self, slice: &[u32]) {
        // Capacity is counted in bytes; each u32 occupies 4 of them.
        if self.bytes.len() - self.cursor < slice.len() * 4 {
            panic!("exceeded capacity");
        }
        let mut i = 0;
        while i < slice.len() {
            let array = slice[i].to_le_bytes();
            let mut j = 0;
            // Copy all 4 bytes of the current word to its own 4-byte slot.
            while j < array.len() {
                self.bytes[self.cursor + i * 4 + j] = array[j];
                j += 1;
            }
            i += 1;
        }
        self.cursor += slice.len() * 4;
    }
}
#[repr(C)]
struct Frame {
    a: u16,
    b: u8,
}
assert_eq_size!(Frame, u32);
impl Frame {
    const fn cast_slice(slice: &[Frame]) -> &[u32] {
        // SAFETY: We know the slice is valid and casting to bytes should
        // always be valid, even if repr(rust) isn't stable yet.
        unsafe { std::mem::transmute(slice) }
    }
}
const FRAMES: &[Frame] = &[
    Frame {
        a: 8,
        b: 1,
    },
    Frame {
        a: 0,
        b: 0,
    },
];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    let bytes = Frame::cast_slice(FRAMES);
    buf.push_u32_slice(&bytes);
    buf.bytes
};
Output
|
So this is not specific to enums: any value with padding that is transmuted will cause this issue, because padding in general is considered uninitialized by the compiler, at least at compile time. |
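To make the padding in the `#[repr(C)]` struct from the example concrete, here is a small sketch that counts the bytes not covered by any field. The constant names `FIELD_BYTES` and `PADDING_BYTES` are made up for illustration and do not appear in the thread.

```rust
use std::mem::size_of;

// Same layout as the struct in the example above.
#[allow(dead_code)]
#[repr(C)]
struct Frame {
    a: u16,
    b: u8,
}

// Bytes that belong to a field (hypothetical helper constant).
const FIELD_BYTES: usize = size_of::<u16>() + size_of::<u8>();

// Whatever is left over is padding. With `repr(C)` the struct is rounded up
// to a multiple of its alignment (2), so one trailing byte is padding.
const PADDING_BYTES: usize = size_of::<Frame>() - FIELD_BYTES;
```

That single trailing byte is the one the const evaluator complains about when the value is read as a `u32`.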
@clubby789 So casting C structs with padding to bytes in Rust is UB? How does that make any sense? |
If that's so, that's kind of dumb, but it can be circumvented by manually specifying the padding bytes in the types. |
If the layout doesn't match then yes. That's what |
@clubby789 Ok, but is this a compiler limitation, or is this the current policy moving forward? Because the current approach sounds arbitrary, as if we can't trust the compiler with padding. For instance:
// this is UB to cast
#[repr(C)]
struct Value1 {
    a: u16,
    b: u8,
}
// this isn't UB because we manually specified the padding
#[repr(C)]
struct Value2 {
    a: u16,
    b: u8,
    _pad: u8,
}
I'm fine either way, I'm just curious. |
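As a minimal sketch of the explicit-padding variant, the following transmutes a `Value2`-style struct to bytes in a constant. Because every byte of the type belongs to a named field, there is no padding left for the const evaluator to reject. The constant names `VALUE` and `BYTES` and the field values are chosen here for illustration.

```rust
#[repr(C)]
#[derive(Clone, Copy)]
struct Value2 {
    a: u16,
    b: u8,
    // Explicit field occupying the byte that would otherwise be padding.
    _pad: u8,
}

const VALUE: Value2 = Value2 { a: 8, b: 1, _pad: 0 };

// All four bytes are initialized field bytes, so this transmute is accepted
// during constant evaluation.
const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };
```

The order of the first two bytes depends on the target's endianness, since `a` is stored in native byte order.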
I would have expected |
I don't know why you'd think that. It's also UB to read padding in C++, as you can see with the C++ example in https://cpp.godbolt.org/z/6Tec45P86, where it's returning |
There's no compiler limitation here. The current compiler behavior has been unchanged for a long time, probably forever.
Whether the transmutes you are trying to do are themselves UB is rust-lang/unsafe-code-guidelines#412. For sure attempting to read a |
In C++ you have multiple methods of initialization. If you zero-initialize, the padding is guaranteed to be zero, and it isn't UB to read it.
In Rust it's UB to read padding even if you zero-initialize with mem::zeroed. |
I don't see how the behavior's age has anything to do with the current issue.
I understand that the issue is due to the compiler reading the individual padding byte. If a memcpy were used, this issue would not occur. |
So if I understand correctly, in Rust every value is believed to be initialized with uninitialized padding, and there is no way to zero-initialize in Rust (or at least make the compiler believe that the padding is zero-initialized). Because of this, casting structs with padding, no matter their representation, is always UB. The solution is to always explicitly specify padding if you want zero-initialization, and in that case casting is not UB. |
From my perspective, it is a compiler limitation that even if you mem::zeroed a C struct, the compiler will act like the padding might be uninitialized. |
This statement is at best misleading. When the value is returned from
Yes,
You keep saying "casting structs" which is not a thing. I linked to a UCG issue which has more information about the reference-to-reference transmutes you are trying to do. I'm closing this issue, because there's no bug here. |
Code
#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}
const VALUE: Value = unsafe { std::mem::zeroed() };
const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };
Output
|
Then how can you possibly zero-initialize anything in a constant? Even using
Code
#![feature(const_ptr_write)]
#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}
const VALUE: Value = {
    let mut value: Value = unsafe { std::mem::zeroed() };
    let ptr = &mut value as *mut Value as *mut u8;
    unsafe { ptr.write_bytes(0u8, 4) };
    value
};
const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };
Output
Is zero-initialization impossible in Rust on a constant? |
We're talking about C structs, and |
TL;DR It isn't for types containing padding, and your code would still be UB if it was possible. Your Thus, while Even if Rust did let you zero-initialize a constant, however, that would be useless: Calling
Is false. There is no Rust equivalent of C-style casting. There's a reason the people you were talking to were pedantic about the terminology. Also, as an FYI, Unsafe Rust is not C, and trying to write it as if it was C is a great way to speedrun UB. (In fact, I'm pretty sure literally just naming The only way in which you can ever read padding without it being immediate UB is if you've written to that padding and then made sure not to move the value. This can for example happen if you Note that this rule doesn't exist to give you an escape hatch where you can just cast stuff to raw bytes. It exists because it follows from other rules. You are simply not supposed to try to read stuff as raw bytes. As for your perspective of it being a compiler limitation, I would like to point out that I would consider a compiler that is always forced to copy/move structs by padding-preserving |
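The "write the padding, then do not move the value" case described above might look roughly like the following at runtime. This is only a sketch under my understanding of the rules: the storage is zeroed as raw memory, the fields are written in place through raw pointers (which touch only the field bytes), and the bytes are read back without ever producing a `Value` by value. The function name `zeroed_bytes_in_place` is hypothetical.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}

fn zeroed_bytes_in_place() -> [u8; 4] {
    // All four bytes of the storage, including the padding slot, start as 0.
    let mut storage = MaybeUninit::<Value>::zeroed();
    let ptr = storage.as_mut_ptr();
    unsafe {
        // Field writes are typed as u16 / u8, so they leave the padding byte alone.
        addr_of_mut!((*ptr).a).write(8);
        addr_of_mut!((*ptr).b).write(1);
        // Read the storage as raw bytes without moving a `Value` anywhere;
        // a typed move or copy of `Value` would be free to drop the padding contents.
        ptr.cast::<[u8; 4]>().read()
    }
}
```

As soon as the value is moved or copied as a `Value` (for example by `assume_init()` followed by a return), the padding byte may no longer be considered initialized, which is why this does not generalize into a byte-cast escape hatch.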
@T-Dark0 Thanks for the explanation. This is unintuitive, but I can understand the reasoning.
Yeah, I guess it's not a runtime cast. I know that Rust isn't C, but when you do weird stuff that isn't supported there's no way around a bunch of unsafe code and potential UB.
In my case I was talking more in the context of constants, where the compiler knows the provenance of the bytes at compile time and could check that it's indeed zeroed. Of course it's a very niche use case. |
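For the compile-time use case, one approach that sidesteps padding entirely is to serialize each frame field by field instead of reinterpreting its memory. The sketch below reuses the `Frame` enum from the thread; the wire format (one tag byte followed by a little-endian `u16` payload) and the names `ENCODED_LEN` and `encode` are assumptions made for this example, not something the thread specifies.

```rust
#[derive(Copy, Clone)]
#[repr(u8)]
enum Frame {
    First(u16),
    Second,
}

// Hypothetical wire format: 1 tag byte + 2 payload bytes per frame.
const ENCODED_LEN: usize = 3;

// Builds the bytes from the field values, so no padding byte is ever read.
const fn encode(frame: Frame) -> [u8; ENCODED_LEN] {
    match frame {
        Frame::First(value) => {
            let payload = value.to_le_bytes();
            [0, payload[0], payload[1]]
        }
        Frame::Second => [1, 0, 0],
    }
}

const FRAMES: &[Frame] = &[Frame::First(8), Frame::Second];
const NB_BYTES: usize = FRAMES.len() * ENCODED_LEN;

const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut out = [0u8; NB_BYTES];
    let mut i = 0;
    while i < FRAMES.len() {
        let bytes = encode(FRAMES[i]);
        let mut j = 0;
        while j < ENCODED_LEN {
            out[i * ENCODED_LEN + j] = bytes[j];
            j += 1;
        }
        i += 1;
    }
    out
};
```

This needs no `unsafe` and no nightly features, at the cost of defining the byte layout by hand rather than inheriting it from the type.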
For some reason the compiler treats some value cast to bytes via mem::transmute as uninitialized memory in a const function context. Here's a link to the repo with this code.
Meta
rustc --version --verbose:
Error output
Edit1: Updated example to cast to u32 before casting to bytes for clarity