Padding considered uninitialized data when type is transmuted in const fn #131569

Closed
jean-airoldie opened this issue Oct 11, 2024 · 29 comments
Labels
C-gub Category: the reverse of a compiler bug is generally UB

Comments

@jean-airoldie

jean-airoldie commented Oct 11, 2024

For some reason the compiler treats some value cast to bytes via mem::transmute as uninitialized memory in a const function context.

use static_assertions::assert_eq_size;

struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}

impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }

    const fn push_u32_slice(&mut self, slice: &[u32]) {
        if self.bytes.len() - self.cursor < slice.len() * 4 {
            panic!("exceeded capacity");
        }

        let mut i = 0;
        while i < slice.len() {
            let array = slice[i].to_le_bytes();
            let mut j = 0;
            while j < array.len() {
                self.bytes[self.cursor + i * 4 + j] = array[j];
                j += 1;
            }
            i += 1;
        }
        self.cursor += slice.len() * 4;
    }
}

#[repr(u8)]
enum Frame {
    First(u16),
    Second,
}

assert_eq_size!(Frame, u32);

impl Frame {
    const fn cast_slice(slice: &[Frame]) -> &[u32] {
        // SAFETY We know the slice is valid and casting to bytes should
        // always be valid, even if repr(rust) isn't stable yet.
        unsafe { std::mem::transmute(slice) }
    }
}

const FRAMES: &[Frame] = &[Frame::First(8), Frame::Second];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    let bytes = Frame::cast_slice(FRAMES);
    buf.push_u32_slice(&bytes);
    buf.bytes
};

Here's a link to the repo with this code.

Meta

rustc --version --verbose:

rustc 1.83.0-nightly (52fd99839 2024-10-10)
binary: rustc
commit-hash: 52fd9983996d9fcfb719749838336be66dee68f9
commit-date: 2024-10-10
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.1

Error output

error[E0080]: evaluation of constant value failed
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^ accessing memory based on pointer with alignment 2, but alignment 4 is required
   |
note: inside `Buf::<8>::push_u32_slice`
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^
note: inside `SERIALIZED_FRAMES`
  --> src/lib.rs:56:5
   |
56 |     buf.push_u32_slice(&bytes);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0080`.
error: could not compile `const_uninit_bug` (lib) due to 1 previous error

Edit1: Updated example to cast to u32 before casting to bytes for clarity

@jean-airoldie jean-airoldie added the C-bug Category: This is a bug. label Oct 11, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 11, 2024
@jean-airoldie
Author

Interestingly, if I replace the slice cast and slice access with a pointer cast and a pointer write, I get this error instead:

#![feature(const_ptr_write)]

struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}

impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }

    const fn push_frame_slice(&mut self, slice: &[Frame]) {
        if self.bytes.len() - self.cursor < slice.len() {
            panic!("exceeded capacity");
        }

        let mut i = 0;
        while i < slice.len() {
            let frame = slice[i];

            let ptr = unsafe { self.bytes.as_mut_ptr().add(self.cursor) as *mut Frame };
            unsafe { ptr.write(frame) };
            self.cursor += std::mem::size_of::<Frame>();

            i += 1;
        }
    }
}

#[derive(Copy, Clone)]
#[repr(u8)]
enum Frame {
    First(u16),
    Second,
}

const FRAMES: &[Frame] = &[Frame::First(8), Frame::Second];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    buf.push_frame_slice(FRAMES);
    buf.bytes
};

Output

error[E0080]: it is undefined behavior to use this value
  --> src/lib.rs:43:1
   |
43 | const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value at [1]: encountered uninitialized memory, but expected an integer
   |
   = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.
   = note: the raw bytes of the constant (size: 8, align: 1) {
               00 __ 08 00 01 __ __ __                         │ .░...░░░
           }

For more information about this error, try `rustc --explain E0080`.

So in other words, the zeroed bytes that are not explicitly part of the enum are considered uninitialized, even if I use repr(C).

@saethlin saethlin added C-gub Category: the reverse of a compiler bug is generally UB and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 11, 2024
@saethlin
Member

saethlin commented Oct 11, 2024

The diagnostic in your second comment is about the padding between the u8 discriminant and the u16 field in the enum. The diagram

00 __ 08 00 01 __ __ __ 

reads as follows: 00 is the Frame::First discriminant, then comes an uninitialized byte because there is padding between the discriminant and the field, then 08 00 for the u16 field, then 01 for the discriminant of Frame::Second.
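One way to check this layout is to ask the compiler for the size and alignment of Frame. The sketch below is illustrative only: frame_layout is a hypothetical helper name, and the 4-byte alignment of u32 is an assumption that holds on common targets such as x86_64. It also suggests why the updated first example reports an alignment error: a &[Frame] only guarantees 2-byte alignment, while a &[u32] requires 4.

```rust
use std::mem::{align_of, size_of};

// Same shape as the enum in the report: a u8 tag followed by an optional u16.
#[allow(dead_code)]
#[repr(u8)]
enum Frame {
    First(u16),
    Second,
}

/// Hypothetical helper: returns (size, alignment) of `Frame` in bytes.
pub const fn frame_layout() -> (usize, usize) {
    (size_of::<Frame>(), align_of::<Frame>())
}

// Checked at compile time: 1 tag byte + 1 padding byte + 2 field bytes.
const _: () = assert!(frame_layout().0 == 4);
// The u16 field forces 2-byte alignment, which is weaker than what u32 needs
// on targets where u32 is 4-byte aligned (an assumption, see above).
const _: () = assert!(frame_layout().1 == 2);
```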

Your cast_slice returns a slice with a length that is too short. Transmute only does a typed copy of the source operand as the destination type. The length field in that slice is 2, so your push_u8_slice never even gets to the bytes that are uninitialized because of the layout of Frame::Second.
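For element types without padding, the length mismatch described above is usually avoided by rebuilding the slice with a recomputed byte length instead of transmuting the reference. A minimal sketch, assuming a u16 element type (which has no padding bytes); as_bytes is a hypothetical helper name, not something from the report:

```rust
use std::mem::size_of_val;
use std::slice;

/// Hypothetical helper: views a `&[u16]` as its underlying bytes.
/// Transmuting the reference would keep the element count (3 below)
/// as the length, so the byte length is recomputed explicitly.
pub fn as_bytes(words: &[u16]) -> &[u8] {
    let len = size_of_val(words); // element count * 2
    // SAFETY: u16 has no padding, u8 has alignment 1, and the new slice
    // covers exactly the memory borrowed by `words`.
    unsafe { slice::from_raw_parts(words.as_ptr().cast::<u8>(), len) }
}
```

This does not make the Frame case sound, since Frame does contain padding; it only shows how the length is handled.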

Is there official documentation that led you to conclude that repr(C) enums don't have padding? I'm asking in case we have some misleading language around, if we do we need to fix it.

@saethlin
Member

There are two major crates for doing these kinds of reinterpreting reference transmutes, bytemuck and zerocopy. They are both designed to avoid problems like you're running into.

@jean-airoldie
Author

No, that's not what I'm saying. I'm saying the padding is interpreted as uninitialized, as you can see in the second example. I don't understand why padding would be considered uninitialized. Moreover, this only occurs at compile time. I'll update my example for clarity.

@jean-airoldie
Author

I have updated the example.

@jean-airoldie
Author

jean-airoldie commented Oct 11, 2024

There are two major crates for doing these kinds of reinterpreting reference transmutes, bytemuck and zerocopy. They are both designed to avoid problems like you're running into.

@saethlin Using bytemuck or zerocopy won't do anything because they also use mem::transmute. My initial example was misleading; I updated it, so please have another look. My constraint is also compile-time serialization, which is why the code I provided is shitty: you can't simply copy_from_slice in compile-time fns.

For instance bytemuck uses internal::cast, which is this method:

/// Cast `A` into `B`
///
/// ## Panics
///
/// * This is like [`try_cast`](try_cast), but will panic on a size mismatch.
#[inline]
pub(crate) unsafe fn cast<A: Copy, B: Copy>(a: A) -> B {
  if size_of::<A>() == size_of::<B>() {
    unsafe { transmute!(a) }
  } else {
    something_went_wrong("cast", PodCastError::SizeMismatch)
  }
}

Aka transmute, so there's no way around it.

@jean-airoldie
Author

The diagnostic in your second comment is about the padding between the u8 discriminant and the u16 field in the enum. The diagram

00 __ 08 00 01 __ __ __ 

reads as follows: 00 is the Frame::First discriminant, then comes an uninitialized byte because there is padding between the discriminant and the field, then 08 00 for the u16 field, then 01 for the discriminant of Frame::Second.

Your cast_slice returns a slice with a length that is too short. Transmute only does a typed copy of the source operand as the destination type. The length field in that slice is 2, so your push_u8_slice never even gets to the bytes that are uninitialized because of the layout of Frame::Second.

Is there official documentation that led you to conclude that repr(C) enums don't have padding? I'm asking in case we have some misleading language around, if we do we need to fix it.

Padding should be zeroes, not uninitialized.

@jean-airoldie
Author

The issue isn't that there is padding, it's that the padding is considered uninitialized in constant functions for some reason.

@jean-airoldie
Author

jean-airoldie commented Oct 11, 2024

Here's the same example with a union instead of an enum:

Code

use static_assertions::assert_eq_size;

struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}

impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }

    const fn push_u32_slice(&mut self, slice: &[u32]) {
        if self.bytes.len() - self.cursor < slice.len() * 4 {
            panic!("exceeded capacity");
        }

        let mut i = 0;
        while i < slice.len() {
            let array = slice[i].to_le_bytes();
            let mut j = 0;
            while j < array.len() {
                self.bytes[self.cursor + i * 4 + j] = array[j];
                j += 1;
            }
            i += 1;
        }
        self.cursor += slice.len() * 4;
    }
}

#[repr(C)]
struct Frame {
    kind: FrameKind,
    data: FrameData,
}

#[repr(u8)]
enum FrameKind {
    First,
    Second,
}

#[repr(C)]
union FrameData {
    a: u16,
    b: (),
}

assert_eq_size!(Frame, u32);

impl Frame {
    const fn cast_slice(slice: &[Frame]) -> &[u32] {
        // SAFETY We know the slice is valid and casting to bytes should
        // always be valid, even if repr(rust) isn't stable yet.
        unsafe { std::mem::transmute(slice) }
    }
}

const FRAMES: &[Frame] = &[
    Frame {
        kind: FrameKind::First,
        data: FrameData {
            a: 8,
        },
    },
    Frame {
        kind: FrameKind::Second,
        data: FrameData {
            b: (),
        },
    },
];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    let bytes = Frame::cast_slice(FRAMES);
    buf.push_u32_slice(&bytes);
    buf.bytes
};

Output

error[E0080]: evaluation of constant value failed
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^ accessing memory based on pointer with alignment 2, but alignment 4 is required
   |
note: inside `Buf::<8>::push_u32_slice`
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^
note: inside `SERIALIZED_FRAMES`
  --> src/lib.rs:81:5
   |
81 |     buf.push_u32_slice(&bytes);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0080`.
error: could not compile `const_uninit_bug` (lib) due to 1 previous error

Same issue, where the padding in the union is considered uninitialized.

Edit1: Added collapsible code to reduce clutter

@jean-airoldie
Author

And here's the same example using a struct with padding:

Code

use static_assertions::assert_eq_size;

struct Buf<const N: usize> {
    bytes: [u8; N],
    cursor: usize,
}

impl<const N: usize> Buf<N> {
    const fn new() -> Self {
        Self {
            bytes: [0u8; N],
            cursor: 0,
        }
    }

    const fn push_u32_slice(&mut self, slice: &[u32]) {
        if self.bytes.len() - self.cursor < slice.len() * 4 {
            panic!("exceeded capacity");
        }

        let mut i = 0;
        while i < slice.len() {
            let array = slice[i].to_le_bytes();
            let mut j = 0;
            while j < array.len() {
                self.bytes[self.cursor + i * 4 + j] = array[j];
                j += 1;
            }
            i += 1;
        }
        self.cursor += slice.len() * 4;
    }
}

#[repr(C)]
struct Frame {
    a: u16,
    b: u8,
}

assert_eq_size!(Frame, u32);

impl Frame {
    const fn cast_slice(slice: &[Frame]) -> &[u32] {
        // SAFETY We know the slice is valid and casting to bytes should
        // always be valid, even if repr(rust) isn't stable yet.
        unsafe { std::mem::transmute(slice) }
    }
}

const FRAMES: &[Frame] = &[
    Frame {
        a: 8,
        b: 1,
    },
    Frame {
        a: 0,
        b: 0,
    },
];
const NB_BYTES: usize = FRAMES.len() * std::mem::size_of::<Frame>();
const SERIALIZED_FRAMES: [u8; NB_BYTES] = {
    let mut buf = Buf::<NB_BYTES>::new();
    let bytes = Frame::cast_slice(FRAMES);
    buf.push_u32_slice(&bytes);
    buf.bytes
};

Output

error[E0080]: evaluation of constant value failed
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^ accessing memory based on pointer with alignment 2, but alignment 4 is required
   |
note: inside `Buf::<8>::push_u32_slice`
  --> src/lib.rs:23:25
   |
23 |             let array = slice[i].to_le_bytes();
   |                         ^^^^^^^^
note: inside `SERIALIZED_FRAMES`
  --> src/lib.rs:65:5
   |
65 |     buf.push_u32_slice(&bytes);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^

For more information about this error, try `rustc --explain E0080`.
error: could not compile `const_uninit_bug` (lib) due to 1 previous error

@jean-airoldie
Author

So this is not specific to enums: any value with padding that is transmuted will cause this issue, because padding in general is considered uninitialized by the compiler, at least at compile time.

@jean-airoldie jean-airoldie changed the title Enum cast to bytes via transmute is considered uninitialized data in const fn Padding considered uninitialized data when type is transmuted in const fn Oct 11, 2024
@clubby789
Contributor

Padding is uninitialized

@jean-airoldie
Author

@clubby789 So casting C structs with padding to bytes in Rust is UB? How does that make any sense?

@jean-airoldie
Author

jean-airoldie commented Oct 11, 2024

If that's so, that's kind of dumb, but it can be circumvented by manually specifying the padding bytes in the types.

@clubby789
Contributor

If the layout doesn't match then yes. That's what repr(C) is for. bytemuck, for example, won't allow deriving Pod on types that have padding.

@jean-airoldie
Author

If the layout doesn't match then yes. That's what repr(C) is for. bytemuck, for example, won't allow deriving Pod on types that have padding.

@clubby789 Ok, but is this a compiler limitation, or is this the current policy moving forward? Because the current approach sounds arbitrary, as if we can't trust the compiler with padding. For instance:

// this is UB to cast
#[repr(C)]
struct Value1 {
    a: u16,
    b: u8,
}

// this isn't UB because we manually specified the padding
#[repr(C)]
struct Value2 {
    a: u16,
    b: u8,
    _pad: u8,
}

I'm fine either way, I'm just curious.
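A minimal sketch of the explicit-padding variant, assuming the Value2 layout above; VALUE and BYTES are illustrative names. Since every byte of Value2 belongs to a field, the transmute to [u8; 4] leaves no uninitialized byte for const evaluation to reject:

```rust
use std::mem::transmute;

// Every byte of this struct is covered by a field, so there is no padding.
#[repr(C)]
struct Value2 {
    a: u16,
    b: u8,
    _pad: u8,
}

const VALUE: Value2 = Value2 { a: 8, b: 1, _pad: 0 };

// SAFETY: Value2 and [u8; 4] have the same size, and all four bytes of
// VALUE are initialized because each one belongs to an integer field.
pub const BYTES: [u8; 4] = unsafe { transmute(VALUE) };
```

The first two bytes follow the target's native endianness, which is why a serialization format would normally spell out the byte order instead.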

@jean-airoldie
Author

If the layout doesn't match then yes. That's what repr(C) is for. bytemuck, for example, won't allow deriving Pod on types that have padding.

I would have expected repr(C) to never be UB when cast to bytes, so I'm not sure I fully understand. If it is UB to cast repr(C) types when they have padding, it sounds like it should be mentioned in the nomicon.

@scottmcm
Member

I would have expected repr(C) to never be UB when cast to bytes

I don't know why you'd think that. It's also UB to read padding in C++, as you can see with the C++ example in https://cpp.godbolt.org/z/6Tec45P86, where it's returning undef in a function where the return type is marked noundef.

@saethlin
Member

Ok, but is this a compiler limitation, or is this the current policy moving forward?

There's no compiler limitation here. The current compiler behavior has been unchanged for a long time, probably forever.

If it is UB to cast repr(C) types when they have padding

Whether the transmutes you are trying to do are themselves UB is rust-lang/unsafe-code-guidelines#412. For sure, attempting to read a u8 from an uninitialized byte is UB, which is why you get the specific compiler errors above; the const-eval interpreter doesn't do anything extra for the transmute, but it does for the load of the u8.
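To illustrate the distinction between the transmute and the later u8 load, here is a hedged sketch that uses MaybeUninit<u8> as the target element type, which is allowed to hold uninitialized bytes. The Value struct mirrors the padded repr(C) struct used elsewhere in this thread, and initialized_bytes is a hypothetical helper name. Only the three bytes that belong to fields are ever read as u8:

```rust
use std::mem::{transmute, MaybeUninit};

// 2 bytes for `a`, 1 byte for `b`, 1 byte of trailing padding.
#[repr(C)]
pub struct Value {
    pub a: u16,
    pub b: u8,
}

/// Hypothetical helper: returns only the bytes of `v` that belong to fields.
pub fn initialized_bytes(v: Value) -> [u8; 3] {
    // SAFETY: both types are 4 bytes, and MaybeUninit<u8> is valid for any
    // byte, including the uninitialized padding byte at offset 3.
    let raw: [MaybeUninit<u8>; 4] = unsafe { transmute(v) };
    // SAFETY: offsets 0..3 are covered by `a` and `b`, so they are
    // initialized. Calling assume_init on raw[3] would be UB.
    unsafe {
        [
            raw[0].assume_init(),
            raw[1].assume_init(),
            raw[2].assume_init(),
        ]
    }
}
```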

@jean-airoldie
Author

I would have expected repr(C) to never be UB when cast to bytes

I don't know why you'd think that. It's also UB to read padding in C++, as you can see with the C++ example in https://cpp.godbolt.org/z/6Tec45P86, where it's returning undef in a function where the return type is marked noundef.

In C++ you have multiple methods of initialization. If you zero-initialize, the padding is guaranteed to be zero, and it isn't UB to read it.

To zero-initialize an object or reference of type T means:
— if T is a scalar type (3.9), the object is set to the value 0 (zero), taken as an integral constant expression, converted to T;
— if T is a (possibly cv-qualified) non-union class type, each non-static data member and each base-class subobject is zero-initialized and padding is initialized to zero bits;
— if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero-initialized and padding is initialized to zero bits;
— if T is an array type, each element is zero-initialized;
— if T is a reference type, no initialization is performed.
(https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf page 201)

In Rust it's UB to read padding even if you zero-initialize with mem::zeroed.

@jean-airoldie
Author

Ok, but is this a compiler limitation, or is this the current policy moving forward?

There's no compiler limitation here. The current compiler behavior has been unchanged for a long time, probably forever.

I don't see how the behavior's age has anything to do with the current issue.

Whether the transmutes you are trying to do are themselves UB is rust-lang/unsafe-code-guidelines#412. For sure, attempting to read a u8 from an uninitialized byte is UB, which is why you get the specific compiler errors above; the const-eval interpreter doesn't do anything extra for the transmute, but it does for the load of the u8.

I understand that the issue is due to the compiler reading the individual padding byte. If a memcopy was used, this issue would not occur.

@jean-airoldie
Author

So if I understand correctly, in Rust every value is believed to be initialized with uninitialized padding, and there is no way to zero-initialize in Rust (or at least to make the compiler believe that the padding is zero-initialized). Because of this casting structs with padding, no matter their representation, is always UB. The solution is to always explicitly specify padding if you want zero-initialization, and in that case casting is not UB.

@jean-airoldie
Author

From my perspective, it is a compiler limitation that even if you mem::zeroed a C struct, the compiler will act like the padding might be uninitialized.

@saethlin
Member

saethlin commented Oct 12, 2024

In Rust it's UB to read padding even if you zero-initialize with mem::zeroed.

This statement is at best misleading. When the value is returned from mem::zeroed(), that is a typed copy of type T, which does not copy padding bytes. You can zero-initialize any memory with ptr::write_bytes, but if you then move that somewhere else as a T, bytes that are padding in T will again not be preserved. (also you can read padding by doing an untyped copy)

I understand that the issue is due to the compiler reading the individual padding byte. If a memcopy was used, this issue would not occur.

Yes, ptr::copy_nonoverlapping does an untyped copy.

Because of this casting structs with padding, no matter their representation, is always UB.

You keep saying "casting structs" which is not a thing. I linked to a UCG issue which has more information about the reference-to-reference transmutes you are trying to do.

I'm closing this issue, because there's no bug here.

@saethlin saethlin closed this as not planned Oct 12, 2024
@jean-airoldie
Author

Code

#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}

const VALUE: Value = unsafe { std::mem::zeroed() };
const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };

Output

   Compiling playground v0.0.1 (/playground)
error[E0080]: it is undefined behavior to use this value
 --> src/lib.rs:9:1
  |
9 | const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };
  | ^^^^^^^^^^^^^^^^^^^^ constructing invalid value at [3]: encountered uninitialized memory, but expected an integer
  |
  = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.
  = note: the raw bytes of the constant (size: 4, align: 1) {
              00 00 00 __                                     │ ...░
          }

For more information about this error, try `rustc --explain E0080`.
error: could not compile `playground` (lib) due to 1 previous error

@jean-airoldie
Author

In Rust it's UB to read padding even if you zero-initialize with mem::zeroed.

This statement is at best misleading. When the value is returned from mem::zeroed(), that is a typed copy of type T, which does not copy padding bytes. You can zero-initialize any memory with ptr::write_bytes, but if you then move that somewhere else as a T, bytes that are padding in T will again not be preserved. (also you can read padding by doing an untyped copy)

Then how can you possibly zero-initialize anything in a constant? Even using ptr::write_bytes results in the same error:

Code

#![feature(const_ptr_write)]

#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}

const VALUE: Value = {
    let mut value: Value = unsafe { std::mem::zeroed() };
    let ptr = &mut value as *mut Value as *mut u8;
    unsafe { ptr.write_bytes(0u8, 4) };
    value
};
const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };

Output

error[E0080]: it is undefined behavior to use this value
  --> src/lib.rs:15:1
   |
15 | const BYTES: [u8; 4] = unsafe { std::mem::transmute(VALUE) };
   | ^^^^^^^^^^^^^^^^^^^^ constructing invalid value at [3]: encountered uninitialized memory, but expected an integer
   |
   = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.
   = note: the raw bytes of the constant (size: 4, align: 1) {
               00 00 00 __                                     │ ...░
           }

For more information about this error, try `rustc --explain E0080`.
error: could not compile `playground` (lib) due to 1 previous error

Is zero-initialization impossible in rust on a constant?

@jean-airoldie
Author

jean-airoldie commented Oct 12, 2024

You keep saying "casting structs" which is not a thing. I linked to a UCG issue which has more information about the reference-to-reference transmutes you are trying to do.

We're talking about C structs, and mem::transmute is the Rust equivalent of C-style casting. But sure, I'll use your preferred word.

@T-Dark0

T-Dark0 commented Oct 12, 2024

Is zero-initialization impossible in rust on a constant?

TL;DR: It isn't possible for types containing padding, and your code would still be UB if it were.

Your value on the first line of the constant initializer is in fact zero-initialized. The problem is that moving a variable resets its padding to uninit (intuitively, because we want to be able to move structs containing significant amounts of padding by copying the Actual Data without bothering to copy the padding). This is also true of (default) copy and move constructors in C++: they copy/move each member variable individually, not the whole struct/class, thus any value written to locations that don't correspond to any member (such as padding) is lost.

Thus, while value is zero-initialized, VALUE, the const, is not: its fields are zeroed, but it was assigned by moving value into it, and that lost the padding. (Incidentally, a constant, at a conceptual level, is a value, not a block of memory. Padding isn't a fundamental property of being a value that is a struct, it's an accident of storing that value into memory. In the compiler's internal representation, consts fundamentally cannot have padding)
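To see which byte of Value is the padding that gets lost on a move, the field offsets can be compared with the size of the type. This is only a sketch: padding_start and TRAILING_PADDING are hypothetical names, and offset_of! requires Rust 1.77 or later.

```rust
use std::mem::{offset_of, size_of};

// Same shape as the Value struct in the examples above.
#[repr(C)]
pub struct Value {
    pub a: u16,
    pub b: u8,
}

/// Hypothetical helper: offset of the first byte after the last field.
pub const fn padding_start() -> usize {
    offset_of!(Value, b) + size_of::<u8>()
}

/// Number of trailing bytes in `Value` that belong to no field.
pub const TRAILING_PADDING: usize = size_of::<Value>() - padding_start();

// Checked at compile time: exactly one padding byte, at offset 3. This is
// the byte printed as `__` in the const-eval diagnostics above.
const _: () = assert!(padding_start() == 3 && TRAILING_PADDING == 1);
```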

Even if Rust did let you zero-initialize a constant, however, that would be useless: Calling transmute(VALUE) involves moving VALUE inside the transmute function, which also resets its padding. Once again, this would also happen in C++. As a fun consequence:

mem::transmute is the Rust equivalent of C-style casting

Is false. There is no Rust equivalent of C-style casting. There's a reason the people you were talking to were pedantic about the terminology. Also, as an FYI, Unsafe Rust is not C, and trying to write it as if it was C is a great way to speedrun UB.

(In fact, I'm pretty sure literally just naming VALUE moves out of it, if nothing else to move into a temporary you can take a reference to, so that would reset your padding as well, but I'm not certain of this claim)

The only way in which you can ever read padding without it being immediate UB is if you've written to that padding and then made sure not to move the value. This can for example happen if you write_bytes zeroes through a pointer to a struct, cast that pointer to a pointer to an array of u8, and read the whole thing.

Note that this rule doesn't exist to give you an escape hatch where you can just cast stuff to raw bytes. It exists because it follows from other rules. You are simply not supposed to try to read stuff as raw bytes.
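The write_bytes route described above can be sketched as follows, at runtime rather than in a constant. This is an illustration under stated assumptions, not a recommended pattern: Value mirrors the padded struct from earlier in the thread, and zero_padded_bytes is a hypothetical helper name. The key point is that the storage is never moved as a Value after its padding has been zeroed.

```rust
use std::mem::MaybeUninit;
use std::ptr;

#[repr(C)]
struct Value {
    a: u16,
    b: u8,
}

/// Hypothetical helper: builds a `Value` in place and returns all 4 bytes,
/// including the zeroed padding byte.
pub fn zero_padded_bytes(a: u16, b: u8) -> [u8; 4] {
    let mut slot = MaybeUninit::<Value>::uninit();
    let p = slot.as_mut_ptr();
    let mut out = [0u8; 4];
    // SAFETY: `p` points to 4 writable, suitably aligned bytes. The field
    // writes only touch field bytes, so the zeroed padding is kept, and the
    // final byte copy never goes through a typed copy of `Value`.
    unsafe {
        ptr::write_bytes(p, 0, 1); // zero the whole struct, padding included
        ptr::addr_of_mut!((*p).a).write(a);
        ptr::addr_of_mut!((*p).b).write(b);
        ptr::copy_nonoverlapping(p as *const u8, out.as_mut_ptr(), 4);
    }
    out
}
```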

As for your perspective of it being a compiler limitation, I would like to point out that I would consider a compiler that is always forced to copy/move structs by padding-preserving memcpy to be significantly more limited than what Rust actually does. It would pessimize SoA transformations, SRoA optimizations, calling conventions that pass a struct in multiple registers, and generally emit less efficient code just to be able to support one tiny edge case that no code needs and very little code could even benefit from.

@jean-airoldie
Author

@T-Dark0 Thanks for the explanation. This is unintuitive, but I can understand the reasoning.

Is false. There is no Rust equivalent of C-style casting. There's a reason the people you were talking to were pedantic about the terminology. Also, as an FYI, Unsafe Rust is not C, and trying to write it as if it was C is a great way to speedrun UB.

Yeah, I guess it's not a runtime cast. I know that Rust isn't C, but when you do weird stuff that isn't supported there's no way around a bunch of unsafe code and potential UB.

As for your perspective of it being a compiler limitation [...]

In my case I was talking more in the context of constants, where the compiler knows the provenance of the bytes at compile time and could check that they're indeed zeroed. Of course it's a very niche use case.
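For the compile-time serialization use case, one option that avoids reading padding altogether is to write each field into the buffer explicitly. The following is a minimal sketch and not code from the report: encode and FRAME_SIZE are hypothetical names, and the wire layout (tag byte, one zero byte, little-endian u16) is an assumption chosen to mirror the byte dump earlier in the thread with the padding zeroed.

```rust
#[derive(Clone, Copy)]
pub enum Frame {
    First(u16),
    Second,
}

/// Assumed wire size of one frame: tag, zero byte, two payload bytes.
pub const FRAME_SIZE: usize = 4;

/// Hypothetical helper: encodes frames field by field, without unsafe.
pub const fn encode<const N: usize>(frames: &[Frame]) -> [u8; N] {
    assert!(frames.len() * FRAME_SIZE == N, "buffer size mismatch");
    let mut out = [0u8; N];
    let mut i = 0;
    while i < frames.len() {
        let base = i * FRAME_SIZE;
        match frames[i] {
            Frame::First(v) => {
                let le = v.to_le_bytes();
                out[base] = 0; // tag for First
                out[base + 2] = le[0];
                out[base + 3] = le[1];
            }
            Frame::Second => out[base] = 1, // tag for Second
        }
        i += 1;
    }
    out
}

pub const FRAMES: &[Frame] = &[Frame::First(8), Frame::Second];
pub const NB_BYTES: usize = FRAMES.len() * FRAME_SIZE;
pub const SERIALIZED_FRAMES: [u8; NB_BYTES] = encode::<NB_BYTES>(FRAMES);
```

Because the output is built from field values only, the result does not depend on how the compiler lays out Frame in memory.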

6 participants