RFC: Pointer metadata & VTable #2580
A previous iteration of this RFC is also visible at #2579. It was based on @gankro’s proposal https://gist.github.com/Gankro/b053cb4d1cb3bcaec070de89734720f7
First of all, thanks for opening this RFC. It's the right way to fix the raw::TraitObject API, and is a big step toward custom DST. My only criticism is I think the Vtable type should be generic over the trait object type, as mentioned in the alternatives section. Having different I like the name
cc @ubsan
text/0000-ptr-meta.md
Outdated
///
/// [dst]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts
#[lang = "pointee"]
pub trait Pointee {
so... I'm assuming the compiler implements
default impl<T: ?Sized> Pointee for T {
type Metadata = &'static Vtable;
}
impl<T: Sized> Pointee for T {
type Metadata = ();
}
impl Pointee for str {
type Metadata = usize;
}
impl<T: Sized> Pointee for [T] {
type Metadata = usize;
}
Which means theoretically we could make `Vtable` generic over `T`, allowing the `drop_in_place` method to take a raw pointer with the correct pointee type?
These impls would be accurate in current Rust, but what I had in mind instead was that the compiler would automatically generate impls, similar to what it does for the `std::marker::Unsize` trait. As far as the standard library is concerned these impls would be "magic", not based on specialization.
Regardless, yes, making `VTable` generic with a type parameter for the trait object type is possible.
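To make the trade-off concrete, here is a minimal sketch on stable Rust of a hand-rolled table, first untyped and then tagged with a type parameter. `RawVTable`, `TypedVTable`, `drop_erased`, `vtable_for`, and `Noisy` are hypothetical names for illustration only; they are not the RFC's API and say nothing about how the compiler lays out real vtables.

```rust
use std::marker::PhantomData;
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical untyped table: `drop_in_place` can only accept `*mut ()`.
struct RawVTable {
    size: usize,
    align: usize,
    drop_in_place: unsafe fn(*mut ()),
}

/// Hypothetical typed wrapper: `T` records which pointee the table describes.
struct TypedVTable<T> {
    raw: RawVTable,
    _pointee: PhantomData<fn(*mut T)>,
}

/// Type-erased destructor stored in the table.
unsafe fn drop_erased<T>(data: *mut ()) {
    // SAFETY: the caller guarantees `data` points to a valid, initialized `T`.
    unsafe { std::ptr::drop_in_place(data.cast::<T>()) }
}

fn vtable_for<T>() -> TypedVTable<T> {
    TypedVTable {
        raw: RawVTable {
            size: mem::size_of::<T>(),
            align: mem::align_of::<T>(),
            drop_in_place: drop_erased::<T>,
        },
        _pointee: PhantomData,
    }
}

impl<T> TypedVTable<T> {
    /// Unlike the raw entry, this only accepts a pointer of the matching type.
    unsafe fn drop_in_place(&self, data: *mut T) {
        // SAFETY: forwarded from the caller; the table was built for `T`.
        unsafe { (self.raw.drop_in_place)(data.cast::<()>()) }
    }
}

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy(u32);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let vtable = vtable_for::<Noisy>();
    let mut slot = mem::ManuallyDrop::new(Noisy(7));
    assert_eq!(slot.0, 7);
    let before = DROPS.load(Ordering::SeqCst);
    // SAFETY: `slot` holds a live `Noisy` that is never dropped again.
    unsafe { vtable.drop_in_place(&mut *slot) };
    assert_eq!(DROPS.load(Ordering::SeqCst), before + 1);
    assert_eq!(vtable.raw.size, mem::size_of::<Noisy>());
    assert_eq!(vtable.raw.align, mem::align_of::<Noisy>());
}
```

With the typed wrapper, passing a `*mut String` to a `TypedVTable<Noisy>` is a compile error, whereas the raw `unsafe fn(*mut ())` entry accepts any erased pointer.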
text/0000-ptr-meta.md
Outdated
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)

`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers
How would that avoid forcing this? Can you elaborate?
Without a type parameter, `x.size()` with `x: &'static VTable` necessarily executes the same code for any vtable. With a type parameter, `x: &'static VTable<dyn Foo>` and `x: &'static VTable<dyn Bar>` are different types and could execute different code. (For example, do table lookup with different offsets.) However, keeping the offset of `size` the same within all vtables might be desirable regardless of this API.
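The "same code for any vtable" behaviour is already observable on stable Rust through `std::mem::size_of_val` and `std::mem::align_of_val`, which answer for any trait object from the metadata half of the fat reference. The sketch below illustrates that; `Shape`, `Circle`, `Grid`, and `layout_of` are hypothetical names, and where the compiler keeps the size and alignment entries is an implementation detail not shown here.

```rust
use std::mem::{align_of, align_of_val, size_of, size_of_val};

trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Grid {
    cells: [u8; 3],
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Grid {
    fn area(&self) -> f64 {
        self.cells.len() as f64
    }
}

/// One non-generic function handles every `dyn Shape`: the concrete type is
/// unknown here, so size and alignment come from the pointer's metadata.
fn layout_of(shape: &dyn Shape) -> (usize, usize) {
    (size_of_val(shape), align_of_val(shape))
}

fn main() {
    let circle = Circle { radius: 1.0 };
    let grid = Grid { cells: [1, 2, 3] };
    assert_eq!(layout_of(&circle), (size_of::<Circle>(), align_of::<Circle>()));
    assert_eq!(layout_of(&grid), (size_of::<Grid>(), align_of::<Grid>()));
    assert!(circle.area() > grid.area());
}
```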
text/0000-ptr-meta.md
Outdated
`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers
be in the same location (offset) for every vtables.
But keeping them in the same location is probably desirable anyway to keep code size
Missing the end of a sentence?
text/0000-ptr-meta.md
Outdated
type Metadata;
}

/// Pointers to types implementing this trait alias are
Missing the end of a sentence?
@mikeyhew I’m very open to adding a type parameter to As to supporting super-fat pointers with multiple vtable pointers, as mentioned in the alternatives section I believe this design doesn’t prevent it. Types that don’t exist yet and are added to the language in the future (possibly custom DSTs) can have a different metadata type. For
text/0000-ptr-meta.md
Outdated
pub unsafe fn drop_in_place(&self, data: *mut ()) { ... }
}
```
No drawbacks section...?
I came up short trying to think of a reason not to do this at all (as opposed to doing it differently). Suggestions welcome.
text/0000-ptr-meta.md
Outdated
and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)
Should then `[&'static VTable; 1]` for `dyn SomeTrait` be used to make that transition smoother and to fit better with const generics?
This would make some sense if we were definitely gonna have super-fat pointers with multiple separate vtable pointers as fat pointer metadata. But if we don’t and end up with a different solution to upcasting, we’ll end up with always-single-item arrays for no reason. This isn’t really the thread to get into that discussion, but my opinion is that super-fat pointers have a significant enough size cost that I’d much prefer a different solution.
Perhaps an alternative for this RFC, more neutral with respect to super-fat pointers vs. not, would be to have `type Metadata = VTable<Self>;` for trait objects. (See other comments about `VTable`’s possible type parameter.) With the pointer/reference indirection hidden away in private fields of the `VTable` type, this design would be compatible with having `VTable<dyn A + B>` contain two pointers in the future.
Also, is there a use case for generic code that accepts any trait object with any number of vtable pointers but not other kinds of DSTs?
text/0000-ptr-meta.md
Outdated
and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)
Are we doing the proposals in the right order? Shouldn't we focus on dealing with `dyn A + B + C`, upcasting, and such things first? Also, cc #2035.
I believe the design proposed here is compatible enough with various options for multi-trait trait objects and upcasting that there isn’t a strong dependency, and we don’t need to block this RFC on everything else being settled.
text/0000-ptr-meta.md
Outdated
* The name of `Pointee`. [Internals thread #6663][6663] used `Referent`.

* The location of `VTable`. Is another module more appropriate than `std::ptr`?
and should it be called `Dictionary` instead? ("type class dictionary")
Big -1 to calling it Dictionary since this typically means a key-value map (example: C#, Swift, Python).
Furthermore, here in Rust the `VTable` is implemented as an array of function pointers, not a HashMap (unlike e.g. Python where it is really implemented as a `dict`), so calling it Dictionary obscures the alleged complexity.
Even if the implementation happened to use `HashMap`, I’d prefer `VTable` since it’s more descriptive of the role of this type. (As opposed to: dictionary of what?) I believe that vtable is a well-enough established term of art.
Co-Authored-By: SimonSapin <[email protected]>
Regarding making struct VTable<Dyn> { … } … then it could be used with any type as a parameter. What does struct VTable<Dyn> where Dyn: ?Sized + std::marker::DynTrait { … } Do we want such a trait?
Hmm, maybe we could get away with this? struct VTable<Dyn> where Dyn: ?Sized + Pointee<Metadata=Self> { … }
Is there any way to ensure minimal compiler time is wasted performing monomorphization on useless vtable type params? If not, I would rather the API just be less safe.
@gankro the word "useless" sounds a little strong there... the purpose is to make sure you can't use a vtable from a
You're supposing a situation where I somehow am writing code with two vtable types floating around, in which case I have two data pointers, and nothing can stop me from swapping the data pointers, producing the exact same effect.
@SimonSapin I created a crate https://crates.io/crates/rfc2580 packaging a slightly revised version of this RFC, and adapting the I like the clean Also, I must congratulate you on your
Actually, it doesn't. Since the size of the meta-data is known at compile-time -- based on
This simplifies the code for
The only moment where the alignment of the value is needed again is in An example implementation can be found in the examples of https://crates.io/crates/rfc2580/0.2.0 . As a result, I once again raise the question of whether the Conservatively, I would advise not exposing them -- even if they can be implemented -- for the moment.
Oh that middle pointer is a really cool trick. Congrats. I feel that On the other hand it’s less clear to me what not having While I tried to make this RFC not prevent future language extensions such as custom DSTs, to be honest I somewhat expect Rust to never have custom DSTs, or any DST much more complex than already exist. Rust already spent a lot of "complexity budget" on other language features, and at some point pushing the type system further starts to have diminishing returns.
I am mostly concerned about the current limitation that you cannot go from At the same time, it's been 5 years already so maybe the fact that it's not been tackled means that in practice it's not too much of a problem, and we shouldn't think too much.
I am not sure. As long as you can access the (full) object, you can use I'd imagine that in Rust you would either have:
And in neither case you'd need to obtain the alignment from just the metadata. I can imagine one just implementing a contiguous vector of (metadata + padding + data) where you'd need the alignment from metadata alone to find the data, but then you'd need a linear scan to get to the Nth item which isn't terribly efficient. Then again, with most of my experience coming from C++, I'm not used to being able to separate the v-ptr from its data, so maybe I am just not imaginative enough. In any case, thank you very much for working on this. It's necessary for all kinds of custom storage, so I'm really keen on toying with the result.
Huzzah! The @rust-lang/lang and @rust-lang/libs teams have decided to accept this RFC. To track further discussion, subscribe to the tracking issue here:
use std::marker::{PhantomData, Unsize};
use std::ptr::{self, DynMetadata};

trait DynTrait<Dyn> = Pointee<Metadata=DynMetadata<Dyn>>;
Where does `Pointee` come from here? Is it assumed to be a part of the prelude?
There’s no plan to have it in the prelude. This is a mistake in example code, but then it’s just an example.
Yeah, sorry, just figured I might as well leave a note about these nits while I was reading through the RFC :)
#[repr(C)]
struct WithMeta<T: ?Sized> {
    vtable: DynMetadata,
Doesn't `DynMetadata` need a generic parameter?
Right, I forgot to update this example when making that change.
impl<Dyn: ?Sized + DynTrait> ThinBox<Dyn> {
    pub fn new_unsize<S>(value: S) -> Self where S: Unsize<Dyn> {
        let vtable = ptr::metadata(&value as &Dyn);
        let ptr = NonNull::from(Box::leak(Box::new(WithMeta { vtable, value }))).cast();
Should this use `Box::into_raw` instead? `Box::leak` produces a `&`, which it seems slightly weird to then pass to `dealloc` in `Drop` (which takes a `*mut _`).
Maybe. Using `NonNull::from` avoids the `unwrap` with `NonNull::new` or the `unsafe` with `NonNull::new_unchecked`.
`leak` returns `&mut` though.
Ahh, you're right, my bad! Given the `&mut`, I think it's fine.
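For reference, the three ways of turning a fresh `Box` into a `NonNull` that this exchange compares can be sketched on stable Rust as follows. The function names are hypothetical, and the sketch uses a plain `u32` rather than the RFC example's `WithMeta` type.

```rust
use std::ptr::NonNull;

/// `Box::leak` yields `&mut T`, so `NonNull::from` needs neither `unwrap` nor `unsafe`.
fn via_leak(value: u32) -> NonNull<u32> {
    NonNull::from(Box::leak(Box::new(value)))
}

/// `Box::into_raw` yields `*mut T`; the checked constructor returns an `Option`.
fn via_into_raw_checked(value: u32) -> NonNull<u32> {
    NonNull::new(Box::into_raw(Box::new(value))).unwrap()
}

/// The unchecked constructor skips the null check but requires `unsafe`.
fn via_into_raw_unchecked(value: u32) -> NonNull<u32> {
    // SAFETY: a pointer obtained from `Box::into_raw` is never null.
    unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(value))) }
}

/// Rebuilds the `Box` so the allocation is released, returning the stored value.
unsafe fn reclaim(ptr: NonNull<u32>) -> u32 {
    // SAFETY: the caller guarantees `ptr` came from one of the functions above
    // and has not been reclaimed yet.
    let boxed = unsafe { Box::from_raw(ptr.as_ptr()) };
    *boxed
}

fn main() {
    // SAFETY: each pointer is freshly created and reclaimed exactly once.
    unsafe {
        assert_eq!(reclaim(via_leak(1)), 1);
        assert_eq!(reclaim(via_into_raw_checked(2)), 2);
        assert_eq!(reclaim(via_into_raw_unchecked(3)), 3);
    }
}
```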
}

// Similarly Deref
impl<Dyn: ?Sized + DynTrait> DerefMut for ThinBox<Dyn> {
The comment above refers to `Deref`, but this only shows `DerefMut`.
The "similarly" comment is handwaving this away, leaving it as an exercise to the reader. The point of example code is to show the kind of thing that could be built on top of this RFC’s proposed APIs, not to provide a solid library you can actually copy-paste and use.
Anyway, this is a merged PR. If you feel strongly enough about fixing this example code feel free to submit another PR.
/// For statically-sized types (that implement the `Sized` trait)
/// as well as for `extern` types,
/// pointers are said to be “thin”: metadata is zero-sized and its type is `()`.
///
/// Pointers to [dynamically-sized types][dst] are said to be “fat”
/// and have non-zero-sized metadata:
///
/// * For structs whose last field is a DST, metadata is the metadata for the last field
/// * For the `str` type, metadata is the length in bytes as `usize`
/// * For slice types like `[T]`, metadata is the length in items as `usize`
/// * For trait objects like `dyn SomeTrait`, metadata is [`DynMetadata<Self>`][DynMetadata]
///   (e.g. `DynMetadata<dyn SomeTrait>`).
///
/// In the future, the Rust language may gain new kinds of types
/// that have different pointer metadata.
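The thin/fat distinction described in these doc comments can be observed on stable Rust by comparing pointer sizes. The sketch below is only an illustration: `is_fat` and `Tail` are hypothetical names, and the check relies on how current compilers represent pointers to unsized types (a data pointer plus one word of metadata) rather than on a documented guarantee of the exact size.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// A struct whose last field is a DST; its pointers reuse the slice metadata.
#[allow(dead_code)]
struct Tail {
    head: u8,
    tail: [u16],
}

/// Reports whether `*const T` carries metadata next to the data address.
fn is_fat<T: ?Sized>() -> bool {
    size_of::<*const T>() > size_of::<usize>()
}

fn main() {
    // Thin: `Sized` pointees, whose metadata is `()`.
    assert!(!is_fat::<u8>());
    assert!(!is_fat::<[u8; 3]>());
    // Fat: `str` and slices carry a length, trait objects carry vtable metadata.
    assert!(is_fat::<str>());
    assert!(is_fat::<[u8]>());
    assert!(is_fat::<dyn Debug>());
    // Fat: a struct ending in a DST inherits the metadata of its last field.
    assert!(is_fat::<Tail>());
}
```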
After experimenting with custom implementations of `Box`, I think there is a strong case for having strongly typed meta-data for all kinds of pointers.
The pre-allocator representation of `Box` is:
struct Box<T: ?Sized> { ptr: NonNull<T>, }
The post-allocator representation is very similar:
struct Box<T: ?Sized, A: Allocator = Global> {
allocator: A,
ptr: NonNull<T>,
}
Both automatically implement `CoerceUnsized<Box<U>> where T: Unsize<U>`, and all is well.
If one wants to make Box generic over its storage, then the representation becomes:
pub struct RawBox<T: ?Sized + Pointee, S: SingleElementStorage> {
storage: S,
handle: S::Handle<T>,
}
If `S::Handle<T> == NonNull<T>`, then Box is still coercible; however, in the case of inline storage, that is:
- neither possible: when the `Box` is moved, so is the storage, and therefore any pointer into the storage is invalidated.
- nor desirable: in the case of inline storage, the pointer is redundant, wasting 8 bytes.
Hence, in the case of inline storage, `S::Handle<T>` is best defined as `<T as Pointee>::Metadata`.
In order to have `Box<T> : CoerceUnsized<Box<U>> where T: Unsize<U>`:
- We need: `S::Handle<T>: CoerceUnsized<S::Handle<U>> where T: Unsize<U>`,
- Which means: `<T as Pointee>::Metadata: CoerceUnsized<<U as Pointee>::Metadata> where T: Unsize<U>`.
And of course, Box being coercible is very much desirable.
As a result, I believe a slight change of course is necessary:
- All metadata should be strongly typed -- be it `Metadata<dyn Debug>`, `Metadata<[u8]>` or `Metadata<[u8; 3]>` -- no more `()` or `usize`.
- The compiler should automatically implement `Metadata<T>: CoerceUnsized<Metadata<U>> where T: Unsize<U>`.
I would note that having a single `Metadata<T>` rather than `SizedMetadata<T>`, `SliceMetadata<[T]>`, `DynMetadata<dyn T>` is not necessary, only the coercion is, and since the compiler is generating those, it's perfectly free to create them "cross type".
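The coercion this comment wants a storage-generic box to keep is the one stable `Box` already performs: the metadata changes from `()` to a length or to vtable metadata while the allocation stays put. A minimal sketch of that existing behaviour follows; the helper names are hypothetical, and the proposed `Metadata<T>: CoerceUnsized<Metadata<U>>` impls are not something this code exercises.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Unsized coercion from an array box to a slice box: the length becomes metadata.
fn into_slice_box(array: Box<[u8; 3]>) -> Box<[u8]> {
    array
}

/// Unsized coercion from a concrete box to a trait-object box.
fn into_debug_box(value: Box<u32>) -> Box<dyn Debug> {
    value
}

fn main() {
    let slice = into_slice_box(Box::new([1, 2, 3]));
    assert_eq!(slice.len(), 3);
    assert_eq!(&*slice, &[1, 2, 3]);

    let debug = into_debug_box(Box::new(42));
    assert_eq!(format!("{:?}", debug), "42");

    // On current compilers the coerced boxes are wider because they carry metadata.
    assert!(size_of::<Box<[u8]>>() > size_of::<Box<[u8; 3]>>());
    assert!(size_of::<Box<dyn Debug>>() > size_of::<Box<u32>>());
}
```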
This PR is merged. Consider sending further discussion to the tracking issue: rust-lang/rust#81513
(That said, I don’t see a point in having a `Box` generic over storage. Especially `Box<T, InlineStorage>`, isn’t it pretty much the same as `T`?)
Implement RFC 2580: Pointer metadata & VTable

RFC: rust-lang/rfcs#2580

~~Before merging this PR:~~

* [x] Wait for the end of the RFC’s [FCP to merge](rust-lang/rfcs#2580 (comment)).
* [x] Open a tracking issue: rust-lang#81513
* [x] Update `#[unstable]` attributes in the PR with the tracking issue number

----

This PR extends the language with a new lang item for the `Pointee` trait which is special-cased in trait resolution to implement it for all types. Even in generic contexts, parameters can be assumed to implement it without a corresponding bound.

For this I mostly imitated what the compiler was already doing for the `DiscriminantKind` trait. I’m very unfamiliar with compiler internals, so careful review is appreciated.

This PR also extends the standard library with new unstable APIs in `core::ptr` and `std::ptr`:

```rust
pub trait Pointee {
    /// One of `()`, `usize`, or `DynMetadata<dyn SomeTrait>`
    type Metadata: Copy + Send + Sync + Ord + Hash + Unpin;
}

pub trait Thin = Pointee<Metadata = ()>;

pub const fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata {}

pub const fn from_raw_parts<T: ?Sized>(*const (), <T as Pointee>::Metadata) -> *const T {}
pub const fn from_raw_parts_mut<T: ?Sized>(*mut (),<T as Pointee>::Metadata) -> *mut T {}

impl<T: ?Sized> NonNull<T> {
    pub const fn from_raw_parts(NonNull<()>, <T as Pointee>::Metadata) -> NonNull<T> {}

    /// Convenience for `(ptr.cast(), metadata(ptr))`
    pub const fn to_raw_parts(self) -> (NonNull<()>, <T as Pointee>::Metadata) {}
}

impl<T: ?Sized> *const T {
    pub const fn to_raw_parts(self) -> (*const (), <T as Pointee>::Metadata) {}
}

impl<T: ?Sized> *mut T {
    pub const fn to_raw_parts(self) -> (*mut (), <T as Pointee>::Metadata) {}
}

/// `<dyn SomeTrait as Pointee>::Metadata == DynMetadata<dyn SomeTrait>`
pub struct DynMetadata<Dyn: ?Sized> {
    // Private pointer to vtable
}

impl<Dyn: ?Sized> DynMetadata<Dyn> {
    pub fn size_of(self) -> usize {}
    pub fn align_of(self) -> usize {}
    pub fn layout(self) -> crate::alloc::Layout {}
}

unsafe impl<Dyn: ?Sized> Send for DynMetadata<Dyn> {}
unsafe impl<Dyn: ?Sized> Sync for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Debug for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Unpin for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Copy for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Clone for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Eq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialEq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Ord for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialOrd for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Hash for DynMetadata<Dyn> {}
```

API differences from the RFC, in areas noted as unresolved questions in the RFC:

* Module-level functions instead of associated `from_raw_parts` functions on `*const T` and `*mut T`, following the precedent of `null`, `slice_from_raw_parts`, etc.
* Added `to_raw_parts`
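The APIs listed in this PR description are unstable, but for slices the same "data pointer plus metadata" round trip can be sketched with stable functions, using `std::ptr::slice_from_raw_parts` (the precedent the description itself cites). `to_parts` and `from_parts` are hypothetical helper names, not part of the PR.

```rust
use std::ptr;

/// Splits a slice reference into its data address and its metadata (the length).
fn to_parts<T>(slice: &[T]) -> (*const T, usize) {
    (slice.as_ptr(), slice.len())
}

/// Reassembles a slice reference from a data address and a length.
unsafe fn from_parts<'a, T>(data: *const T, len: usize) -> &'a [T] {
    // SAFETY: the caller guarantees `data` points to `len` initialized `T`s
    // that stay valid and unmodified for the lifetime `'a`.
    unsafe { &*ptr::slice_from_raw_parts(data, len) }
}

fn main() {
    let values = [10u16, 20, 30, 40];
    let (data, len) = to_parts(&values);
    assert_eq!(len, 4);

    // SAFETY: `data` and `len` describe the live array `values`.
    let whole = unsafe { from_parts(data, len) };
    assert_eq!(whole, &values[..]);

    // Changing only the metadata yields a shorter view over the same data.
    // SAFETY: 2 is within the original length.
    let prefix = unsafe { from_parts(data, 2) };
    assert_eq!(prefix, &[10, 20]);
}
```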
Is
Are there plans for
if Rust ever supports C++ virtual classes (imho something that would be useful for FFI), that won't work since the size of the type is indirectly stored in the pointee, the metadata would be a ZST.
Unless ABI compatiblity is needed, wouldn't the type pointer be stored in the metadata, as is done for
almost the whole point of Rust supporting C++ virtual classes is ABI compatibility -- needed for FFI, so
Add generic APIs that allow manipulating the metadata of fat pointers:
This RFC does not propose a mechanism for defining custom dynamically-sized types, but tries to stay compatible with future proposals that do.