[stdlib] [proposal] Safe Pointer trait #3728

Closed

Contributor

@martinvuyk martinvuyk commented Oct 31, 2024

Losing fear of pointers

This would no longer be a cause of any fear, unsafety, or memory leaks

fn some_function[P1: SafePointer, P2: SafePointer](owned p1: P1, owned p2: P2):
    print(p1[0], p2[0])

fn main():
    p1 = FlexiblePointer[String].alloc(1)
    p2 = OwnedPointer[String].alloc(1)
    # p1[0] = "!" # this would abort at runtime since it is not initialized
    memset_zero(p1) # uses .unsafe_ptr() and sets is_initialized to True
    memset_zero(p2)
    p1[0], p2[0] = "!", "!"
    # A function can take them as owned and they get auto ASAP freed since
    # they own their data
    some_function(p1, p2)

I would not try to sell this as "fearless pointers" since there are many ways
one can make mistakes here. But it is a lot safer than UnsafePointer.

Signed-off-by: martinvuyk <[email protected]>
@owenhilyard

For an allocator API, I'd prefer we take inspiration from hwloc, so we can handle requests like "allocate pinned interleaved memory from NUMA nodes 3 and 4 on a hugepage and DMA map it for GPUs 3, 4 and 5". We can also probably take inspiration from the DPDK memzone API.

Allocation is actually kind of hard, and I think Zig was better than most, but missed a lot of important things.

@martinvuyk
Contributor Author

Hi @owenhilyard, as usual I had to do some homework to even answer you 😆, but I don't think implementing this proposal would be mutually exclusive with those use cases. The main goal here is to abstract stack, heap, arena, etc. pointers and build smarter ones on top. And this is specifically intended for collection types, be it from stdlib or an external lib, so that we don't need to differentiate between weak and strong pointers (and the method of allocation) in the type system itself, which allows them to interact with each other since they'll be considered the same type.

We'd have to actually try to implement the following to be sure, but I think it's possible:
Say we provide APIs for more sophisticated allocators than the current aligned malloc, you then allocate X amount of memory in Y complex configuration like what you described. I think there might be a way to wrap that in this same "safe pointer" API where you parametrize your pointer to allocate in that complex configuration manner.

If there is no way to make that specific configuration with the proposed API, and there is some serious pressure to allow it, we could try to extend it to allow passing some sort of generic configuration struct (having the safe pointer trait itself be parametrized on the allocator and config_type it uses). WDYT?

Allocation is actually kind of hard, and I think Zig was better than most, but missed a lot of important things.

Totally, and I think we'll need to make some compromises as well.

@owenhilyard

If the goal is to build a Box or std::unique_ptr equivalent, why not use OwnedPointer to do that? You can wrap the inner struct with something which holds the "object" for the given allocator, be it either the global allocator (in which case it's empty because it's devirtualized) or some ultra-specific allocator. This would mean you have something like:

struct OwnedPointerInner[T: AnyType, Alloc: Allocator]:
    var item: T
    var allocator: Alloc

Then the allocator type can either be a smart pointer to an allocator (once we have a repr transparent equivalent), an empty type if we can fully devirtualize it, or something else. Once we have linear types, more exotic allocators can be modeled by destroying OwnedPointers passed to them, forcing you to free them. I'd love to hear from @VerdagonModular on whether using linear types to model objects that use custom allocators and ensure memory safety is a good idea. For things which look like malloc/calloc/realloc/free, we can probably keep the "automagical" experience, but I think that allocators which can't store all their state in a global somewhere tend to get messy if forced to look like malloc.

If we make the allocator an object which lives alongside the allocation, that should work fine, since each allocator can determine what metadata, if any, needs to live there. For hwloc, which is fairly advanced, you would just need allocation size and a pointer to the topology, which is trivial to store. Arena allocators and similar can use the latter solution with linear types, since I've never written a program where it would be an actual issue to have linear types for an arena allocator.
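A minimal sketch of the "allocator lives alongside the allocation" idea, written in Rust as a stand-in since it has zero-sized types with the same effect. All names here (`Allocator`, `GlobalAlloc`, `TopologyAlloc`, `OwnedInner`) are hypothetical and only mirror the shape of `OwnedPointerInner` above; this is not an existing Mojo or Rust API. It shows that a malloc-like allocator with no per-allocation state adds no bytes, while an hwloc-style one adds exactly the size and topology pointer mentioned above.

```rust
use std::mem::size_of;

/// Hypothetical allocator interface; only the state each allocator carries matters here.
trait Allocator {
    fn name(&self) -> &'static str;
}

/// Stand-in for a malloc-like allocator whose state lives in a global,
/// so the handle stored next to the item is zero-sized.
struct GlobalAlloc;

/// Stand-in for an hwloc-style allocator: it keeps the allocation size
/// and a pointer to the topology it allocated from.
#[allow(dead_code)]
struct TopologyAlloc {
    topology: *const u8,
    size: usize,
}

impl Allocator for GlobalAlloc {
    fn name(&self) -> &'static str {
        "global"
    }
}

impl Allocator for TopologyAlloc {
    fn name(&self) -> &'static str {
        "topology"
    }
}

/// Mirrors the shape of OwnedPointerInner: the item plus its allocator object.
#[allow(dead_code)]
struct OwnedInner<T, A: Allocator> {
    item: T,
    allocator: A,
}

/// Reports which allocator an allocation belongs to.
fn allocator_name<T, A: Allocator>(inner: &OwnedInner<T, A>) -> &'static str {
    inner.allocator.name()
}

/// Size in bytes of the item together with its allocator metadata.
fn inner_size<T, A: Allocator>() -> usize {
    size_of::<OwnedInner<T, A>>()
}
```

With these definitions, `inner_size::<u64, GlobalAlloc>()` equals the size of a bare `u64`, so the metadata cost is only paid by allocators that actually need it.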

@martinvuyk
Contributor Author

martinvuyk commented Nov 20, 2024

If the goal is to build a Box or std::unique_ptr equivalent, why not use OwnedPointer to do that?

Because each of those examples is "safe" only in a given context, all pointers are. OwnedPointer is only useful when you want only 1 mutable reference at a time but unsafe in multi-thread contexts. The goal is to build a base layer of abstraction on top of which to add different kinds of pointers.

  • Pointer is useful for pointing to things where there will be only 1 reference to it (internal default for basic collection types for example), so there is no need to pay the cost of ref counting (currently achieved by an UnsafePointer and that the data structure frees it in its destructor)
  • OwnedPointer would be practically obsolete since Pointer would be the same (except where you want compile time guarantees)
  • Span (I'd like to rename it Buffer) would be able to own its data and as such in cases where a Span is allocated it can free its own data and avoid leaks like [stdlib] [NFC] Add docstring comment for Span slicing's allocation #3774. But not all Span should have an OwnedPointer, they should have the equivalent of a weak pointer. Both of those use cases are achieved by a Pointer with different flags at runtime.
  • RcPointer and ArcPointer would each have their own use cases
  • When wanting to build, for example, a String you could build it from a pointer which the String doesn't own, but you'd still like to mutate (Imagine for example a structure where you want to uppercase all data in-place before doing something else without needing any copies, or a syscall created buffer where the pointer is freed by the OS)

Benefits

  • Arena allocators would become a possibility
  • All collection types would use safe pointers underneath and can be tuned for specific use cases (can have an ArcPointer backed "read-only" List shared between processes for example)
  • If a pointer is a pointer regardless of allocators, we can delete all stack versions of the collection types or plug their specific methods with some kind of conditional conformance into the new more generic collection types.

Disadvantages

  • Pointers would now carry and compute a lot of individual metadata at runtime
  • A lot of compile time guarantees are thrown out the window
  • Struct padding might also play a negative role.

If we make the allocator an object which lives alongside the allocation, that should work fine, since each allocator can determine what metadata, if any, needs to live there. For hwloc, which is fairly advanced, you would just need allocation size and a pointer to the topology, which is trivial to store.

That is one of the points I made in the proposal, you can have your SafePointer have its own fields and logic in how it allocates, the example given was of the arena allocator backed pointer:

struct ColosseumPointer[
    is_mutable: Bool, //,
    type: AnyType,
    origin: Origin[is_mutable].type,
    address_space: AddressSpace = AddressSpace.GENERIC,
]:
    """Colosseum Pointer (Arena Owner Pointer) that deallocates the arena when
    deleted."""
    var _free_slots: UnsafePointer[Byte]
    """Bits indicating whether the slot is free."""
    var _len: Int
    """The amount of bits set in the _free_slots pointer."""
    alias _P = UnsafePointer[type, address_space]
    var _ptr: Self._P
    """The data."""
    alias _S = ArcPointer[UnsafePointer[OpaquePointer], origin, address_space]
    var _self_ptr: Self._S
    """A self pointer."""
    alias _G = GladiatorPointer[type, origin, address_space]
    ...

The _self_ptr field is just an experiment, it will probably end up very differently since I think the Origin system would already extend the lifetime of the arena.

ColosseumPointer is the one that owns and manages the memory, GladiatorPointers are just living inside the arena and alloc and dealloc as they are created and destroyed by just marking the _free_slots from the ColosseumPointer to which they belong.
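The slot marking described above could look roughly like the following, sketched in Rust. `SlotBitmap` and its methods are hypothetical names, and the sketch assumes that a set bit means "occupied"; the proposal's `_free_slots` may use the opposite polarity. It only covers the bookkeeping, not the storage of the elements themselves.

```rust
/// Hypothetical bookkeeping for an arena: one bit per slot.
/// Assumption: a set bit means the slot is occupied.
struct SlotBitmap {
    bits: Vec<u8>,
    slots: usize,
    used: usize,
}

impl SlotBitmap {
    /// Creates a bitmap for `slots` slots, all initially free.
    fn new(slots: usize) -> Self {
        Self { bits: vec![0; (slots + 7) / 8], slots, used: 0 }
    }

    /// Number of slots currently occupied.
    fn in_use(&self) -> usize {
        self.used
    }

    /// Claims the lowest free slot, or returns None when the arena is full.
    fn claim(&mut self) -> Option<usize> {
        for slot in 0..self.slots {
            let (byte, mask) = (slot / 8, 1u8 << (slot % 8));
            if self.bits[byte] & mask == 0 {
                self.bits[byte] |= mask;
                self.used += 1;
                return Some(slot);
            }
        }
        None
    }

    /// Marks a slot free again; releasing a free or out-of-range slot is a no-op.
    fn release(&mut self, slot: usize) {
        let (byte, mask) = (slot / 8, 1u8 << (slot % 8));
        if slot < self.slots && self.bits[byte] & mask != 0 {
            self.bits[byte] &= !mask;
            self.used -= 1;
        }
    }
}
```

In this model a gladiator-style handle would call `claim` when created and `release` when destroyed, while the arena owner frees the whole buffer once.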

All in all what I'm proposing is to hide the complexity of allocators to be the responsibility of the implementation of each kind of pointer. An arena pointer is similar to an RcPointer or an ArcPointer if you squint at it, but they allocate very differently.

Disadvantages

  • Pointers would now carry and compute a lot of individual metadata at runtime
  • A lot of compile time guarantees are thrown out the window

I guess one of the main questions is this: how much do we want to fork the concepts into different types vs. unifying them inside fields at runtime?

Arena allocators and similar can use the latter solution with linear types, since I've never written a program where it would be an actual issue to have linear types for an arena allocator.

Yes I was very intrigued with the possibilities after watching the community meeting.


A bit of a side-note, I purposely left out C++ terminology like smart pointers, this is an approach where Pointers are now also responsible for their allocator logic. Where pointers are safe in their promised use case. And not using words like "shared" where it can only be shared in certain ways (Arc pointers don't give you Mutex safety for example).

@owenhilyard

If the goal is to build a Box or std::unique_ptr equivalent, why not use OwnedPointer to do that?

Because each of those examples is "safe" only in a given context, all pointers are. OwnedPointer is only useful when you want only 1 mutable reference at a time but unsafe in multi-thread contexts. The goal is to build a base layer of abstraction on top of which to add different kinds of pointers.

We have a borrow checker, so OwnedPointer is thread safe since a mutable borrow of the internal item mutably borrows the OwnedPointer.

* `Pointer` is useful for pointing to things where there will be only 1 reference to it (internal default for basic collection types for example), so there is no need to pay the cost of ref counting (currently achieved by an `UnsafePointer` and that the data structure frees it in its destructor)

OwnedPointer doesn't do ref counting. My understanding is that Pointer[T] is effectively a ref [_] T that you can store in structs. For arena allocators which aren't leaked, having a bounded lifetime is necessary, but it introduces a lot of headaches for allocators like malloc alternatives or for arena allocators which are leaked. I often allocate ~16 GB of arena allocators, store pointers to the arenas at well-known addresses and then leak them. I want ownership semantics so that destructors run for objects and objects are constructed to a known state.

* `OwnedPointer` would be practically obsolete since `Pointer` would be the same (except where you want compile time guarantees)

Except that Pointer is twice as large as OwnedPointer if both are backed by a global allocator like malloc (alignment), and Pointer can't be loaded into registers via SIMD without expensive scatter/gather.

* `Span` (I'd like to rename it `Buffer`) would be able to own its data and as such in cases where a `Span` is allocated it can free its own data and avoid leaks like [[stdlib] [NFC] Add docstring comment for `Span` slicing's allocation #3774](https://github.com/modularml/mojo/pull/3774). But not all `Span` should have an `OwnedPointer`, they should have the equivalent of a weak pointer. Both of those use cases are achieved by a `Pointer` with different flags at runtime.

I've been treating Span like iovec, and I agree they can't all be owned, some need to be borrows. However, to make them compatible with efficient vectored IO it needs to exactly match the layout of iovec.

* When wanting to build, for example, a `String` you could build it from a pointer which the `String` doesn't own, but you'd still like to mutate (Imagine for example a structure where you want to uppercase all data in-place before doing something else without needing any copies, or a syscall created buffer where the pointer is freed by the OS)

So does this mean parameterizing String over the type of backing pointer? I think that could be interesting.

Benefits

* Arena allocators would become a possibility

* All collection types would use safe pointers underneath and can be tuned for specific use cases (can have an `ArcPointer` backed "read-only" `List` shared between processes for example)

I'm not sure we want to have all of these extra checks in the hot path of every std data structure.

* If a pointer is a pointer regardless of allocators, we can delete all stack versions of the collection types or plug their specific methods with some kind of conditional conformance into the new more generic collection types.

Part of the point of the stack versions is the cache friendliness. If I can store an entire collection inline, you don't have to go through a pointer to get at the data.

Disadvantages

* Pointers would now carry and compute a lot of individual metadata at runtime
* A lot of compile time guarantees are thrown out the window

This is a fairly large problem, for some types of programs the majority of the memory is used by pointers. Having to do branchy checks before every pointer deref is also not great. I'd prefer to leverage compile-time guarantees like OwnedPointer does to keep this zero-cost. Having a pointer type which essentially just stores things on the heap but otherwise has the same guarantees as a normal object (object is initialized, the object is properly aligned, etc) is very useful. If we know the type of the allocator and it's a malloc or tcmalloc-like allocator, we can even remove the allocator struct.

* Struct padding might also play a negative role.

It does, you have 7 bytes of padding per object if you make an array of your Pointer type, that's 43% wasted memory per allocation.
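The padding figure can be checked with a small layout experiment. This is a sketch in Rust with a C-compatible layout, and it assumes the flexible pointer is one machine pointer plus a single byte of runtime flags (`FlaggedPointer` is a hypothetical name, not the proposed struct). On a typical 64-bit target that gives 16 bytes total, of which 7 are padding, i.e. 43.75%.

```rust
use std::mem::size_of;

/// Hypothetical layout: one machine pointer plus one byte of runtime flags.
/// repr(C) keeps the field order and padding rules of a C struct.
#[repr(C)]
#[allow(dead_code)]
struct FlaggedPointer {
    ptr: *const u8,
    flags: u8,
}

/// Bytes of padding: total size minus the bytes that carry information.
fn padding_bytes() -> usize {
    size_of::<FlaggedPointer>() - size_of::<*const u8>() - size_of::<u8>()
}

/// Fraction of each array element that is padding.
fn wasted_fraction() -> f64 {
    padding_bytes() as f64 / size_of::<FlaggedPointer>() as f64
}
```

The struct ends up exactly twice the size of a bare pointer, which matches the "twice as large" remark earlier in the thread.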

If we make the allocator an object which lives alongside the allocation, that should work fine, since each allocator can determine what metadata, if any, needs to live there. For hwloc, which is fairly advanced, you would just need allocation size and a pointer to the topology, which is trivial to store.

That is one of the points I made in the proposal, you can have your SafePointer have its own fields and logic in how it allocates, the example given was of the arena allocator backed pointer:

But what if I want to use linear types for my memory safety so I don't need to store a bunch of extra pointers in my objects? If I have a pool of 50 million objects allocated, adding 8 bytes to each costs 400 MB of memory, which would be enough to store another 3 million objects if we assume 128 byte objects. Memory costs on that level MUST be opt-in.
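The arithmetic behind that estimate, as a small Rust sketch (function names are illustrative only): 50 million objects times 8 extra bytes is 400,000,000 bytes, and dividing by 128-byte objects gives 3,125,000 displaced objects.

```rust
/// Extra bytes consumed when every object in a pool grows by `extra_per_object`.
fn overhead_bytes(objects: u64, extra_per_object: u64) -> u64 {
    objects * extra_per_object
}

/// How many whole objects of `object_size` bytes would fit in that overhead.
fn objects_displaced(overhead: u64, object_size: u64) -> u64 {
    overhead / object_size
}
```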

struct ColosseumPointer[
    is_mutable: Bool, //,
    type: AnyType,
    origin: Origin[is_mutable].type,
    address_space: AddressSpace = AddressSpace.GENERIC,
]:
    """Colosseum Pointer (Arena Owner Pointer) that deallocates the arena when
    deleted."""
    var _free_slots: UnsafePointer[Byte]
    """Bits indicating whether the slot is free."""
    var _len: Int
    """The amount of bits set in the _free_slots pointer."""
    alias _P = UnsafePointer[type, address_space]
    var _ptr: Self._P
    """The data."""
    alias _S = ArcPointer[UnsafePointer[OpaquePointer], origin, address_space]
    var _self_ptr: Self._S
    """A self pointer."""
    alias _G = GladiatorPointer[type, origin, address_space]
    ...

The _self_ptr field is just an experiment, it will probably end up very differently since I think the Origin system would already extend the lifetime of the arena.

ColosseumPointer is the one that owns and manages the memory, GladiatorPointers are just living inside the arena and alloc and dealloc as they are created and destroyed by just marking the _free_slots from the ColosseumPointer to which they belong.

All in all what I'm proposing is to hide the complexity of allocators to be the responsibility of the implementation of each kind of pointer. An arena pointer is similar to an RcPointer or an ArcPointer if you squint at it, but they allocate very differently.

I agree that we want pointers to be responsible for managing the allocation type, but I don't know if having their destructor be the way you free them is always the best option. There are substantial costs to that approach that linear types let us avoid if we are willing to have more manual but still safe object management. For usability, it's better, but if you start to scale up the problem by a few million objects you see substantial impacts to memory usage. I think that ARM would also see issues from not being able to do offsets as part of a load, meaning that GladiatorPointer would require 2 instructions to dereference each time. We also lose the ability to place the Colosseum into a global and ditch the need to store a pointer to it in each object or pass it around.

Disadvantages

  • Pointers would now carry and compute a lot of individual metadata at runtime
  • A lot of compile time guarantees are thrown out the window

I guess one of the main questions is this: how much do we want to fork the concepts into different types vs. unifying them inside fields at runtime?

I think we want to do as much as we possibly can at compile time. My thought is that mixing many objects allocated in different ways inside of a collection is somewhat rare, and someone who does that can either use trait objects (once they work), or create their own runtime solution. If a "best runtime solution" exists and is figured out by the community, we can talk about adding it to the standard library. I can't think of a time where I had a large enough number of mixed pointer objects that cloning them all when putting them in a collection wasn't an option. Keep in mind that we still have references to unify everything, since all pointers should be able to produce immutable references, which covers logging and other common places where you might want to mix allocation types.

Arena allocators and similar can use the latter solution with linear types, since I've never written a program where it would be an actual issue to have linear types for an arena allocator.

Yes I was very intrigued with the possibilities after watching the community meeting.

A bit of a side-note, I purposely left out C++ terminology like smart pointers, this is an approach where Pointers are now also responsible for their allocator logic. Where pointers are safe in their promised use case. And not using words like "shared" where it can only be shared in certain ways (Arc pointers don't give you Mutex safety for example).

Arc is only supposed to hand out immutable references, so there's no mutation and they are safe. If you want mutability from an Arc you do Arc[Mutex[T]], which provides Mutex safety using the Mutex.
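For reference, this is how the same composition looks in Rust, where `Arc<Mutex<T>>` is the standard-library spelling of the `Arc[Mutex[T]]` idea: the `Arc` only shares ownership, and every mutation goes through the lock. The function name `parallel_count` is illustrative.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared counter `increments` times.
fn parallel_count(threads: usize, increments: usize) -> usize {
    // The Arc hands out shared ownership; the Mutex guards mutation.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // Mutable access is only reachable through the lock guard.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Without the `Mutex`, the `Arc` would only give out shared references to the counter, so the increment would not compile.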

@martinvuyk
Contributor Author

martinvuyk commented Nov 21, 2024

both are backed by a global allocator like malloc

There are no globals in Mojo, _malloc is just a function that gets executed by UnsafePointer.alloc(). So it's kind of already ready to switch to anything if we wanted to parametrize the allocator function.

Except that Pointer is twice as large as OwnedPointer

Yep, that's one of the choices people would have to make according to their use case

I've been treating Span like iovec, and I agree they can't all be owned, some need to be borrows. However, to make them compatible with efficient vectored IO it needs to exactly match the layout of iovec.

Yeah that would be useful, but I think you could achieve that by using a pointer which has no flags padding it.

So does this mean parameterizing String over the type of backing pointer? I think that could be interesting.

Yes my main goal with this proposal is defining a trait that all pointers (except UnsafePointer) can adhere to and as such we can switch them out according to the use case. For example, anything that we want to send over a network with writev, you unsafely take the unsafe_ptr() out of the safe pointer and setup your iovec-like struct with it.
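As a point of comparison for the writev case, Rust's standard library exposes this pattern through `std::io::IoSlice`, which is documented as ABI compatible with `iovec` on Unix platforms. The sketch below gathers two borrowed buffers into one vectored write; `gather_write` is an illustrative helper, and a Mojo version would need its own iovec-like struct as described above.

```rust
use std::io::{IoSlice, Write};

/// Writes `header` and `body` with a single vectored write call,
/// without copying them into one contiguous buffer first.
/// Returns how many bytes the writer accepted.
fn gather_write<W: Write>(out: &mut W, header: &[u8], body: &[u8]) -> std::io::Result<usize> {
    // Each IoSlice borrows its buffer; nothing is owned or freed here.
    let bufs = [IoSlice::new(header), IoSlice::new(body)];
    out.write_vectored(&bufs)
}
```

Writers such as sockets may accept fewer bytes than offered, so real code would loop on the returned count; an in-memory `Vec<u8>` takes everything in one call.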

I'm not sure we want to have all of these extra checks in the hot path of every std data structure.

Totally, it should only be the default where it makes sense for the use case, not everywhere. We could still keep pointers which have clear ownership schemes and give comp time guarantees. That is the motivation for defining a trait and injecting the type of pointer that a type works with.

Part of the point of the stack versions is the cache friendliness. If I can store an entire collection inline, you don't have to go through a pointer to get at the data.

I'm not so sure of how the Mojo compiler internally handles stack allocated UnsafePointers. AFAIK they are just like C arrays.

This is a fairly large problem, for some types of programs the majority of the memory is used by pointers. Having to do branchy checks before every pointer deref is also not great.

Yes it's not good, we could leave them as debug_asserts 🤷‍♂️. But again, they would only be used where needed.

But what if I want to use linear types for my memory safety so I don't need to store a bunch of extra pointers in my objects?

Yes the current design kind of leaves linear pointers out. We could also make it so that the trait requires a linear sink function .free(owned self) instead of auto deletion.

GladiatorPointer would require 2 instructions to dereference each time. We also lose the ability to place the Colosseum into a global and ditch the need to store a pointer to it in each object or pass it around.

GladiatorPointer internally stores a pointer to the colosseum and its absolute offset from the start of the colosseum's data pointer. I don't see where that would lead to more than 1 deref. Storing the pointer in a global and using it would be a completely different model, a new pointer type for that kind of program could also be developed 🤷‍♂️ anyway, it's just an experiment to play around with the idea.

@martinvuyk
Contributor Author

@owenhilyard @JoeLoser

I'll backpedal the proposal a bit:

  • I'll rename what I was proposing as changes to Pointer to be FlexiblePointer since they are very distant from each other in true intent (and the padding + runtime branching cost). Pointer remains as the struct version of ref.
  • I'll split off everything that has to do with custom allocators into a future proposal and maybe mention FlexiblePointer and what it enables as an example motivation for defining a SafePointer trait.

@martinvuyk martinvuyk changed the title from "[stdlib] [proposal] Safe Pointer trait and custom allocators" to "[stdlib] [proposal] Safe Pointer trait" on Nov 21, 2024
@owenhilyard

both are backed by a global allocator like malloc

There are no globals in Mojo, _malloc is just a function that gets executed by UnsafePointer.alloc(). So it's kind of already ready to switch to anything if we wanted to parametrize the allocator function.

struct _Global[

I'm talking about the libc malloc or the tcmalloc used in the Mojo runtime. I can extern_call malloc if I want to. I'm using "global allocator" to mean an allocator which can allocate arbitrary spans of bytes and which stores its allocation state in a global which it manages itself.

So does this mean parameterizing String over the type of backing pointer? I think that could be interesting.

Yes my main goal with this proposal is defining a trait that all pointers (except UnsafePointer) can adhere to and as such we can switch them out according to the use case. For example, anything that we want to send over a network with writev, you unsafely take the unsafe_ptr() out of the safe pointer and setup your iovec-like struct with it.

I think that we should leave a destructor off of it then, so we can use linear types and pointers. The most general version of pointer I can define is "thing for which __getitem__ with no arguments gives me a reference (of some kind) to an object". We also want to consider that arena allocators sometimes cannot allocate multiple contiguous objects. To me, this knocks the common API down, and some pointer types may use an arena allocator or use stack memory and not be able to allocate more. We can provide more traits to expand the API, but I think the base trait has to be very limited in scope. Ideally, I'd like something that we could pass a reference to and it would be satisfied, which might mean creating the __deref__ function I've mentioned before and using that instead of __getitem__.

I'm not sure we want to have all of these extra checks in the hot path of every std data structure.

Totally, it should only be the default where it makes sense for the use case, not everywhere. We could still keep pointers which have clear ownership schemes and give comp time guarantees. That is the motivation for defining a trait and injecting the type of pointer that a type works with.

Part of the point of the stack versions is the cache friendliness. If I can store an entire collection inline, you don't have to go through a pointer to get at the data.

I'm not so sure of how the Mojo compiler internally handles stack allocated UnsafePointers. AFAIK they are just like C arrays.

I thought you were referring to getting rid of InlineArray, which I can store on the stack if I want to, or use as part of another collection without indirection.

This is a fairly large problem, for some types of programs the majority of the memory is used by pointers. Having to do branchy checks before every pointer deref is also not great.

Yes it's not good, we could leave them as debug_asserts 🤷‍♂️. But again, they would only be used where needed.

This is why I want to move the ecosystem towards constructs which are free at runtime. We can eat small hits to compile time if it means that we don't have runtime costs that may enter hot loops. I think that making sure that people only go to pointer types with runtime costs if they have no compile-time option should be a priority.

But what if I want to use linear types for my memory safety so I don't need to store a bunch of extra pointers in my objects?

Yes the current design kind of leaves linear pointers out. We could also make it so that the trait requires a linear sink function .free(owned self) instead of auto deletion.

This is why I don't want freeing to be a part of the base trait. Having LinearPointer and DestructorPointer combination traits should work while still allowing for functions which just need to stash a pointer for a bit to be generic over both. I want to make sure that functions only require the functionality they use, even if it means a lot of traits.
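One way to picture that trait split, sketched in Rust. The trait names `LinearPointer` and `DestructorPointer` come from the comment above; everything else (`Pointer`, `Manual`, `Auto`, `peek_len`) is hypothetical. Note that Rust has affine rather than linear types, so this sketch cannot force callers to invoke `free`; it only shows how a function can stay generic over both kinds by requiring just the base trait.

```rust
/// Base capability: hand out a shared reference to the pointee.
/// Nothing here says how, or whether, the pointee gets freed.
trait Pointer {
    type Target;
    fn get(&self) -> &Self::Target;
}

/// Pointers that are released through an explicit consuming call.
trait LinearPointer: Pointer {
    fn free(self) -> Self::Target;
}

/// Marker for pointers that release their pointee when they go out of scope.
trait DestructorPointer: Pointer {}

/// Hypothetical manually released pointer.
struct Manual<T>(Box<T>);

/// Hypothetical automatically released pointer.
struct Auto<T>(Box<T>);

impl<T> Pointer for Manual<T> {
    type Target = T;
    fn get(&self) -> &T {
        &self.0
    }
}

impl<T> LinearPointer for Manual<T> {
    fn free(self) -> T {
        // Consumes the pointer and moves the value out of the heap allocation.
        *self.0
    }
}

impl<T> Pointer for Auto<T> {
    type Target = T;
    fn get(&self) -> &T {
        &self.0
    }
}

impl<T> DestructorPointer for Auto<T> {}

/// Generic over both kinds: it only needs the base trait.
fn peek_len<P: Pointer<Target = String>>(p: &P) -> usize {
    p.get().len()
}
```

A function that actually needs to release memory would ask for `LinearPointer` or `DestructorPointer` instead, so each signature states only the capability it uses.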

GladiatorPointer would require 2 instructions to dereference each time. We also lose the ability to place the Colosseum into a global and ditch the need to store a pointer to it in each object or pass it around.

GladiatorPointer internally stores a pointer to the colosseum and its absolute offset from the start of the colosseum's data pointer. I don't see where that would lead to more than 1 deref. Storing the pointer in a global and using it would be a completely different model, a new pointer type for that kind of program could also be developed 🤷‍♂️ anyway, it's just an experiment to play around with the idea.

Given a g: GladiatorPointer[T] from which you want to load a foo member, you need to do g._colosseum[].offset(g._start)[].foo. You first need to deref the Arc to get the UnsafePointer[Colosseum], then offset by g._start plus the offset of foo into T, then deref again.
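The chain of dependent loads can be made visible with a rough Rust analogue. The types below (`Colosseum`, `Gladiator`, `Fighter`) are hypothetical stand-ins and the actual instruction count depends on the target and the optimizer; the point is only that the handle must first reach the arena before it can index into it.

```rust
use std::sync::Arc;

/// Hypothetical arena that owns one contiguous buffer of slots.
struct Colosseum<T> {
    slots: Vec<T>,
}

/// Hypothetical handle: a shared pointer to the arena plus a slot index.
struct Gladiator<T> {
    colosseum: Arc<Colosseum<T>>,
    start: usize,
}

impl<T> Gladiator<T> {
    fn get(&self) -> &T {
        // Step 1: follow the Arc to reach the arena, which holds the buffer pointer.
        let arena: &Colosseum<T> = &self.colosseum;
        // Step 2: index into the arena's buffer at the stored offset.
        &arena.slots[self.start]
    }
}

/// Example pointee with a field to load.
struct Fighter {
    foo: u32,
}

/// Reads `foo` through the handle; the field offset is applied after both steps above.
fn load_foo(g: &Gladiator<Fighter>) -> u32 {
    g.get().foo
}
```

A plain reference to a `Fighter` would reach `foo` with a single load at a constant offset, which is the comparison being made here.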

I'll backpedal the proposal a bit:

I'll rename what I was proposing as changes to Pointer to be FlexiblePointer since they are very distant from each other in true intent (and the padding + runtime branching cost). Pointer remains as the struct version of ref. I'll split off everything that has to do with custom allocators into a future proposal and maybe mention FlexiblePointer and what it enables as an example motivation for defining a SafePointer trait.

Sorry if it feels like I'm arguing against all of your proposals. I think that adding a pointer with some runtime checks to be more flexible is a good idea, but the allocator API is really hard and is one of those things that is very hard to get right. Any language which didn't have several months of arguments over the API is going to make a lot of mistakes which I think they will regret. I think you're working in the right areas, but we're running into very hard language design problems and they deserve very careful consideration, as well as thinking about what language features would have major effects on the design of the API and whether we want those features in Mojo.

It's possible I'm letting perfect be the enemy of good enough, but I'm afraid that if we make a temporary solution we won't be able to retract it later if the ecosystem builds on top of it. We want to avoid another DTypePointer if we can.

@martinvuyk
Contributor Author

I'm talking about the libc malloc or the tcmalloc used in the Mojo runtime. I can extern_call malloc if I want to. I'm using "global allocator" to mean an allocator which can allocate arbitrary spans of bytes and which stores its allocation state in a global which it manages itself.

Ok I thought keeping track of the allocs was the responsibility of the OS so I was thinking along the lines of "global allocator function that gets called whenever you call 'malloc' " (yes I'm very new to these concepts).

I thought you were referring to getting rid of InlineArray, which I can store on the stack if I want to, or use as part of another collection without indirection.

Yes I actually was, but if users like you want to store it inline 🤷‍♂️. But my main issue with it is how limited the API is and it needs to duplicate every method from List if we want to make it as ergonomic. And it's also happening for InlineString as well, it goes on and on for every heap collection type.

This is why I want to move the ecosystem towards constructs which are free at runtime. We can eat small hits to compile time if it means that we don't have runtime costs that may enter hot loops. I think that making sure that people only go to pointer types with runtime costs if they have no compile-time option should be a priority.

Agree

This is why I don't want freeing to be a part of the base trait. Having LinearPointer and DestructorPointer combination traits should work while still allowing for functions which just need to stash a pointer for a bit to be generic over both. I want to make sure that functions only require the functionality they use, even if it means a lot of traits.

Makes sense (I also agree with the goal of reduced traits and making composition of simple base traits the default), but how do you differentiate between "safe pointer behavior" and UnsafePointer ?

Sorry if it feels like I'm arguing against all of your proposals. I think that adding a pointer with some runtime checks to be more flexible is a good idea, but the allocator API is really hard and is one of those things that is very hard to get right. Any language which didn't have several months of arguments over the API is going to make a lot of mistakes which I think they will regret. I think you're working in the right areas, but we're running into very hard language design problems and they deserve very careful consideration, as well as thinking about what language features would have major effects on the design of the API and whether we want those features in Mojo.

It's possible I'm letting perfect be the enemy of good enough, but I'm afraid that if we make a temporary solution we won't be able to retract it later if the ecosystem builds on top of it. We want to avoid another DTypePointer if we can.

Don't worry, I actually like this (much better than getting no answer for weeks). It's in discussions like this that better alternatives and good compromises are found IMO. This is an exploratory proposal and I have no problem if it gets rejected; I wanted to give the idea some exposure since I thought it's worth considering. And yes, I agree that this will be fundamental infrastructure and it needs careful consideration.

@owenhilyard

I'm talking about the libc malloc or the tcmalloc used in the Mojo runtime. I can extern_call malloc if I want to. I'm using "global allocator" to mean an allocator which can allocate arbitrary spans of bytes and which stores its allocation state in a global which it manages itself.

Ok I thought keeping track of the allocs was the responsibility of the OS so I was thinking along the lines of "global allocator function that gets called whenever you call 'malloc' " (yes I'm very new to these concepts).

It's technically allowed to be, but nobody does that except for RTOSes. Most OSes will only hand out 4k, 16k, 2M or 1G blocks to userspace programs, and then the allocator subdivides those blocks. If the OS had to keep track of every single allocation, the kernel would need to be aware of and track every single JS object in your browser. This is far better managed from user space because of the extra information a program has (which makes allocations easier to pool), the cost of system calls, and the need for things like valgrind and asan.
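To make the "allocator subdivides those blocks" part concrete, here is a minimal sketch in Rust (used as a stand-in, since it is the comparison language elsewhere in this thread). It is a toy bump allocator, not how libc malloc or tcmalloc work internally; `BumpAllocator`, `alloc` and `used` are made-up names, and a `Vec<u8>` stands in for a block handed out by the OS.

```rust
/// Toy bump allocator: owns one large block (standing in for a 4k page
/// obtained from the OS) and carves smaller allocations out of it in user space.
pub struct BumpAllocator {
    block: Vec<u8>, // the single large block obtained up front
    next: usize,    // offset of the first free byte inside the block
}

impl BumpAllocator {
    pub fn new(block_size: usize) -> Self {
        BumpAllocator { block: vec![0; block_size], next: 0 }
    }

    /// Reserve `size` bytes at an offset that is a multiple of `align`
    /// (a power of two). Offsets are relative to the start of the block.
    /// Returns `None` once the block is exhausted.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        // Round the next free offset up to the requested alignment.
        let start = (self.next + align - 1) & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.block.len() {
            return None;
        }
        self.next = end;
        Some(start)
    }

    /// Number of bytes consumed so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.next
    }
}
```

The kernel only ever sees the one big block; every individual allocation is bookkeeping inside the program, which is the division of labour described above.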

I thought you were referring to getting rid of InlineArray, which I can store on the stack if I want to, or use as part of another collection without indirection.

Yes I actually was, but if users like you want to store it inline 🤷‍♂️. But my main issue with it is how limited the API is and it needs to duplicate every method from List if we want to make it as ergonomic. And it's also happening for InlineString as well, it goes on and on for every heap collection type.

InlineArray[T, N] is effectively a Rust [T; N]. Most of the API limits are because the type system doesn't really let us abstract over types very well. If we go the Rust route of having very good iterators, we can build most of the collections APIs in terms of indexing and iterators, which makes it easier to maintain things like this at the cost of being a bit annoying for LLVM to handle. Depending on how exactly coroutines get implemented we might get to write things using generators, and if we properly inform LLVM about them or have dedicated lowerings in MLIR that should work well.
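As a rough illustration of building collection APIs on top of indexing and iterators, here is a sketch in Rust rather than Mojo. The `Indexed` trait, `IndexedIter`, `Inline4` and `Heap` are hypothetical names invented for this example; the point is only that an inline and a heap-backed collection can share every derived method once they expose a length and an index operation.

```rust
/// Hypothetical minimal interface: a collection only says how long it is
/// and how to index into it.
pub trait Indexed {
    type Item;
    fn len(&self) -> usize;
    fn at(&self, index: usize) -> &Self::Item;

    /// Derived once here and shared by every implementor.
    fn iter(&self) -> IndexedIter<'_, Self>
    where
        Self: Sized,
    {
        IndexedIter { source: self, next: 0 }
    }
}

/// Iterator that walks any `Indexed` collection by position.
pub struct IndexedIter<'a, C: Indexed> {
    source: &'a C,
    next: usize,
}

impl<'a, C: Indexed> Iterator for IndexedIter<'a, C> {
    type Item = &'a C::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let source: &'a C = self.source;
        if self.next < source.len() {
            let item = source.at(self.next);
            self.next += 1;
            Some(item)
        } else {
            None
        }
    }
}

/// Inline storage (the `[T; N]` case).
pub struct Inline4(pub [i32; 4]);

impl Indexed for Inline4 {
    type Item = i32;
    fn len(&self) -> usize { 4 }
    fn at(&self, index: usize) -> &i32 { &self.0[index] }
}

/// Heap storage (the `List` case).
pub struct Heap(pub Vec<i32>);

impl Indexed for Heap {
    type Item = i32;
    fn len(&self) -> usize { self.0.len() }
    fn at(&self, index: usize) -> &i32 { &self.0[index] }
}

/// Written once against the interface, works for both storage strategies.
pub fn sum<C: Indexed<Item = i32>>(collection: &C) -> i32 {
    collection.iter().sum()
}
```

Neither `Inline4` nor `Heap` duplicates `iter` or `sum`, which is the maintenance win being argued for.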

The ability to have fixed-size collections of things is fairly important as a primitive, even if most users don't interact with it. For instance, you could implement a BTree node as something like this:

struct BTreeNode[T: AnyType, W: Int]:
  var parent: UnsafePointer[Self]
  var left: UnsafePointer[Self]
  var right: UnsafePointer[Self]
  var size: UInt
  var items: InlineArray[T, W]

This makes your life a lot easier. Or, you can build a FixedVector which uses inline storage, and move size in there and let it manage that bit as a reusable zero-cost abstraction.
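A FixedVector along those lines could look roughly like the following Rust sketch. `FixedVec` is a hypothetical name, and it uses `[Option<T>; N]` to stay in safe code; a real zero-cost version would more likely use uninitialized storage, so treat this as the shape of the API rather than the final layout.

```rust
/// Fixed-capacity vector with inline storage: no heap allocation,
/// and the element count is managed in one reusable place.
pub struct FixedVec<T, const N: usize> {
    items: [Option<T>; N], // inline storage; `None` marks an unused slot
    size: usize,           // number of occupied slots
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec { items: std::array::from_fn(|_| None), size: 0 }
    }

    /// Append a value, handing it back as `Err` if the storage is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.size == N {
            return Err(value);
        }
        self.items[self.size] = Some(value);
        self.size += 1;
        Ok(())
    }

    /// Remove and return the last value, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.size == 0 {
            return None;
        }
        self.size -= 1;
        self.items[self.size].take()
    }

    pub fn len(&self) -> usize {
        self.size
    }

    /// Borrow the element at `index`, or `None` if it is out of range.
    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.size {
            self.items[index].as_ref()
        } else {
            None
        }
    }
}
```

A BTree node could then hold a `FixedVec<T, W>` instead of separate `size` and `items` fields.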

This is why I want to move the ecosystem towards constructs which are free at runtime. We can eat small hits to compile time if it means that we don't have runtime costs that may enter hot loops. I think that making sure that people only go to pointer types with runtime costs if they have no compile-time option should be a priority.

Agree

This is why I don't want freeing to be a part of the base trait. Having LinearPointer and DestructorPointer combination traits should work while still allowing for functions which just need to stash a pointer for a bit to be generic over both. I want to make sure that functions only require the functionality they use, even if it means a lot of traits.

Makes sense (I also agree with the goal of reduced traits and making composition of simple base traits the default), but how do you differentiate between "safe pointer behavior" and UnsafePointer?

So, once we have trait composition (which is a fairly important part of an algebraic type system), you should be able to do something like:

trait Pointer[T: AnyType]:
  """A pointer-like thing which can be dereferenced into an immutable reference."""

  fn __deref__[origin: ImmutableOrigin](ref [origin] self) -> ref [origin] T:
    ...

# Probably just be a marker trait, making a unified constructor is hard.
trait AlwaysInitPointer[T: AnyType]:
  """A pointer which initializes its pointee before returning a reference to the user."""
  ...
  
trait OriginPointer[origin: Origin]:
  """A pointer which is confined to a particular origin."""
  ...

alias SafePointer[T, origin] = Pointer[T] + AlwaysInitPointer[T] +  OriginPointer[origin]

trait MutablePointer[T: AnyType]:
  """A pointer which can produce a mutable reference."""
  
  fn __deref__[is_mutable: Bool, //, origin: Origin[is_mutable]](ref [origin] self) -> ref [origin] T:
    ...

alias SafeMutablePointer[T, origin] = SafePointer[T, origin] + MutablePointer[T]

trait DestructablePointer[T: AnyType]:
  """A pointer which can free itself."""

  fn __del__(owned self):
    ...

alias SafeDestructablePointer[T, origin] = SafePointer[T, origin] + DestructablePointer[T]
alias SafeMutableDestructablePointer[T, origin] = SafeDestructablePointer[T, origin] + MutablePointer[T]

# No idea what this looks like internally
trait LinearPointer[T: AnyType]:
  """A pointer that is a linear type."""
  ...

alias SafeLinearPointer[T, origin] = SafePointer[T, origin] + LinearPointer[T]
alias SafeMutableLinearPointer[T, origin] = SafeLinearPointer[T, origin] + MutablePointer[T]

We may have to add a few more mixins, but you can see that by heavily decomposing the functionality, you can easily express what you want. For instance, a linear type Pointer which isn't always initialized but can produce mutable values and is tied to an origin is Pointer[T] + OriginPointer[origin] + LinearPointer[T] + MutablePointer[T]. If you use a group of them a lot and the stdlib doesn't, you can make an alias, and the compiler can easily check that it is equivalent to or a superset of other people's aliases. It also lets us give nice errors like "FooPointer[T, origin] does not implement AlwaysInitPointer[T], so it is not a SafeMutableLinearPointer[T, origin]".

We might want to take a second pass on whether we want to guarantee that MutablePointer[T] can produce immutable references, or whether we want ImmutablePointer[T], MutablePointer[T], and alias Pointer[T] = ImmutablePointer[T] + MutablePointer[T]. It may mean we want some kind of macros to help manage the combinatorics, but I think that if the compiler can check alias equivalence then we can just define the ones the stdlib uses a lot by hand and let others which commonly show up in the community move in as they are needed. If we encourage library authors to be very minimal about what they use (for instance, a linter warning if they ask for a trait they don't need), that should help. I think a lot of libraries probably only need MutablePointer + T: ... unless they care about destroying things themselves.
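For comparison, the same decomposition can be approximated today in Rust, where a "trait alias" is usually spelled as an empty trait with supertraits plus a blanket impl. This is only an analogue of the Mojo sketch above: the trait names mirror it, `get`/`get_mut` stand in for `__deref__`, `Boxed` is a made-up pointer type, and origins are left out because Rust lifetimes play that role implicitly.

```rust
/// Can hand out an immutable reference to its pointee.
pub trait Pointer<T> {
    fn get(&self) -> &T;
}

/// Can hand out a mutable reference to its pointee.
pub trait MutablePointer<T> {
    fn get_mut(&mut self) -> &mut T;
}

/// Marker trait: the pointee is always initialized.
pub trait AlwaysInitPointer<T> {}

/// "Alias" for the combination: anything implementing all three parts
/// automatically implements this trait through the blanket impl below.
pub trait SafeMutablePointer<T>: Pointer<T> + AlwaysInitPointer<T> + MutablePointer<T> {}

impl<T, P> SafeMutablePointer<T> for P where
    P: Pointer<T> + AlwaysInitPointer<T> + MutablePointer<T>
{
}

/// Hypothetical owning pointer used only to exercise the traits.
pub struct Boxed<T>(pub Box<T>);

impl<T> Pointer<T> for Boxed<T> {
    fn get(&self) -> &T {
        &self.0
    }
}

impl<T> MutablePointer<T> for Boxed<T> {
    fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

impl<T> AlwaysInitPointer<T> for Boxed<T> {}

/// Only needs read access, so it only asks for `Pointer`.
pub fn peek<P: Pointer<i32>>(pointer: &P) -> i32 {
    *pointer.get()
}

/// Needs the full combination, so it asks for the alias.
pub fn bump<P: SafeMutablePointer<i32>>(pointer: &mut P) -> i32 {
    *pointer.get_mut() += 1;
    *pointer.get()
}
```

`peek` accepts any pointer type that can be read, while `bump` rejects at compile time anything missing one of the three parts, which is the "functions only require what they use" property described above.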

I think that we probably want to avoid having UnsafePointer directly implement these interfaces, and instead encourage people to make their own "compile time only" wrappers for whatever behavior they guarantee.
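One way to read "compile time only" wrappers, sketched in Rust as an analogue: a wrapper that stores nothing but the raw pointer and encodes its guarantees in the type. `InitPtr` and `wrapper_is_pointer_sized` are hypothetical names, and the lifetime parameter stands in for a Mojo origin.

```rust
use std::marker::PhantomData;

/// Wraps a raw pointer that is known to be initialized and valid for
/// `'origin`. The guarantee lives in the type; the layout is just the pointer.
#[repr(transparent)]
pub struct InitPtr<'origin, T> {
    raw: *const T,
    _origin: PhantomData<&'origin T>,
}

impl<'origin, T> InitPtr<'origin, T> {
    /// The only constructor takes a reference, so the pointee must already
    /// be initialized and must outlive the wrapper.
    pub fn new(value: &'origin T) -> Self {
        InitPtr { raw: value as *const T, _origin: PhantomData }
    }

    pub fn get(&self) -> &'origin T {
        // SAFETY: `raw` was created from a `&'origin T`, so it is non-null,
        // aligned, initialized, and valid for the whole of `'origin`.
        unsafe { &*self.raw }
    }
}

/// Checks that the wrapper adds no storage on top of the raw pointer.
pub fn wrapper_is_pointer_sized() -> bool {
    std::mem::size_of::<InitPtr<'static, u64>>() == std::mem::size_of::<*const u64>()
}
```

The raw pointer type itself implements nothing; each wrapper opts in to exactly the guarantees it can uphold.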

Sorry if it feels like I'm arguing against all of your proposals. I think that adding a pointer with some runtime checks to be more flexible is a good idea, but the allocator API is really hard and is one of those things that is very hard to get right. Any language which didn't have several months of arguments over the API is going to make a lot of mistakes which I think they will regret. I think you're working in the right areas, but we're running into very hard language design problems and they deserve very careful consideration, as well as thinking about what language features would have major effects on the design of the API and whether we want those features in Mojo.
It's possible I'm letting perfect be the enemy of good enough, but I'm afraid that if we make a temporary solution we won't be able to retract it later if the ecosystem builds on top of it. We want to avoid another DTypePointer if we can.

Don't worry, I actually like this (much better than getting no answer for weeks). It's in discussions like this that better alternatives and good compromises are found IMO. This is an exploratory proposal and I have no problem if it gets rejected; I wanted to give the idea some exposure since I thought it's worth considering. And yes, I agree that this will be fundamental infrastructure and it needs careful consideration.

That's good to hear! I'm trying to push for solutions I think will be able to handle all of the weird things I've ever done or heard of someone doing (within reason), since I want to avoid Mojo 1.0 launching with new stability guarantees to find out that some community can't use a chunk of the standard library because we messed up the API. I know it's hubris to try to say we will get everything right on the first try, but we can at least aim for it and have an edition/epoch/etc mechanism to fall back on.

@martinvuyk

So, once we have trait composition

We kinda already have it (the union part). I think the biggest feature we need ASAP is parametrized traits.

trait Pointer[T: AnyType]:
  """A pointer-like thing which can be dereferenced into an immutable reference."""

  fn __deref__[origin: ImmutableOrigin](ref [origin] self) -> ref [origin] T:
    ...

# Probably just be a marker trait, making a unified constructor is hard.
trait AlwaysInitPointer[T: AnyType]:
  """A pointer which initializes its pointee before returning a reference to the user."""
  ...
  
trait OriginPointer[origin: Origin]:
  """A pointer which is confined to a particular origin."""
  ...

alias SafePointer[T, origin] = Pointer[T] + AlwaysInitPointer[T] +  OriginPointer[origin]

trait MutablePointer[T: AnyType]:
  """A pointer which can produce a mutable reference."""
  
  fn __deref__[is_mutable: Bool, //, origin: Origin[is_mutable]](ref [origin] self) -> ref [origin] T:
    ...

alias SafeMutablePointer[T, origin] = SafePointer[T, origin] + MutablePointer[T]

trait DestructablePointer[T: AnyType]:
  """A pointer which can free itself."""

  fn __del__(owned self):
    ...

alias SafeDestructablePointer[T, origin] = SafePointer[T, origin] + DestructablePointer[T]
alias SafeMutableDestructablePointer[T, origin] = SafeDestructablePointer[T, origin] + MutablePointer[T]

# No idea what this looks like internally
trait LinearPointer[T: AnyType]:
  """A pointer that is a linear type."""
  ...

alias SafeLinearPointer[T, origin] = SafePointer[T, origin] + LinearPointer[T]
alias SafeMutableLinearPointer[T, origin] = SafeLinearPointer[T, origin] + MutablePointer[T]

This is the way to go. It's definitely too ambitious to try to unify all of this into a single API.

In the spirit of keyword bikeshedding 😆, I actually prefer & and | for this (#3252).

Thanks for taking the time to explain things to me 😄.

Labels
needs-discussion Need discussion in order to move forward
3 participants