Document ForeignConvertible #1073

Merged
kyouko-taiga merged 4 commits into main from foreign-convertible-docs
Nov 22, 2023

Conversation

kyouko-taiga
Contributor

@kyouko-taiga kyouko-taiga commented Oct 10, 2023

Fix #993

Only the commits from 0649023 are relevant to this PR. I don't know why the others show up.

@dabrahams
Collaborator

dabrahams commented Oct 25, 2023

@kyouko-taiga on my machine git log origin/main..origin/foreign-convertible-docs shows that at least a7b0526 is also relevant

@dabrahams
Collaborator

dabrahams commented Oct 25, 2023

graphing origin/main and origin/foreign-convertible-docs with gitk shows this:

[image: gitk graph of origin/main and origin/foreign-convertible-docs]

I tried for about a half hour to describe a rule that would explain why 97b1ee9 is in the list but b2d35af is not, and failed. Then I decided to try this and my conclusion was that it probably has to do with what was in main at the moment the PR was created. So I bet if you delete this PR and re-create it, they'll go away. Yup. A less-destructive alternative, probably, is just to merge the current main into this branch.

@kyouko-taiga
Contributor Author

> A less-destructive alternative, probably, is just to merge the current main into this branch.

That worked. Thanks!

Collaborator

@dabrahams dabrahams left a comment

I still don't get it, sorry. Please see comments inline.

@@ -1,13 +1,63 @@
-/// A type that can be converted to and from an a foreign representation.
+/// A type whose values can be converted to and from a representation suitable for crossing a
+/// language boundary.
Collaborator

I still don't understand what that means. Why would any type's representation in Hylo be unsuitable for crossing a language boundary?

Contributor Author

@kyouko-taiga kyouko-taiga Oct 26, 2023

If we're talking about C and taking "suitable" to just mean "I can read bytes of the thing", then indeed no type representation is unsuitable. With this narrow interpretation we could just send a pointer to the raw bytes of every Hylo object to the foreign language and let it deal with it.

If we're trying to describe more useful abstractions, then one may need a way to explain how one can look at a Hylo object through the lens of another language's type system. For example, take Union<Pointer<Int>, Pointer<Float64>>. The Hylo representation of this type will likely be a single 64-bit integer but it won't be very useful to say that it's just being presented as char[8] in C. You wouldn't know where the discriminator is, or even how to read the discriminator. We can standardize this information (though I'm not sure we'd want to) but that would still mean the foreign language has to do the work of reconstructing the abstraction.

So I would claim that Union<Pointer<Int>, Pointer<Float64>> is not suitable to be represented in C. What we want is a representation that already makes sense "as is" w.r.t. the abstraction that it represents. Builtin.i64 fits the definition because int64_t makes sense "as is" in C.
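
To illustrate why a bare `char[8]` is opaque to C, here is a minimal sketch. It assumes a purely hypothetical encoding in which the union is packed into one 64-bit word with the discriminator in the lowest bit, which only works if both pointee types are at least 2-byte aligned. Nothing in this thread says Hylo uses this encoding, and `PackedUnion`, `pack_int_pointer`, `pack_float_pointer`, `packed_is_float`, and `packed_address` are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packed form (an assumption, not Hylo's actual layout):
   bit 0 is the discriminator, the remaining bits are the address. */
typedef uint64_t PackedUnion;

static PackedUnion pack_int_pointer(int64_t *p) {
    assert(((uintptr_t)p & 1u) == 0);   /* the low bit must be free */
    return (uint64_t)(uintptr_t)p;      /* tag 0: pointer to integer */
}

static PackedUnion pack_float_pointer(double *p) {
    assert(((uintptr_t)p & 1u) == 0);   /* the low bit must be free */
    return (uint64_t)(uintptr_t)p | 1u; /* tag 1: pointer to float */
}

/* Without knowing the convention above, C code cannot write these two. */
static bool packed_is_float(PackedUnion u) {
    return (u & 1u) != 0;
}

static void *packed_address(PackedUnion u) {
    return (void *)(uintptr_t)(u & ~(uint64_t)1);
}
```

The decoding functions are trivial once the convention is known; the point is that the eight bytes alone do not carry it.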

Collaborator

@dabrahams dabrahams Oct 26, 2023

> If we're trying to describe more useful abstractions, then one may need a way to explain how one can look at a Hylo object through the lens of another language's type system. For example, take Union<Pointer<Int>, Pointer<Float64>>. The Hylo representation of this type will likely be a single 64-bit integer but it won't be very useful to say that it's just being presented as char[8] in C. You wouldn't know where the discriminator is, or even how to read the discriminator.

This is a fantastic example. Nothing in this API seems to me like it's going to help your C function with that problem. You need, at the very least, some C declarations. The most obvious, and the one I think you're aiming for, is a type declaration, but it could be function declarations; see below.

> So I would claim that Union<Pointer<Int>, Pointer<Float64>> is not suitable to be represented in C.

Disagreed.
We can certainly create a C struct type—containing a boolean discriminator and a union—that represents the same notional type, which would constitute, “representing Union<Pointer<Int>, Pointer<Float64>> in C.” I think you mean that the data layout is not suitable for C, but even the data layout can be suitable for C if you supply C with a set of functions for accessing the basis operations of the type. I don't believe the right answer for every Hylo type is to serialize/deserialize it into a different representation that is in some sense already known to the other language when passing it across an FFI boundary.
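
A minimal sketch of such a C struct, with a boolean discriminator, a union, and a few basis operations. It assumes Hylo's `Int` corresponds to `int64_t` and `Float64` to `double`; the type and function names are hypothetical and do not come from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C counterpart of Union<Pointer<Int>, Pointer<Float64>>. */
typedef struct {
    bool is_float;            /* false: int_ptr is active; true: float_ptr is */
    union {
        int64_t *int_ptr;
        double *float_ptr;
    } payload;
} IntOrFloatPointer;

static IntOrFloatPointer from_int_pointer(int64_t *p) {
    IntOrFloatPointer u;
    u.is_float = false;
    u.payload.int_ptr = p;
    return u;
}

static IntOrFloatPointer from_float_pointer(double *p) {
    IntOrFloatPointer u;
    u.is_float = true;
    u.payload.float_ptr = p;
    return u;
}

/* Basis operations: each accessor requires that its case is the active one. */
static int64_t *as_int_pointer(IntOrFloatPointer u) {
    assert(!u.is_float);
    return u.payload.int_ptr;
}

static double *as_float_pointer(IntOrFloatPointer u) {
    assert(u.is_float);
    return u.payload.float_ptr;
}
```

C code that only goes through the constructors and accessors never depends on where the discriminator lives, which is the sense in which a set of functions can make a layout usable from C.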

An experience of seamless interop is very nice, but there are lots of ways of approaching it. I think the above shows the framing you're using to describe the problem to me is incomplete or confused. Getting that right is a prerequisite to creating the right API and documenting it properly.

+/// A function declaration with the `@ffi` attribute introduces a foreign function interface (FFI),
+/// an entity whose implementation is defined externally, typically in a different programming
+/// language. Because this other language may not understand the layout of Hylo types, some glue
+/// code has to be written to adapt the representations of values crossing the language boundary.
Collaborator

@dabrahams dabrahams Oct 26, 2023

That seems wrong to me. Any language with sufficiently low-level access to memory can decode any data layout.

I think you're trying to say something else entirely. Perhaps something like:

Foreign language X has a datatype Y that notionally corresponds to Hylo's datatype Z, but may not share the same layout. We want to be able to syntactically pass a Z directly to an X function fun f(_: Y), with an implicit conversion from Z to Y mediated by ForeignConvertible.
?

If that interpretation is roughly correct (and I am far from confident that it is), this is an unnecessary syntactic sugar feature because we could always explicitly convert every Z to a Y instead of having that conversion happen implicitly. Moreover, I'm somewhat concerned about what mischief may be hidden behind that implicit code. Does it amount to an implicitly-generated copy in some cases? Why wouldn't we want the Y to be a projection from the Z instead of a returned value?

The rest of the comment seems to be about details of the mechanism, but what remains unaddressed for me is the motivation for having this thing in the first place.

Contributor Author

@kyouko-taiga kyouko-taiga Oct 26, 2023

> That seems wrong to me. Any language with sufficiently low-level access to memory can decode any data layout.

As I said above, I don't think that's the right question to ask. You can read raw bytes with pretty much any language, but a good FFI solution should at least help you do it properly.

> If that interpretation is roughly correct [...]

I think it is roughly correct. I would add that layout, if it means how fields are laid out in a struct, is not the only thing to take into account. For example, we also have to care about the way one may represent a union in the foreign language because it may not agree with Hylo's approach.

> this is an unnecessary syntactic sugar feature because we could always explicitly convert every Z to a Y instead of having that conversion happen implicitly

Yes, except that it will make you expose Builtin to everyone wanting to use a C FFI like fdopen, and have a way to expose the built-in value wrapped in Int, Pointer, etc. In other words, what is today let stdout = fdopen(1, "w") would become let stdout = MemoryAddress(base: fdopen(1.value, "w".utf8.base)) (and it will get worse once utf8 is no longer just a pointer).
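
For reference, the C side of this example is POSIX `fdopen`, declared as `FILE *fdopen(int fd, const char *mode)`: it takes a plain `int` and a `const char *`, which is what the Hylo arguments have to become before the call. The sketch below only exercises that C signature; the helper name `write_through_fdopen` is hypothetical, and the code assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fdopen and dup in strict ISO C modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Writes `text` to a stream opened over a duplicate of `fd`.
   Returns 0 on success and -1 on failure. */
static int write_through_fdopen(int fd, const char *text) {
    assert(text != NULL);
    int copy = dup(fd);                /* keep the caller's descriptor open */
    if (copy < 0) return -1;
    FILE *stream = fdopen(copy, "w");  /* int and const char * in, FILE * out */
    if (stream == NULL) {
        close(copy);
        return -1;
    }
    int status = (fputs(text, stream) == EOF) ? -1 : 0;
    if (fclose(stream) != 0) status = -1;  /* fclose also closes `copy` */
    return status;
}
```

Calling `write_through_fdopen(1, "hello\n")` writes to standard output, mirroring the `fdopen(1, "w")` call discussed here.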

Of course you can wrap this boilerplate in the standard library, e.g. LibC.fdopen, but that is still busy boilerplate that my trait can eliminate rather elegantly. Plus, the boilerplate would come back for any other FFI our users may want to use.

Think about the convenience of FFIs in Swift. You import a C header and you get a beautiful Swift function fdopen(_: Int, _: UnsafePointer<CChar>) -> UnsafePointer<FILE>. This convenience is entirely due to compiler magic, though. I want a similar feature in Hylo without hardcoding type translations in the compiler.

> Does it amount to an implicitly-generated copy in some cases?

That is a valid concern. I thought about it and concluded that it was okay to "pay" for a copy when you use an FFI because you probably can't trust it to uphold the rights/duties of all Hylo's passing conventions anyway. So foreign functions (the actual ones, not the FFI generated around it) take everything with a sink convention.

We can revisit this choice later but since the only crossing types for now are built-in numerics and pointers, a copy is the best strategy anyway.

Collaborator

> Yes, except that it will make you expose Builtin to everyone wanting to use a C FFI like fdopen,

I don't see any reason why exposing Builtin should be necessary, and what you say after "in other words" shows no exposure of Builtin.

> Think about the convenience of FFIs in Swift.

It's lovely.

If this is just about how to make something with the semantics of let stdout = fdopen(1, "w") pretty, I think there are lots of possible approaches, and I'm not at all convinced we have the right one here. And, related to this PR, we still don't have a successful description of the meaning of this protocol. There's a broad spectrum of possible FFI functions and I strongly doubt we have enough examples in front of us to design the right interface. Note that I did something like what you've written for Boost.Python years ago so I understand the logic that leads to a design like this, but that's a much more constrained problem in some ways than what this API purports to address—and even that has dimensions you haven't accounted for, like coupling of lifetimes when a C++ function returns a reference to a part of a parameter.

> I thought about it and concluded that it was okay to "pay" for a copy when you use an FFI because you probably can't trust it to uphold the rights/duties of all Hylo's passing conventions anyway.

IMO you are talking about two orthogonal concerns (safety and performance), which in turn are orthogonal to my concern, which is about the hidden-ness of the copy.

I would much prefer to start out by having FFIs be uglier than we'd like, and to explore the use cases extensively, before we decide how they should be addressed.

Contributor Author

It seems you're pushing very hard against an implemented, tested, and used feature only because you don't understand how it works and/or because it doesn't solve all possible FFI problems we can think of. Yes, it doesn't address C++ lifetimes, talk to the GC of a JVM, or interact with the reference counter of the Swift runtime. Many of these problems are open research questions that I don't even try to tackle.

It's pretty clear to me what this trait does, how it works, and why/when/how I would declare additional conformances. I did my best to describe it. I may have failed, but that doesn't mean the trait is not useful. It currently does the job I want done: programmatically explain to the compiler how to translate Hylo.Int to the corresponding type in a C function.

I don't want to remove this trait unless we have an equally concise way to call an FFI that doesn't require hardcoding translations in the compiler. If you have a better approach that fits these constraints, I'm happy to merge a PR.

Collaborator

@dabrahams dabrahams Oct 26, 2023

> It seems you're pushing very hard against an implemented, tested, and used feature only because you don't understand how it works.

That's a mischaracterization. I understand how it works perfectly. I'm pushing back because it adds complexity and you can't explain to me why it's needed and how it solves any real problems. IIRC another contributor expressed a similar concern in one of our meetings, but I forget who, so it's not just me. Ints and pointers don't need any special translation to work with C: the FFI C function, for which we are manually writing a Hylo declaration, can simply be declared to take an Int or Pointer. Your technique forces me to declare another type to translate my type into/out of (or could I actually translate the type to itself?), and to create a conformance, and it all seems like needless complication. What's the benefit?

You haven't explained how this is going to actually solve FFI problems for any types that you might want to be presented to C with a different layout, and we don't have any actual examples of those today that we can use to validate this abstraction. It seems to me that when we run across such a thing we will still need to declare a Hylo type that has the correct layout and define translations to and from that type, which can be as syntactically lightweight as .to_c and .to_hylo property accesses, which is the same amount of work as a trait conformance, generalizes well to multiple languages, doesn't create any implicit conversions, and works even if multiple C types need to map into the same Hylo type (that is likely to happen for C integer types). Moreover, such types are incredibly rare. They only occur when you have both an existing Hylo layout and a different existing C layout for a given notional type, both of which are already in use in their respective languages. Otherwise whichever language is new to the notional type could simply consume the other language's layout.
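
The many-to-one case mentioned for C integer types can be sketched in C with explicit conversion functions. This is only an illustration under stated assumptions: `HyloInt` is a hypothetical stand-in for Hylo's `Int`, assumed to be 64 bits wide, and the function names are invented and are not the `.to_c` and `.to_hylo` properties themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Hylo's Int, assumed to be 64 bits wide. */
typedef int64_t HyloInt;

/* Widening conversions: several C integer types map into the same HyloInt. */
static HyloInt hylo_int_from_c_int(int x) {
    return (HyloInt)x;
}

static HyloInt hylo_int_from_c_long(long x) {
    return (HyloInt)x;
}

/* Narrowing conversion: explicit at the call site, and it reports failure
   instead of silently truncating. */
static bool c_int_from_hylo_int(HyloInt x, int *out) {
    assert(out != NULL);
    if (x < INT_MIN || x > INT_MAX) return false;
    *out = (int)x;
    return true;
}
```

Because each direction is a separate named function, nothing forces a single foreign representation per type, and every conversion is visible where it happens.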

Yes, when we can import C and C++ headers (both features are a long way off), we will want to automatically map the types declared in those languages into a type consumable by Hylo, and it's important to be able to do the layout-compatible mappings to types already in the standard library (e.g. Int-into-Int) automatically. But for that job, the trait goes in the wrong direction: the compiler needs to look up a Hylo type based on the C type, not the other way around. With this trait you'd still need some way to relate C's int to Int.ForeignRepresentation, which is essentially the type translation Swift is hardcoding in the compiler, and that you want to avoid. Having this protocol doesn't avoid that AFAICT.

User- and library-defined types declared in C will need to be mapped into automatically synthesized (layout-compatible) Hylo counterparts from the imported C module. It's only the rare case where you have the same notional type with different layouts that this trait even becomes relevant as far as I can tell, and the idea that we ever want to do that layout translation silently is still extremely questionable.

> I don't want to remove this trait unless we have an equally concise way to call an FFI that doesn't require hardcoding translations in the compiler. If you have a better approach that fits these constraints, I'm happy to merge a PR.

I'm pretty sure that, as noted above, this trait doesn't accomplish what you say w.r.t. hardcoding. I'm happy to write a PR that removes all the mechanism, but I don't know how to solve that hardcoding problem today, and since we aren't ready to use a solution to it until we're importing C headers, I don't want to try. IMO interop beyond manual redeclaration of foreign functions is a complex problem that deserves more attention than we can give it right now.

@kyouko-taiga
Contributor Author

kyouko-taiga commented Nov 22, 2023

Although it's been decided that we should do without ForeignConvertible or a similar trait until we can figure out a robust solution to C interop, this PR accurately describes the purpose and behavior of a functionality currently in use and so I think it should be merged.

An updated implementation of C interop should be submitted as a separate PR.

@kyouko-taiga kyouko-taiga merged commit 1d211bb into main Nov 22, 2023
10 checks passed
@kyouko-taiga kyouko-taiga deleted the foreign-convertible-docs branch November 22, 2023 21:48

Successfully merging this pull request may close these issues.

Better document ForeignConvertible
2 participants