Consider supporting package aliases / remapping #354

Open
thomashoneyman opened this issue Feb 16, 2022 · 6 comments
Comments

@thomashoneyman
Member

thomashoneyman commented Feb 16, 2022

Every entry in package sets produced by the registry uses the Address type, where a package set is a Map PackageName Address, ie. key/value pairs between package names and addresses:

https://github.com/purescript/registry/blob/561aa941c976d1fda6e9622a658394b5e2924d61/v1/Address.dhall#L11-L14

A registry-produced package set can only use the Registry constructor, so the package set is essentially a list like this:

{ package-name = Registry "1.0.0" }

This indicates that when package-name is listed as a dependency in a project, package managers should fetch package-name from the registry at version 1.0.0, read its dependencies, and fetch all those dependencies following the same process.
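As a rough illustration of the process just described, here is a minimal Python sketch. The `PACKAGE_SET` and `MANIFESTS` dictionaries and the `resolve` function are hypothetical stand-ins, not the registry's data or API; a real package manager would read each manifest from the fetched tarball.

```python
# Hypothetical package set: package name -> pinned registry version.
PACKAGE_SET = {
    "prelude": "1.0.0",
    "bifunctors": "1.0.0",
    "my-lib": "1.0.0",
}

# Hypothetical manifests: package name -> names of its direct dependencies.
MANIFESTS = {
    "prelude": [],
    "bifunctors": ["prelude"],
    "my-lib": ["bifunctors"],
}


def resolve(roots, package_set, manifests):
    """Return {name: version} for the roots and all transitive dependencies."""
    resolved = {}
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in resolved:
            continue
        # The package set pins exactly one version per package name.
        resolved[name] = package_set[name]
        # Read the package's dependencies and repeat the same process for them.
        pending.extend(manifests[name])
    return resolved


plan = resolve(["my-lib"], PACKAGE_SET, MANIFESTS)
```

Here `plan` ends up containing `my-lib`, `bifunctors`, and `prelude`, each at the version the set pins.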

This works just fine in the usual case where you bring in packages from the registry, but there are some times where this becomes inconvenient. Two examples were brought up in the registry call this morning:

  1. You are developing an alternate backend like purerl and you need to use forked versions of common packages like prelude, because you need to re-implement the FFI.
  2. Your company is using the JS backend, but you maintain a custom fork of prelude anyway, and you need prelude in package sets to point to your fork.

How can users deal with these two situations? For example, if you are developing a purerl-compatible package set, where packages using the FFI are forked to their purerl- equivalents (ie. purerl-prelude replacing prelude), then what is the most straightforward way to produce this package set?


The suggestion that was raised in the registry call this morning was to introduce a name field to the Address type for the registry, ie.

 let Address = \(externalPackage : Type) -> 
         < Registry : { name : Text, version : SemVer }
         | External : externalPackage 
         > 

Then, users can remap package names, ie. the prelude could be mapped to purerl-prelude while other packages are left alone:

-- prelude is remapped
{ prelude = { name = "purerl-prelude", version = "1.0.0" }
-- bifunctors is not
, bifunctors = { name = "bifunctors", version = "1.0.0" }
}

When a package manager sees prelude as a dependency in your package, it is meant to fetch purerl-prelude instead. When it sees bifunctors as a dependency, it should fetch that package as usual, and when it sees in the bifunctors manifest that it in turn relies on prelude, the package manager should recall that prelude has been remapped to purerl-prelude and fetch (or cache) that.
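To make the intended behavior concrete, here is a small Python sketch of a package manager applying the remapping at every level of the dependency tree. The data and the `fetch_plan` function are hypothetical; the sketch only assumes that manifests keep using the original names and that the package set decides what actually gets fetched.

```python
# Hypothetical remapped package set: the name used in manifests -> the
# package actually fetched from the registry.
PACKAGE_SET = {
    "prelude": {"name": "purerl-prelude", "version": "1.0.0"},
    "bifunctors": {"name": "bifunctors", "version": "1.0.0"},
}

# Hypothetical manifests, keyed by the fetched package's real name.
MANIFESTS = {
    "purerl-prelude": [],
    "bifunctors": ["prelude"],
}


def fetch_plan(roots, package_set, manifests):
    """Return {fetched name: version}, applying the remapping at every level."""
    fetched = {}
    pending = list(roots)
    while pending:
        requested = pending.pop()
        address = package_set[requested]  # the remapping happens here
        if address["name"] in fetched:
            continue  # already fetched (or cached)
        fetched[address["name"]] = address["version"]
        # Dependencies are still written with their original names.
        pending.extend(manifests[address["name"]])
    return fetched


plan = fetch_plan(["bifunctors"], PACKAGE_SET, MANIFESTS)
```

Asking for `bifunctors` yields a plan containing `bifunctors` and `purerl-prelude`; the unmapped `prelude` is never fetched.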

@thomashoneyman
Member Author

thomashoneyman commented Feb 16, 2022

Above, I've briefly described the suggestion I remember from the registry call -- @f-f, please correct me if I've gotten anything wrong! That said, I have a few reasons why I currently don't think this is a good idea (though I'm open to changing my mind, of course!).

Primarily, I just don't see how adding this remapping ability will really make the package sets any more usable for these two provided use cases. First, I'll describe why that is, and then I'll describe how you might do this without any changes to the registry package sets.

Will this make package sets more viable for alternate backends?

We're talking about changing the Registry constructor, so I'd like to note straightaway that the semantics of this constructor are "fetch the provided package version as a tarball from the registry storage." Any package using the constructor must be in the registry storage, and that means any package being remapped via the constructor must be also.

For example, imagine a purerl implementation of ordered-collections. This purerl-ordered-collections must be in the registry in order to remap ordered-collections to it. In turn, its dependencies must also be in the registry (ie. purerl-prelude must be registered). And of course there is no remapping in manifests; purerl-ordered-collections must depend on purerl-prelude, not on normal prelude. Here's an example manifest:

{ 
  "name": "purerl-ordered-collections",
  "dependencies": {
    "purerl-prelude": ">=1.0.0 <2.0.0",
    "purerl-arrays": ">=1.0.0 <2.0.0"
  }
}

Let's say we've done this: we registered all our purerl packages, and forked any package using the JS FFI so that it has a purerl-* counterpart that we also registered. Now, we want to make our own package set from the official package set by remapping package names.

Any time we find a package in the official set that used JS FFI, we remap it:

let upstream = ...

let overrides =
      { prelude = Registry { name = "purerl-prelude", version = upstream.version }
      , ...
      }

in upstream // overrides

I've set the stage, and now I must ask: what does this actually buy us?

From what I can tell, this buys us essentially nothing: any purerl- package must already point at its purerl- equivalent, because it had to do that in its manifest in order to be registered in the first place. It can't point at the usual prelude in its manifest because then any package manager using a solver instead of package sets will bring in the wrong package and fail to build. Therefore no purerl- package will be affected by remapping.

The only packages that would be affected by remapping are those which have no FFI whatsoever and which will work with any backend. But the purerl packages can point at those packages directly anyway! There's no need to remap them.

Unless I have missed something here, I am not understanding what remapping in the registry package sets can do for us, and so I don't think we should implement it.

Alternate solutions

Returning to the point of all this: how exactly can we ease the process of producing a new package set for these alternate backends or for folks who want to override prelude to point at their custom fork?

My proposal: punt to the package manager! That's what our parametric package sets are for!

If the purerl folks don't want to go through the hassle of registering every one of their forks with a modified dependencies list in the manifest, then they can keep their packages on GitHub and use the externalPkg constructor to add a case for package managers to fetch their packages from that location. The same goes for people who want to override the prelude to point at their custom fork: use the externalPkg constructor to override its location to be GitHub instead of the registry.

@thomashoneyman
Member Author

I suppose I am misrepresenting things here, now that I've typed it out and hit the submit button (of course!).

It's true that you can register all the purerl packages, and have them point at one another, and remapping doesn't get you anything there. But the problem isn't the purerl packages and their dependencies, it's the packages that depend on the purerl packages, like bifunctors.

The bifunctors package doesn't use FFI, but it points at the prelude. You want to be able to say it should point at the purerl-prelude instead. In the legacy package sets you would do this by overriding its dependencies, but that's not possible in the new package sets. Instead, you'd have to use the externalPkg function to add a constructor that lets you fetch bifunctors from the registry or GitHub and specify its dependencies or something like that. Very annoying, especially because you should be able to pull the package from the registry.

I still don't see why the Registry constructor is the place for this, though. This problem affects package managers using solvers, too. In that case you still want to be able to say that bifunctors should ignore its prelude dependency and use purerl-prelude instead.

And so I still think we should punt to package managers, and let package managers provide some mechanism for remapping package names -- something that works for both package sets and solvers. I don't exactly know what it should be, but possibly some kind of alias key could work?

{ name = "my-project"
, alias =
    { prelude = "purerl-prelude"
    , ...
    }
}

This way the manifests and the registry itself are untouched -- everything stays clean here -- but on the client side you can do remapping to make things convenient.
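One way a client could apply such an alias table, sketched in Python. The `apply_aliases` function is hypothetical, and carrying the original version range over to the aliased package is an assumption of this sketch rather than something settled in the discussion.

```python
def apply_aliases(dependencies, aliases):
    """Rewrite dependency names using client-side aliases; ranges are kept."""
    return {aliases.get(name, name): rng for name, rng in dependencies.items()}


# Hypothetical project-level alias table, mirroring the `alias` key above.
ALIASES = {"prelude": "purerl-prelude"}

# Dependencies as they might appear in the untouched bifunctors manifest
# (the range is illustrative).
bifunctors_deps = {"prelude": ">=1.0.0 <2.0.0"}

# The rewritten map is what a solver or package-set lookup would then consume.
rewritten = apply_aliases(bifunctors_deps, ALIASES)
```

Because the rewrite happens on the client before resolution, the same step could feed either a solver or a package-set lookup, and the published manifest is never modified.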

@f-f
Member

f-f commented Feb 16, 2022

> And of course there is no remapping in manifests; purerl-ordered-collections must depend on purerl-prelude, not on normal prelude.

Right, this is really the crux of the issue, and I missed that. Does this mean that purerl packages can't be on the registry as it is today and can only be used via package sets?
Forking all the packages (even if they don't have FFI) just to point them at the purerl equivalents is not a viable option, so I think we need to come up with some way to do such remapping in the manifest itself?

@thomashoneyman
Member Author

thomashoneyman commented Feb 16, 2022

Some other problems:

  • The forked packages can't be in the package set because their module names will conflict
  • All registered packages are pushed to Pursuit, so we're going to end up with a bunch of duplicated code across different backends (ie. HeytingAlgebra from Prelude is going to show up from every prelude package, ditto for all alternate backend packages)

I'm dropping the "in package sets" from the issue title, because this affects more than just package sets.

thomashoneyman changed the title from "Consider supporting package aliases / remapping in package sets" to "Consider supporting package aliases / remapping" on Feb 16, 2022
@colinwahl
Collaborator

I've been putting quite a bit of thought into a namespacing vs aliasing approach to handling support for alternative backends, and although I initially saw a few advantages to using a namespaced approach, I was finding too many corner cases relating to syncing "pure" packages across backends & ensuring package interfaces match. Because of those issues, I think time would be better spent working through the aliasing solution. To get the conversation going, I put some more thought into how an aliasing solution would work:

Before thinking about this, I'd like to write down a few assumptions I am working off of:

  1. It is never valid to have mixed backends within a resolved dependency tree for a package
  2. Packages without any FFI are referred to as "pure" packages
  3. Packages with FFI need to specify a backend and are referred to as "impure"
  4. We keep a list of supported backends & their file extensions to identify if a manifest is "pure" or "impure"

I'd also like to write a few goals:

  1. Pursuit can be partitioned between backends (to avoid repeated definitions in packages implementing the prelude interface from cluttering search results)
  2. Avoid forking pure packages as much as possible for alternate backends

The Registry will support a list of known backends, perhaps in a format like the following. These will help us determine whether a package has FFI and, if so, which backend it is targeting. I'll call the default JS backend "purs" for now.

{ backend: "purs", extension: ".js" }
{ backend: "purerl", extension: ".erl" }
{ backend: "purescm", extension: ".scm" }
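A minimal Python sketch of how such a list could be used to tell "pure" from "impure" packages by file extension. The function names are hypothetical, and rejecting a package whose files span more than one backend is an assumption of this sketch (the assumptions above only rule out mixed backends in a resolved dependency tree).

```python
from pathlib import PurePosixPath

# Mirrors the backend list above; "purs" is the thread's placeholder name.
KNOWN_BACKENDS = {".js": "purs", ".erl": "purerl", ".scm": "purescm"}


def detect_backends(files):
    """Return the set of backends whose FFI file extensions appear in `files`."""
    return {
        KNOWN_BACKENDS[PurePosixPath(path).suffix]
        for path in files
        if PurePosixPath(path).suffix in KNOWN_BACKENDS
    }


def classify(files):
    """Return "pure", or the single backend an "impure" package targets."""
    backends = detect_backends(files)
    if not backends:
        return "pure"
    if len(backends) > 1:
        # Assumption: a single package may not mix FFI for several backends.
        raise ValueError(f"mixed backends: {sorted(backends)}")
    return backends.pop()
```

For example, a package containing only `.purs` sources classifies as "pure", while one that also ships an `.erl` file classifies as "purerl".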

To accomplish supporting alternate backends, I'd propose extending the existing Manifest type, with rules about what makes a new package valid.

newtype Manifest = Manifest
    { name :: PackageName
    , owners :: Maybe (NonEmptyArray Owner)
    , version :: Version
    , license :: License
    , location :: Location
    , description :: Maybe String
    , files :: Maybe (Array String)
    , dependencies :: Map PackageName Range
    
    -- New field:
    , backendMetadata :: Maybe BackendMetadata
    }
    
data BackendMetadata
  = PackageMetadata
      { backend :: String
      , aliases :: Map PackageName PackageName
      }
  | PackageAliasMetadata
      { backend :: String
      , aliases :: Map PackageName PackageName
      , revision :: Int
      , replaces :: PackageName
      }
      

Rules:

  1. "pure" packages have backendMetadata: Nothing, and the set of all their transitive dependencies must come from the same backend (not necessarily the default "purs" backend)
  2. "impure" packages which are meant as a replacement for a different package in another backend (e.g. purescript-prelude vs purerl-prelude) have backendMetadata: Just (PackageAliasMetadata ...). The revision field is there to support new releases outside of tracking the versions of the replaced package should bug fixes need to happen.
  3. Otherwise "impure" packages have backendMetadata: Just (PackageMetadata ...).
  4. In cases (2) and (3), you must provide aliases for every "impure" transitive dependency that targets the same backend. Because dependencies specify version ranges, this covers every transitive dependency that could possibly be selected, not just one resolved set.
  5. Packages which are replacements in (2) must only release versions that have already been released by the replaced package, and their public APIs must be identical.
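Two of these rules lend themselves to simple mechanical checks. The Python sketch below is only illustrative: it collapses the two `BackendMetadata` constructors into one hypothetical dataclass (with `replaces` and `revision` set only for replacement packages), and the check functions are not part of any existing registry code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BackendMetadata:
    backend: str
    aliases: Dict[str, str] = field(default_factory=dict)
    # Set only for replacement packages (the PackageAliasMetadata case).
    replaces: Optional[str] = None
    revision: Optional[int] = None


def check_replacement_version(version, replaced_versions):
    """Rule 5: a replacement may only release versions the replaced package has."""
    return version in replaced_versions


def check_single_backend(tree):
    """Rule 1: `tree` maps package name -> BackendMetadata, or None if "pure".

    Returns True when all impure packages in the tree target one backend.
    """
    backends = {meta.backend for meta in tree.values() if meta is not None}
    return len(backends) <= 1


purerl_prelude = BackendMetadata(backend="purerl", replaces="prelude", revision=0)
tree_ok = check_single_backend({"bifunctors": None, "purerl-prelude": purerl_prelude})
```

Checking that public APIs are identical (the second half of rule 5) is not covered here; it needs interface information that this sketch does not model.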

There are a few workflows I can see happening that I'd like to talk through the behavior of:

  1. What happens when someone uploads a package that is pure initially, but then adds FFI in a later version?
  2. What happens when a package initially has FFI but then is pure in later versions?
  3. How do we ensure aliased package interfaces are the same? Do we allow them to diverge in any special ways?
  4. How do we partition pursuit to display results for different backends?
  5. How do we insert package alias version revisions if the registry index is keyed on version?

I think I have some rough answers to a few of them:

  1. This is fine unless it is a dependency of a package that is implemented for an alternate backend. In that case, this needs to be considered a breaking change & the package will need to be forked in order to support the alternate backend.
  2. This is fine.
  3. I'm not sure the best way to verify this - probably by checking externs somehow.
  4. This isn't as clear to me at this point. The trouble is we have to be able to upload docs for "pure" packages, but to do that we need to resolve their "impure" dependencies to be fulfilled by that backend.
  5. I'm not sure, we probably need to push the revision up & use it in the key.

@f-f / @thomashoneyman, I'd be happy to get your thoughts on this - does it seem to be a step in the right direction?

@f-f
Member

f-f commented Mar 16, 2022

I'll briefly sketch a solution to the open questions:

  1. it's fine, and it could work as you describe
  2. yep, fine
  3. we could check the externs, and allow backend packages to have a "wider" interface than the alias root
  4. we'd either have separate instances of Pursuit, or have some switch to pick the right backend
  5. in the same vein, we could have a whole new registry index for each backend
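For point 3, the "wider interface" idea can be sketched as a subset check. This Python example assumes each package's public interface has already been flattened into a dictionary from exported name to a type string; real externs files are compiler artifacts with a richer format, so the representation and the `missing_or_changed` function are purely hypothetical.

```python
def missing_or_changed(root, alias):
    """Return names the alias lacks or exposes with a different type."""
    return sorted(name for name, ty in root.items() if alias.get(name) != ty)


def interface_compatible(root, alias):
    """True when the alias offers everything the root does; extras are allowed."""
    return not missing_or_changed(root, alias)


# Illustrative interfaces only; the type strings are placeholders.
root_prelude = {"identity": "forall a. a -> a", "unit": "Unit"}
wider_alias = {"identity": "forall a. a -> a", "unit": "Unit", "erlOnly": "Int"}
narrower_alias = {"identity": "forall a. a -> a"}

ok = interface_compatible(root_prelude, wider_alias)
problems = missing_or_changed(root_prelude, narrower_alias)
```

The wider alias passes because extra exports are ignored, while the narrower one is reported as missing `unit`.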

However, I think the scope for this thing is too big right now to include it in the alpha, and we should move on with the alpha if we are confident that this won't heavily disrupt the current design (which I'm personally confident about)
