API access architecture #100
Comments
This transition will not happen in a single step, so we should file separate tickets to help with each step of the process.
Interestingly, GraphQL's best practices recommend not versioning. Three reasons that bring up versioning:
Further discussion that recommends an add-only approach to APIs: graphql/graphql-spec#134
It appears we also get subscriptions for free with GraphQL. http://graphql.org/blog/subscriptions-in-graphql-and-relay/
Columns on database models need additional annotations beyond what the database provides (data type, nullable, etc). Specifically, these three:
Example: Lastuser's Since
#150 describes a new workflow layer that's baked into the model. Since models are the principal objects passed around in business logic, it makes sense to host state management within the model.
Coaster needs to provide the foundation for API-based access to HasGeek apps.
Our current approach has tight coupling between view functions and the rendered output, whether as HTML or JSON. It also assumes a front-end and back-end developed in sync with each other, in the same repository. These assumptions change when we have a single page application (SPA) that may be long lived in the browser, going out of sync with back-end deployments. It gets worse with native apps, which can be out of sync by weeks or months.
To decouple front-end and back-end, we need some changes:
Long-lived endpoints that guarantee an API regardless of the actual data model. This can be via three approaches:
Distinct, versioned URLs in the form /api/<version>/<method> where each version can have a distinct calling pattern.
A REST API where the URL is the same across versions, and the version is selected via an explicit HTTP header. GitHub does this via the Accept header.
A hybrid model where some URLs are explicitly versioned and, within each, further customisation is possible via the Accept header. Coaster's render_with facilitates this approach.
As a necessary outcome of the previous, views are now wrappers around a lower layer that handles actual business logic. This is the workflow layer. Coaster provides a docflow module for this. However, Docflow's architecture hasn't been tested with a non-trivial app and could do with more attention. It is currently only used with Kharcha, which exposed some limitations. (Update: StateManager was introduced in #150, "LabeledEnum helper property to replace Docflow", and Docflow is now deprecated, pending removal once the module is moved to Kharcha.)
The back-end can also be an API consumer, especially as we move to distributed data storage. Lastuser provides an OAuth-based permission request and grant workflow that allows one app to request access to resources hosted in another app. However, OAuth is limiting as it recognises the notion of a type of resource and an action on it, but not a specific resource. For example, rather than grant access to one specific jobpost in Hasjob, the user can only grant access to all jobposts in Hasjob. Google's libmacaroons provides a framework for addressing this, but we need to build a workflow around it.
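To make the hybrid versioning model concrete, here is a minimal sketch in plain Python. It is not how Coaster's render_with works; the registry, the `register` and `dispatch` helpers, the `jobpost` method and the `application/vnd.example.full+json` media type are all hypothetical names used for illustration. It also matches the Accept header by exact string, where a real implementation would parse multiple media types and q-values.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry keyed by (URL version, method name, media type).
Handler = Callable[[dict], dict]
HANDLERS: Dict[Tuple[str, str, str], Handler] = {}

DEFAULT_MEDIA_TYPE = "application/json"
FULL_MEDIA_TYPE = "application/vnd.example.full+json"  # made-up media type


def register(version: str, method: str, media_type: str = DEFAULT_MEDIA_TYPE):
    """Register a handler for one version/method/media type combination."""
    def decorator(func: Handler) -> Handler:
        HANDLERS[(version, method, media_type)] = func
        return func
    return decorator


@register("v1", "jobpost")
def jobpost_v1(post: dict) -> dict:
    # v1 promised a "title" key, whatever the model calls it today.
    return {"title": post["headline"]}


@register("v1", "jobpost", FULL_MEDIA_TYPE)
def jobpost_v1_full(post: dict) -> dict:
    # Same URL version, customised further via the Accept header.
    return {"title": post["headline"], "location": post["location"]}


@register("v2", "jobpost")
def jobpost_v2(post: dict) -> dict:
    return {"headline": post["headline"], "location": post["location"]}


def dispatch(path: str, accept: str, payload: dict) -> dict:
    """Route /api/<version>/<method>, then refine by Accept header."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "api":
        raise LookupError("Not an API path: " + path)
    _, version, method = parts
    handler = HANDLERS.get((version, method, accept))
    if handler is None:
        # Unknown media type: fall back to the version's default rendering.
        handler = HANDLERS.get((version, method, DEFAULT_MEDIA_TYPE))
    if handler is None:
        raise LookupError("No handler for " + path)
    return handler(payload)
```

The point of the sketch is that every handler receives the same underlying data, so the business logic can live in a lower layer while each version only decides how to present it.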
Another concern with decoupled front-ends and back-ends is that a front-end may have a data requirement that the back-end API does not provide. We have seen this with Funnel's JSON API to the Android app, where the API is too limiting and results in unnecessarily verbose data transfer. Since the projects are separately maintained, having requirements synchronised is a challenge. One approach to this problem is to expose a query API and require the front-end to have an intimate knowledge of the required data model. Facebook's GraphQL is a viable candidate.
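As a rough illustration of what a query API buys us, the sketch below lets the caller name the fields it wants and returns only those. This is not GraphQL and uses no GraphQL library; the `select` function and the sample data are hypothetical, and the selection is expressed as a nested dict instead of a query string.

```python
def select(data: dict, fields: dict) -> dict:
    """Return only the requested fields from a nested dict.

    `fields` maps each wanted key to None (a leaf value) or to another
    selection dict (a nested object or a list of nested objects).
    """
    result = {}
    for key, sub in fields.items():
        value = data[key]
        if sub is None:
            result[key] = value
        elif isinstance(value, list):
            result[key] = [select(item, sub) for item in value]
        else:
            result[key] = select(value, sub)
    return result


# Hypothetical back-end data for one event.
event = {
    "title": "Example Conf",
    "description": "A long description the client does not need",
    "sessions": [
        {"title": "Keynote", "speaker": "A", "room": "Hall 1"},
        {"title": "Workshop", "speaker": "B", "room": "Hall 2"},
    ],
}

# The front-end asks only for what it will render.
compact = select(event, {"title": None, "sessions": {"title": None}})
```

The trade-off is the one described above: the front-end must know the shape of the data model in order to ask for it.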
GraphQL introduces a new problem. If we link it to SQLAlchemy, we risk exposing sensitive data to a third party. This has been a known problem with Ruby on Rails and automatic form construction. In HasGeek apps we always wrap db model write access with a form. However, this is inadequate:
Read access isn't wrapped. A view can still accidentally expose data the caller isn't authorised to receive. Coaster's permission model (provided in sqlalchemy.PermissionMixin and enforced in views.load_models) only checks for the authorisation to call a view. We do not have any mechanism to define what attributes a caller is authorised to receive. This weakness is visible in Funnel's JSON API, which has a bunch of if conditions to determine if data should be included in the results. We need an equivalent to PermissionMixin that specifies the conditions for read and write access to attributes on the model.
Forms are shallow, providing all-or-nothing access to each attribute of a db model. In the case of relationships, which represent nested attributes (the so-called "document" model of document databases, rather than the flat row model of SQL databases), there is no established way to represent this data. JSON Models are an option, but will require explicit specification separate from the model, as we currently do for forms. This increases the effort required to spec out a new data model.
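One possible shape for attribute-level access control is sketched below. This is an assumption about what such a mixin could look like, not an existing Coaster API: `AttributeAccessMixin`, `__attr_access__`, `to_dict` and `update_from` are hypothetical names, and a plain Python class stands in for a SQLAlchemy model to keep the example self-contained.

```python
class AttributeAccessMixin:
    """Hypothetical mixin: declares which roles may read or write which attributes."""

    # Maps role name -> {"read": set of attrs, "write": set of attrs}.
    __attr_access__ = {}

    def _allowed(self, roles, action):
        allowed = set()
        for role in roles:
            allowed |= self.__attr_access__.get(role, {}).get(action, set())
        return allowed

    def to_dict(self, roles):
        """Serialise only the attributes the given roles may read."""
        return {name: getattr(self, name)
                for name in sorted(self._allowed(roles, "read"))}

    def update_from(self, roles, **values):
        """Apply writes only if every attribute is writable by the given roles."""
        denied = set(values) - self._allowed(roles, "write")
        if denied:
            raise PermissionError("Not writable: " + ", ".join(sorted(denied)))
        for name, value in values.items():
            setattr(self, name, value)


class JobPost(AttributeAccessMixin):
    __attr_access__ = {
        "all": {"read": {"headline"}},
        "owner": {"read": {"headline", "email"}, "write": {"headline"}},
    }

    def __init__(self, headline, email):
        self.headline = headline
        self.email = email


post = JobPost("Python dev", "hr@example.com")
public_view = post.to_dict({"all"})           # no email for anonymous callers
owner_view = post.to_dict({"all", "owner"})   # owner also sees email
```

With the rules declared on the model, a JSON view no longer needs its own if conditions; it passes the caller's roles and serialises whatever comes back.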
SQLAlchemy models are also our source of truth, superseding the actual backing database. When in doubt, we regenerate the database from the models and reimport data. Database migrations must always produce a result that matches the model definition. This unfortunately means that there is no schema versioning if we use GraphQL to directly expose SQLAlchemy. We have opposing constraints here that force a compromise layer:
The database can never be in two states. Database consistency cannot be compromised.
An API consumer can never be cut off without notice.
A wrapping layer is required. We can bake this into the SQLAlchemy models with attributes that wrap other attributes (using SQLAlchemy's own synonym and hybrid_property features, perhaps), but this will add layers of cruft to the model. It'll help if we can separate these out and explicitly mark their reasoning and maintenance window.
SQLAlchemy can't do cross-database relationships. For example, if we restrict the User, Organization and Team models to Lastuser, only storing UUID foreign keys in other apps (thereby removing these tables in all apps), an attribute like JobPost.user (in Hasjob) cannot be populated by SQLAlchemy. We will need another layer that populates this via RPC to Lastuser.
There may be a case for public vs private APIs, the latter restricted to a more tightly coordinated HasGeek team. A public API could be exposed over HTTPS while a private API requires local network access via AMQP, for example. While such private APIs will be more performant and have lower maintenance overheads, this approach has two consequences:
It's a break from our commitment (so far) to using the same APIs we expose to everyone else.
Our own native apps can't access the private API. They still need to go through the public API, which means the maintenance overheads remain. The private APIs only provide a performance boost for back-end data gathering.
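Returning to the cross-database relationship problem raised earlier (JobPost.user when User lives only in Lastuser), the extra layer could be a descriptor that resolves a stored UUID on first access. The sketch below is an assumption about how that might look: `RemoteReference` and `fetch_user_from_lastuser` are hypothetical, and an in-memory dict stands in for the RPC call, whether that runs over HTTPS or AMQP.

```python
import uuid

# Stand-in for Lastuser's data; a real fetch would be an RPC call.
ALICE_ID = uuid.uuid4()
FAKE_LASTUSER = {ALICE_ID: {"fullname": "Alice"}}
RPC_CALLS = []  # records each simulated round trip


def fetch_user_from_lastuser(user_id):
    """Hypothetical RPC: look up a user by UUID in Lastuser."""
    RPC_CALLS.append(user_id)
    return FAKE_LASTUSER.get(user_id)


class RemoteReference:
    """Descriptor that turns a locally stored UUID into a remote object.

    The result is cached on the instance so repeated access does not
    trigger repeated remote calls.
    """

    def __init__(self, id_attr, fetch):
        self.id_attr = id_attr
        self.fetch = fetch

    def __set_name__(self, owner, name):
        self.cache_attr = "_cached_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cache = instance.__dict__
        if self.cache_attr not in cache:
            remote_id = getattr(instance, self.id_attr)
            cache[self.cache_attr] = (
                None if remote_id is None else self.fetch(remote_id)
            )
        return cache[self.cache_attr]


class JobPost:
    # Only the UUID is stored locally; the user record lives elsewhere.
    user = RemoteReference("user_id", fetch_user_from_lastuser)

    def __init__(self, headline, user_id):
        self.headline = headline
        self.user_id = user_id


post = JobPost("Python dev", ALICE_ID)
```

Caching per instance keeps the sketch simple; a real layer would also need batching and invalidation, which are not addressed here.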
Finally, when can APIs be retired? All API calls will need a logging mechanism to keep track of usage, and perhaps a rate limiter to prevent abuse. We could outsource the logger to Nginx, or we could have it as a decorator in the app, giving us more fine grained control over what data is logged.
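The in-app decorator option could look something like the sketch below. The names `track_usage`, `retirement_candidates`, `USAGE` and `LAST_CALLED` are hypothetical, and the counters live in process memory purely for illustration; a deployed version would need shared, persistent storage.

```python
import functools
import time
from collections import Counter

# In-memory usage records keyed by (endpoint, version).
USAGE = Counter()
LAST_CALLED = {}


def track_usage(endpoint, version):
    """Decorator that records each call to a versioned endpoint."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (endpoint, version)
            USAGE[key] += 1
            LAST_CALLED[key] = time.time()
            return func(*args, **kwargs)
        return wrapper
    return decorator


def retirement_candidates(registered, idle_seconds, now=None):
    """Return registered endpoints with no calls in the last `idle_seconds`.

    Endpoints that were never called count as idle.
    """
    now = time.time() if now is None else now
    return [key for key in registered
            if now - LAST_CALLED.get(key, 0) >= idle_seconds]


@track_usage("jobpost", "v1")
def jobpost_v1(post):
    return {"title": post["headline"]}


jobpost_v1({"headline": "Python dev"})
jobpost_v1({"headline": "Designer"})
```

Compared with parsing Nginx logs, a decorator can record application-level detail such as the API version or the calling client, at the cost of doing the bookkeeping inside the request.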
Checklist: