The Graph and the Query Node #463

Open
bedeho opened this issue Jan 23, 2020 · 1 comment

bedeho commented Jan 23, 2020

The Graph

Here is a summary of my understanding of The Graph; please correct any misunderstandings on my part:

The Graph is

  • a standard for specifying a GraphQL API, and associated WASM blockchain data processing routines called mappings, for maintaining the underlying data for this API.
    A particular instantiation of these two concepts is called a subgraph, so there would, for example, eventually be a Joystream subgraph. Currently this standard
    only covers Ethereum.

  • a set of tools, the centrepiece of which is a Rust based API serving node, which can load a subgraph dynamically.
    Currently, this tooling only works for Ethereum.

  • a future network of node operators which will operate infrastructure for different subgraphs. The key goal here
    is to incentivise these operators to provide quality service at scale, and also to provide honest query results.
    How this is to happen is yet to be resolved. All current uses of The Graph rely on a trusted operator, such as the
    DApp developer.

Using The Graph for our query node

There is a good chance that The Graph, both as a standard and the tools, is coming to Substrate.
The timeline for when anything production ready would be available is however very uncertain.

There are a number of plausible benefits of relying on The Graph, rather than rolling our own full-stack bespoke solution:

  • Better tooling: They are writing a high performance query node, and have a large team (15+) working on improving and maintaining it, as well as substantial community buy-in, even at an early stage. Our own solution is entirely bespoke,
    written largely in Python and TypeScript, and has much less surrounding tooling and documentation.

  • Outsourcing unresolved hard problems: There are some important hard problems that need to be resolved, such as how to deal with in-flight runtime upgrades, or how to authenticate the responses of the query node. There is a much greater chance
    that The Graph will solve these problems better than us, and even if not, we have other areas of focus which are worth trading off against investing in the query node.

  • Follow a standard: It will be easier for new developers in the Substrate ecosystem to contribute and improve our query infrastructure, if it follows some familiar standard. If The Graph comes to Substrate, many will adopt it, and thus there
    will be a larger pool of trained developers who can improve the query node at a lower barrier to entry.

  • Free features: Things like filtering, sorting, pagination and, in the future, aggregate functions with grouping are part of the well-designed framework, so you get them for free without any extra coding. We would have to replicate this in each query by hand, or
    at least replicate some reusable abstraction we can inject in our manually written query resolvers, such as The Graph has already done.
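
To make the last point concrete, here is a minimal sketch of the kind of reusable abstraction we would otherwise have to write and inject into each resolver ourselves. The argument names `first`, `skip`, `orderBy`, `orderDirection` and `where` are modelled on the ones The Graph generates; everything else (the `Row` and `QueryArgs` types, the `runQuery` helper, the `Member` entity and its fields) is hypothetical, and `where` is restricted to exact-match equality for brevity.

```typescript
// Assumption: entities are flat records of scalar fields.
type Scalar = string | number;
type Row = { [field: string]: Scalar };

// Hypothetical entity type, for illustration only.
type Member = { id: string; handle: string; registeredAtBlock: number };

// Generic query arguments, loosely modelled on the ones The Graph exposes.
interface QueryArgs {
  where?: { [field: string]: Scalar }; // exact-match filter on some fields
  orderBy?: string;                    // field to sort on
  orderDirection?: "asc" | "desc";     // defaults to ascending
  skip?: number;                       // pagination offset
  first?: number;                      // page size
}

// Applies filter, then sort, then pagination to an in-memory collection.
function runQuery<T extends Row>(rows: T[], args: QueryArgs): T[] {
  let result: T[] = rows.slice();

  const where = args.where;
  if (where !== undefined) {
    const keys = Object.keys(where);
    result = result.filter((row) => keys.every((k) => row[k] === where[k]));
  }

  const orderBy = args.orderBy;
  if (orderBy !== undefined) {
    const sign = args.orderDirection === "desc" ? -1 : 1;
    result.sort((a, b) => {
      const x: Scalar = a[orderBy];
      const y: Scalar = b[orderBy];
      if (x < y) return -sign;
      if (x > y) return sign;
      return 0;
    });
  }

  const skip = args.skip === undefined ? 0 : args.skip;
  const first = args.first === undefined ? result.length : args.first;
  return result.slice(skip, skip + first);
}

// Usage: second page of size 2, newest registrations first.
const members: Member[] = [
  { id: "1", handle: "alice", registeredAtBlock: 30 },
  { id: "2", handle: "bob", registeredAtBlock: 10 },
  { id: "3", handle: "carol", registeredAtBlock: 20 },
];
const page = runQuery(members, {
  orderBy: "registeredAtBlock",
  orderDirection: "desc",
  skip: 1,
  first: 2,
});
```

Every hand-written resolver would need to accept these arguments and call something like `runQuery`, which is exactly the work the framework already does for each entity type.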

Impositions of The Graph

These are the main design constraints we currently must respect in order to have our API and blockchain data processors maximally transferable to a future The Graph for Substrate.

  • Join-free queries: The Graph requires that each query targets exactly one entity type at the data layer. It accepts no user-defined type arguments, and it does not allow the developer to write query resolvers.
    There is an automatic query resolver supplied which simply looks up across instances of the single entity type in the data layer. This means that if we have a desired query which needs to do an implicit join across multiple different entities in the Substrate
    storage layer, then the entity type in The Graph must be this join product itself. Critically, even with this, we cannot replicate every conceivable join query at this stage, because aggregators are not currently ready.

  • Pure mappings: It appears that The Graph allows you to write mappings that are triggered by one of the following: contract calls, block arrival, or contract events. This means that each one of these must contain all relevant information to perform the
    required mapping. E.g. if a particular event occurs, the event parameters defined by the contract author must have included all information that is needed for the query node mapping author to figure out what side-effect this event will
    have on the set of entities in the API. This is not the case for many events that we have currently defined in the Substrate runtime. This has so far not been a problem in our own bespoke node, because Substrate events exposed by the Harvester
    will include information about the initial call that was part of triggering it, and together this has always been sufficient.

  • No filtering, sorting, pagination: This is not really a requirement per se; it's just that, if we try to add this by hand, we will be duplicating work we get for free. So perhaps the best approach is to only add this by hand if we
    absolutely need it for our UX in the interim.

  • Write mappings with AssemblyScript in mind: The Graph has tooling for compiling a subset of TypeScript down to WASM. We should write our data processors with this in mind, by sticking as closely as possible to the subset of TypeScript
    available in AssemblyScript.
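
A minimal sketch of how the first two constraints might shape a data processor. It is plain TypeScript written in a deliberately restricted style (classes, explicit types, no closures), but it has not been checked against the AssemblyScript compiler. Everything here is hypothetical: the `MemberRegistered` and `ChannelCreated` events and their fields, the `ChannelWithOwner` entity, and the in-memory maps standing in for the data layer. It also assumes that a handler can read back entities written by earlier handlers.

```typescript
// Hypothetical runtime events; names and fields are illustrative only.
class MemberRegistered {
  memberId: string;
  handle: string;
  constructor(memberId: string, handle: string) {
    this.memberId = memberId;
    this.handle = handle;
  }
}

class ChannelCreated {
  channelId: string;
  ownerId: string;
  title: string;
  constructor(channelId: string, ownerId: string, title: string) {
    this.channelId = channelId;
    this.ownerId = ownerId;
    this.title = title;
  }
}

// Denormalised entity: one row per channel with the owner's data copied in,
// so a single-entity query can return it without a join at query time.
class ChannelWithOwner {
  id: string;
  title: string;
  ownerId: string;
  ownerHandle: string;
  constructor(id: string, title: string, ownerId: string, ownerHandle: string) {
    this.id = id;
    this.title = title;
    this.ownerId = ownerId;
    this.ownerHandle = ownerHandle;
  }
}

// In-memory stand-ins for the data layer (an assumption, not a real store API).
const memberHandles: Map<string, string> = new Map<string, string>();
const channels: Map<string, ChannelWithOwner> = new Map<string, ChannelWithOwner>();

// Each handler depends only on the event it receives plus stored entities.
function handleMemberRegistered(event: MemberRegistered): void {
  memberHandles.set(event.memberId, event.handle);
}

function handleChannelCreated(event: ChannelCreated): void {
  const handle = memberHandles.get(event.ownerId);
  if (handle === undefined) {
    return; // unknown owner: nothing to join against, so skip the event
  }
  channels.set(
    event.channelId,
    new ChannelWithOwner(event.channelId, event.title, event.ownerId, handle)
  );
}

// Usage: replay two events in block order.
handleMemberRegistered(new MemberRegistered("m1", "alice"));
handleChannelCreated(new ChannelCreated("c1", "m1", "Alice's channel"));
```

The cost of storing the join product is that any later event which changes the copied data (for example a handle change) must also rewrite every row that holds a copy.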

Risks

  • The Graph may never arrive for Substrate, in which case the costs of respecting these constraints will never be made up for.

  • The Graph for Substrate may end up being materially different from the existing The Graph for Ethereum, in which case some of the listed constraints may be false, or there may be other new constraints we have not taken into account, all of which would raise the cost of the transition.

bedeho commented Mar 3, 2020

Clarification: It appears that event handlers do indeed allow you to recover the initial transaction, and corresponding payload, responsible for the event. This means that just processing events should be sufficient to construct state for any query we would like, so long as events supply all required information about side-effects. They need not copy over tx parameters.
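
A minimal sketch of what this would mean for a handler. The `OriginatingCall`, `ProposalCreated` and `Proposal` shapes below are hypothetical stand-ins, not The Graph's actual types: the event carries only the side-effect (a newly assigned id), and the handler reads the remaining parameters from the originating call attached to the event.

```typescript
// Hypothetical stand-in for the transaction that triggered an event.
class OriginatingCall {
  sender: string;
  args: Map<string, string>;
  constructor(sender: string, args: Map<string, string>) {
    this.sender = sender;
    this.args = args;
  }
}

// Hypothetical event: it reports the side-effect (the new id) and nothing else.
class ProposalCreated {
  proposalId: string;
  call: OriginatingCall;
  constructor(proposalId: string, call: OriginatingCall) {
    this.proposalId = proposalId;
    this.call = call;
  }
}

// Hypothetical entity served by the API.
class Proposal {
  id: string;
  proposer: string;
  title: string;
  constructor(id: string, proposer: string, title: string) {
    this.id = id;
    this.proposer = proposer;
    this.title = title;
  }
}

// The id comes from the event; proposer and title come from the call payload.
function handleProposalCreated(event: ProposalCreated): Proposal {
  const title = event.call.args.get("title");
  return new Proposal(
    event.proposalId,
    event.call.sender,
    title === undefined ? "" : title
  );
}

// Usage
const callArgs: Map<string, string> = new Map<string, string>();
callArgs.set("title", "Increase validator count");
const proposal = handleProposalCreated(
  new ProposalCreated("p7", new OriginatingCall("alice", callArgs))
);
```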

@bedeho bedeho transferred this issue from another repository May 1, 2020
Lezek123 referenced this issue in Lezek123/substrate-runtime-joystream May 21, 2020
Add Signal, Evict Storage Provider, and Spending Proposal Forms
@bedeho bedeho transferred this issue from Joystream/joystream Nov 27, 2021