Proposal for Minimal Relay #32
Staking, and Governance as the systems to work on first. A brief discussion on the factors involved
in each one:

#### Identity
To confirm, we will have a system parachain just for identity?
Have you considered merging identity for Kusama and Polkadot, like we have a single fellowship for both networks? I see no reason to have two sets of identities, one on each network.
This is a reasonable idea. The parachain is for Identity to get it off the Relay Chain. Of course with more advanced Coretime scheduling it's probably reasonable to have slower scheduling.
The reason to keep them separate for now is that Kusama is a nice dress rehearsal for Polkadot. But I have nothing against a second migration that deprecates the Kusama chain and does all identities through Polkadot's system.

## Unresolved Questions

There remain some implementation questions, like how to use balances for both Staking and
I don't think we need identity as its own chain. Then we have the question of how to determine whether a feature should live on its own parachain or coexist with another parachain, e.g. we could have the identity pallet also live on the governance parachain.
The current migration plan results in downtime, i.e. we have to freeze identity state, set up a new genesis, run the new chain, and only then is it up again. This is likely going to take weeks. Can we do better to reduce the downtime?
As you've said, a lot of the use of Identity is for reading. While there is downtime, it only affects updating identities and providing judgements. The service still exists for rendering identity info. We can try to coordinate upgrades/merging PRs, but IMO the complexity of a zero-downtime migration outweighs the benefits.
> I don't think we need identity as its own chain. Then we have the question of how to determine whether a feature should live on its own parachain or coexist with another parachain.
Also, deposits are much lower on parachains. Current identity with 21 DOT is very expensive for a lot of people. Having it on a parachain makes it much more accessible.
What if we use the current collectives chain as a hub for everything that's social/offchain related?
I think it is best to do the simple things first: migrate identity and governance, and leave the tricky parts to a future RFC. Otherwise we won't be able to accept this RFC until all the hard questions are answered, which would unnecessarily block the straightforward actions.
The goal of this RFC is not to answer all the "hard questions" and implementation details. It is to state a direction and objective: Identity, Staking, and Governance off the Relay Chain. Of course, for each one there will be different "hard questions" and implementation challenges. Part of the process will be figuring them out, but it's better for parachain and UI developers to know now that the architecture of this state/logic that they interact with is being rebuilt.
I guess we can signal the intention to move those features to system parachains. But without more feasibility study, we cannot say for sure we are going to move staking to a parachain. There is a good chance that it cannot happen without some major refactorings, and I would say it is too early to make such a decision at this stage.
Well this is Polkadot eating its own dog food. Yes it might (probably will) take some big refactorings. But we should not say, "it's hard to do X on a parachain but we're special so let's just put it on the Relay", and then expect others to do complicated things in parachains. If it requires refactorings, let's do them.
blockspace) to the network.

By minimising state transition logic on the Relay Chain by migrating it into "system chains" -- a
set of parachains that, with the Relay Chain, make up the Polkadot protocol -- the Polkadot
We should avoid mixing the general desire to take functionality off of the relay chain with an implementation which gives each function its own chain. We should consider offloading each of these components onto a single chain for better synchronous composability between them. Of all the components currently being bundled into system chains, the only one I see a strong case for having its own is staking, as nomination, elections, and slashing are quite heavy.
The underlying goals are to maximize secure blockspace and effectively use blockspace. Anything which requires less than a single full core will have to either use coretime less frequently (poor system UX) or will use the core constantly but far below its capabilities (resource misallocation).
Ideally, we place balances + staking + governance together on one parachain. We'd still handle era points on the relay chain perhaps, but not dots. We should figure out if this is possible though, aka how governance controls the relay chain, and how era points and slashing work.
Accumulate being quite flexible and permissionless is one of the two new ideas in corejam. Accumulate becomes much safer if accumulate cannot perform logic based upon dot balances, but only based upon authorization conditions set by the parachains themselves.
I'd envision nominations using a chain-forked pattern whatever we do, i.e. the staking chain spins off a nomination chain, to which validators add their opinions. This requires running parachain blocks longer.
The other of the two new ideas in corejam is to have blocks run a long time by saving their memory state. Although nice, we think this cannot really help with state migrations or npos, due to how memory accesses work. We can likely use other techniques to do longer running npos computations on parachains though, like simply allowing them to run longer.
Anyways, if we "eat our own dog food" by moving dots off the relay chain, then we make more time available for user defined accumulate functions, reduce the number of dumb things they do, and establish patterns for them to follow.
Anything that requires non-uniform work package execution times/weights seems impractical. It should be possible to extend computations over multiple blocks, by paging in the relevant data needed for each step of the algorithm at any point.
Yes, spinning off child chains/tasks for heavy computation such as elections or governance vote-tallying makes a lot of sense, especially since parachain execution is already stateless. The System product benefits a lot from having as many user-exposed functionalities as possible living in the same chain, especially when there is no clear impetus to scale horizontally.
Once two features are in the same chain, users and applications get very accustomed to their being in the same chain, which makes it difficult to split them into multiple chains. I understand that with today's usage, many of these subsystems could all go into a single system parachain. However, as usage and requirements increase (e.g. several thousand validators), it will hit a bottleneck and need to be split.
IMO it makes more sense to keep these subsystems in separate chains but take advantage of core scheduling to handle resource allocation. We could use one core to process staking, governance, and identity in round-robin fashion (perhaps prioritizing staking at certain times like session changes). You mention degraded system UX, but we're still talking 18-second instead of 6-second block times. When a single core no longer meets the execution needs of a subsystem, it's much easier and more agile to allocate a dedicated core to that subsystem than it is to launch a new chain and migrate all the data/logic to it (plus getting parachains to re-address their XCM programs, applications to target new chains, etc.).
RE spinning off child chains for heavy computations, yes this is reasonable but I think it's an optimization. The first priority is to get this work off the Relay Chain. Then we can optimize with things like child chains.
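To make the round-robin arithmetic concrete, here is a minimal sketch, assuming three chains take strict turns on one core and each relay-chain slot is 6 seconds. The function names are hypothetical and this is not how the coretime scheduler is actually implemented; it only shows where the 18-second figure comes from.

```python
from itertools import cycle, islice

RELAY_BLOCK_TIME_S = 6  # relay chain block time quoted in the thread


def round_robin_schedule(chains, slots):
    """Assign each relay-chain slot on one shared core to the next chain in turn."""
    return list(islice(cycle(chains), slots))


def effective_block_time(chains, relay_block_time_s=RELAY_BLOCK_TIME_S):
    """With strict turns, each chain gets one slot out of every len(chains)."""
    return len(chains) * relay_block_time_s


chains = ["staking", "governance", "identity"]
schedule = round_robin_schedule(chains, 6)
block_time = effective_block_time(chains)
print(schedule)
print(f"each chain produces a block every {block_time} s")
```

Prioritizing staking around session changes would mean replacing the strict cycle with a weighted one, which this sketch does not attempt.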
> You mention degraded system UX but we're still talking 18 second instead of 6 second block times
With the amount of traffic that an identity chain currently handles, it'd take weeks to make a full block. UX challenges for users coordinating across many chains notwithstanding, even wasting 1/3 of a core on identity is clearly trading off sharding + lowish latency over consolidation and low latency.
I see the case for being inefficient with coretime in the face of a coretime surplus, though I expect that the pressures under load would point back to consolidation. For me, the practical experience for end users of navigating many chains tips it in favor of consolidation now.
Full support of moving these pallets off of the relay chain.

However, state transitions on the Relay Chain need to be executed by _all_ validators. If any of
those state transitions can occur on parachains, then the resources of the complement of a single
backing group could be used to offer more cores. As in, they could be offering more coretime (a.k.a.
I have seen this argument crop up but I will comment to lay out the counterarguments:
1. This presumes that the resources spent on including user transactions in blocks can necessarily be spent on adding more cores.
2. If (1) holds, this presumes that work done on cores will necessarily be more valuable than work done directly on the relay chain.
re: (1) Current configurations specify resource requirements for cores (in data / compute) which are discrete, and this implies that there is a high likelihood of some leftover resources which are sufficient to execute user transactions but insufficient to add another core. Furthermore, the actual data burden on the network can be estimated by the sizes of PoVs placed onto cores in recent blocks.
re: (2) Synchronous composability is often more valuable than asynchronous.
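The "leftover resources" point in (1) is integer-division arithmetic. A minimal sketch with made-up numbers follows; the budget and per-core figures are hypothetical and not taken from any real network configuration.

```python
def cores_and_leftover(budget, per_core):
    """Split a resource budget into whole cores plus a remainder too small for another core."""
    if per_core <= 0:
        raise ValueError("per_core must be positive")
    return divmod(budget, per_core)


# Hypothetical units: 1000 units of validator capacity, 150 units required per core.
cores, leftover = cores_and_leftover(1000, 150)
print(f"{cores} cores, {leftover} units left over")
```

Under these assumed numbers, the remainder cannot fund another core but could still execute some transactions directly, which is the scenario the comment describes.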
At some level, I do think cores could usefully consume resources otherwise spent on user transactions. We do not necessarily have a straight line there, but it's still all bandwidth, CPU time, and disk space. The question is more: how many resources, really?
I think the synchronous vs asynchronous dichotomy winds up being largely illusory here. We do know some computations parallelize well and some badly, ala npos, but that's not really your dichotomy. All of web2 is built upon sharded data models aka what you call asynchronous. It's true that sharded data models require more thought, with this RFC being one example, but they're so dominant in web2 that they're probably not blocking adoption much.
I really cannot see any reasonable argument to intentionally put work on an obvious bottleneck point of the network which can be moved into a non-bottleneck point.
Synchronous composability specifically on the Relay-chain, whose entire point under CoreJam is to join async compute results and accumulate state machine transitions, does not sound sensible to me at all.
I've said it before and I'll say it again: The Relay-chain should do one thing and do it well. That means being the head of a secure decentralised multicore computer. Sticking an EVM smart contract in there and/or continuing to allow end-user-application-level transactions and general interaction goes directly against this mantra.
The point of the RFC is to get everyone on the same page regarding the future of the Relay-chain. This is nothing new and I gave several presentations mentioning all of this in 2019.
I don't believe that's true at this stage. We want the cost to be small, but unless someone makes a proposal explaining how we can make it small in the future, I won't accept it as an argument.
Care to expand on why you believe exactly having 4 system chains rather than 3 has such a large additional cost?
None of us can just state something without providing explanations, so here are some areas to check for the cost of operating a new chain:
In this specific case, the difference between 3 and 4 chains may not be big. However, if we choose one feature per system parachain, then we could have something like 8 of them. So we should evaluate the difference between 3 and 8 to help guide future decisions. I will say there is a lot more work required to allow multichain dApps to scale. We should also ask wallet teams, as they will have a good idea of the cost/overhead of supporting one more chain in their wallet.
I already have a branch largely configured, so close to zero. Anyway, the initial configuration takes a few hours. For testing, we have infrastructure ready to get this started on Rococo and Westend.
Yup, very little.
Testing is not a resource-constrained environment. Testing n^2 cases is not a big deal, still takes less time than running benchmarks. And they are all automated.
Very little, it's a pretty standard system parachain configuration.
Collator nodes are not expensive. I think the Infra bounty for them is something like 100-200 USD / month / collator.
Yes we do. This is easy. Few lines of code in the script. Almost zero overhead.
The goal of Polkadot (and the reason for shared security) was precisely that applications could span multiple chains. Apps need a solution for this and it has nothing to do with system chains.
I've mentioned this several times on the forum and I think there's a W3F Grants RFP for it (@Noc2 ?). Like the point above, yes it's needed but it's needed in general, for all 50+ paras on Polkadot, and really has not much to do with a single system chain. Hopefully this is more incentive to make it.
simply preventing state changes in the Relay Chain, using the Identity-related state as the genesis
for a new chain, and launching that new chain with the genesis and logic (pallet) needed.

Other subsystems cannot experience any downtime like this because they are essential to the
Co-existing will be quite tricky for staking, and we would likely have to do a one-off migration. The migration could happen within a single era, but even if not, having one extended "super era" is not too bad.
With a one-off migration, all the staking data is moved to the system chain. Some historical staking data will be kept on the RC for retroactive rewards and slashes, and will be removed after a maximum of 84 eras.
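The retention rule described above can be sketched as follows. This is a minimal illustration, assuming a window of 84 eras and assuming that an era ages out once it is 84 or more eras behind the current one; the exact boundary and the function names are assumptions, not the staking pallet's behaviour.

```python
HISTORY_DEPTH = 84  # eras of history kept on the Relay Chain, per the comment above


def prune_history(history, current_era, depth=HISTORY_DEPTH):
    """Keep only eras still inside the window for retroactive rewards and slashes."""
    return {era: data for era, data in history.items() if era > current_era - depth}


def can_remove_legacy_staking(last_relay_era, current_era, depth=HISTORY_DEPTH):
    """True once the last era recorded on the Relay Chain has aged out of the window."""
    return current_era - last_relay_era >= depth


# Hypothetical era numbers: the Relay Chain recorded eras 1000..1100 before migrating.
history = {era: f"rewards-for-{era}" for era in range(1000, 1101)}
kept = prune_history(history, current_era=1100)
print(len(kept), min(kept), max(kept))
print(can_remove_legacy_staking(last_relay_era=1100, current_era=1184))
```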
Is this due to mutual exclusivity of freezing on the Relay-chain and the Staking chain?
We discussed an alternative strategy for migrating to a staking chain: running two parallel staking systems for a while.
Roughly the way it would work is:
Staking system in the Relay chain works independently of Staking Chain. Relay Chain has a configuration that decides how many validators are provided from staking chain, the rest elected locally in the Relay Chain. We can then slowly scale up this number to 100% Staking Chain.
Validators who want to validate via Staking Chain, would need to unbond, teleport assets to Staking Chain, and set their intent to validate again (we could potentially improve this by allowing teleport of staked assets without unlocking). In the beginning it may be easy to get into the active validator set via Staking chain until we reach some equilibrium eventually getting to similar economic security as Relay Chain. Once we go to 100% from Staking Chain, at some point we can kick all validators/nominators who did not migrate.
The main objections to this approach were:
- Total stake for active validators (economic security) may drop while in transition phase.
- Validators have to move and re-acquire the approval score they gained on the Relay Chain.
- It might be harder to move all nominators voluntarily to Staking Chain.
We (weakly) reached consensus that a one-off migration might be simpler. We will need to migrate the staking ledgers and the staked balance to the Staking Chain and do this in one era (which can be an extended era). The era history can stay in the Relay Chain, allowing validators to claim historical rewards. Once 84 eras (history depth) have passed, we purge the historical data and clean up staking code in the Relay Chain.
Keen to hear if you have any thoughts/feedback on these approaches (this probably deserves its own RFC, but we do have an open issue where we have some more notes from our discussions so far).
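For reference, the parallel-staking alternative and its first objection can be sketched in a few lines. This is a toy model, assuming a "highest stake wins" selection in place of the real election and a percentage setting for the Staking Chain's share of seats; all names and numbers are hypothetical.

```python
def top_by_stake(candidates, seats):
    """Pick the highest-staked candidates; a stand-in for a real election."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:seats]


def select_validators(relay_candidates, staking_chain_candidates,
                      total_seats, staking_chain_percent):
    """Fill a configured share of seats from the Staking Chain, the rest from the Relay Chain."""
    if not 0 <= staking_chain_percent <= 100:
        raise ValueError("staking_chain_percent must be between 0 and 100")
    sc_seats = total_seats * staking_chain_percent // 100
    rc_seats = total_seats - sc_seats
    return (top_by_stake(staking_chain_candidates, sc_seats)
            + top_by_stake(relay_candidates, rc_seats))


def total_stake(validators):
    """Sum of stake behind the active set, used here as a proxy for economic security."""
    return sum(stake for _, stake in validators)


# Hypothetical (name, stake) pairs; few validators have moved to the Staking Chain yet.
relay = [("r1", 50), ("r2", 40), ("r3", 30), ("r4", 20)]
staking_chain = [("s1", 10), ("s2", 5)]

before = select_validators(relay, staking_chain, total_seats=4, staking_chain_percent=0)
during = select_validators(relay, staking_chain, total_seats=4, staking_chain_percent=50)
print(total_stake(before), total_stake(during))
```

With these made-up numbers, the stake behind the active set falls while the share is being scaled up, which is the economic-security concern listed first among the objections.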
This migration will require a lot of upfront communication. The less effort required on the validator and user side, the better. I believe that a one-off migration would be much easier to handle on the wallet side and the communication side. Wallets and partners (think CEXs etc.) should be prepared, and when the block hits, they'll be able to switch to displaying/using the staking chain. With a gradual move, however, I can see a period when users don't know if they are on the RC or the SC, and UIs would have to show and handle both, which would likely be very confusing.
I wholeheartedly agree with a one-off migration approach, in both this case and as a general pattern for moving systems between chains or performing a large complex upgrade.
We should obviously not require that users change wallet software to make new accounts, transfer, stake, etc. That's a big no no.
We need domain separation material in signatures of course, but the exact semantics of that domain separation material can evolve if required. If the new chain is the staking chain, then it can have a "virtual" genesis hash of the original relay chain or whatever. We don't know how to have multiple staking chains anyway.
I guess governance could be more nuanced.
As for migration, we could have the staking chain "fork" off from the relay chain at block height x, which freezes transactions on the relay chain, then the staking chain runs a migration pallet, and finally the staking chain unfreezes its own transactions. Anyway, a chain "fork" operation should not be expensive and should be useful elsewhere.
We'll want a bunch of O(1) drops here, so those should be implemented first. We might clean up the state a bit first too, like fixing the slashing spans system, and dropping that old state. As a backup, we provide a limited recovery sudo which kills the staking chain and unfreezes the relay chain accounts.
It's actually kinda unclear if it's easier to migrate once to separate governance and staking chains, or to migrate twice: once to a separate combined chain, and then split off governance later. The migrations involve somewhat different activities.
> As for migration, we could've the staking chain "fork" off from the relay chain at block height x, which freezes transactions on the relay chain, then the staking chain runs a migration pallet, and finally the staking chain unfreezes its own transactions. Anyways a chain "fork" operation should not be expensive and should be useful elsewhere.
What we really need to move is staking pallet state and part of user balance that is staked. Unstaked balances should eventually end up in AssetHub once we get rid of balances from relay chain (and should be part of separate migration I think).
Forking off relay chain actually sounds like a great idea. We can then have a specialised migration pallet that cleans/translates the data that we need, drop what we don't need. Relay chain will freeze all staking related transactions and kill the state that now belongs to staking chain once it knows the staking chain is operational. Does that make sense? This could be a template for similar migrations in future.
@kianenigma had the opposite idea to prepare data that we need to migrate, calculate the state root and securely send it to staking chain (root track or xcm), and then have a bot that fills the storage on staking chain through permissionless calls. I guess though forking off would be simpler?
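To illustrate the "commit to a root, then fill storage permissionlessly" idea, here is a minimal sketch. It assumes a plain binary Merkle tree over SHA-256 rather than the trie Substrate actually uses, and every name in it (`StakingChainStorage`, `fill`, the sample keys) is hypothetical. The point is only that anyone can submit entries, because each one is checked against the root received over a trusted channel.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Length-prefix the key so the key/value boundary is unambiguous.
    return _h(b"\x00" + len(key).to_bytes(4, "big") + key + value)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _next_level(level):
    return [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root over a non-empty list of leaf hashes; odd levels duplicate the last node."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf to root, each tagged with whether it sits on the left."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(root, key, value, proof):
    acc = leaf_hash(key, value)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


class StakingChainStorage:
    """Accepts entries from anyone, but only if they match the committed root."""

    def __init__(self, root):
        self.root = root
        self.storage = {}

    def fill(self, key, value, proof):
        if not verify(self.root, key, value, proof):
            return False
        self.storage[key] = value
        return True


# Relay Chain side: prepare the data to migrate and compute the root to send.
entries = sorted({b"ledger/alice": b"100", b"ledger/bob": b"250", b"ledger/carol": b"75"}.items())
leaves = [leaf_hash(k, v) for k, v in entries]
root = merkle_root(leaves)

# Staking Chain side: a bot fills storage through permissionless calls.
chain = StakingChainStorage(root)
results = [chain.fill(k, v, merkle_proof(leaves, i)) for i, (k, v) in enumerate(entries)]
tampered_ok = chain.fill(b"ledger/alice", b"999999", merkle_proof(leaves, 0))
print(results, tampered_ok)
```

A fork-based migration avoids this proof machinery entirely, which may be part of why it looks simpler.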
/rfc propose
Hey @joepetrowski, here is a link you can use to create the referendum aiming to approve this RFC number 0032. Instructions
It is based on commit hash 87ab0f1be5bafd404e7a1e6f465db027b5271ecd. The proposed remark text is:
@joepetrowski @gavofyork I see that you added in a new section on Kusama, to practice migration. I was hoping Kusama could be a place for CoreJam + Coreplay experimentation with fewer system chains, a place for Team Scruffy, but it's more like Team Neat needs to abort Team Scruffy at birth! I think your coercing #32 into Kusama will deal a blow to the Polkadot/Kusama ecosystem, or at least damn Kusama to be nothing more than a "testnet" -- it's too big of a bet on the ergonomics / UX all coming together too quickly. How can we have separate #32 referendums for Polkadot (for scalability) and for Kusama (for usability)? I'd like to have a situation where Kusama can live to be MORE than a testnet, where developers have a nice "simple" place to work with the new 2.0 CoreJam + Coreplay programming model, including Availability patterns. If having Kusama be more than a testnet is impossible (because Rococo + Chopsticks "practice" migration is insufficient), I would like to see a THIRD production network, say the "CoreTime network", a production network (relay chain) dedicated to improving 2.0 CoreJam + Coreplay usability for new devs with lower complexity than its Big Brother Polkadot. Can you chart a course?
The section on Kusama was added insofar as to state how it can be useful in its capacity as canary net in achieving the goals set out in this RFC. Adding a vision and roadmap for Kusama development is way out of scope for this RFC. When CoreJam comes, and it's probably approximately a year to have it in a "Kusama ready" state, I'm sure Kusama will again play a leading role on the frontier.
I will just speak my thoughts here, hoping to come from a constructive place. I spiritually agree with this proposal, and I am inclined just to vote AYE. At the same time, I certainly know that a single document with < 300 lines cannot capture the underlying complexity, challenges, and decisions that will need to be made once this is actually underway. So I kind of want to understand what we are voting for here. If the vote is to establish direction, then I am a strong AYE. However, I guess I would also expect that different parts of the actual migration process will also manifest as RFCs to be approved and so on. Perhaps one system chain at a time, where even chains like Staking may take multiple RFCs and upgrades to actually get to a final envisioned state. Is that an accurate picture of what this vote is about, or is there some leniency we are supposed to give on how exactly this is all accomplished?
I'd think the vote says "Kusama should eventually do this unless we hit real obstructions". If we later discover it sucks on Kusama despite our best efforts, then yeah we might do something slightly different, although more likely we tweak functionality. The vote should not say "we're going to push this through quickly". In particular, we want whatever pattern of migrations happens here, like parachain forking, to be useful for customer parachains when they want to buy multiple slots, without using elastic scaling. We should probably not do this until we can explain what we're doing to customer parachain teams.
@shawntabrizi yes this is to establish direction. Regarding each subsystem:
I don't think it's entirely clear yet what should be an RFC and what not, though. Of course, changes to the core protocol should be in RFC format. A step-by-step migration plan for Identity? I'm not sure an RFC is more valuable than a detailed GitHub tracking issue.
@joepetrowski Got it! I can see the level of depth of thought given to the ergonomic/UX considerations voiced by @rphmeier and engineering concerns raised by @xlc, my own RFC #33. I closed #33 in favor of this stand in of a roadmap -- there is no path for Kusama worth developing. Thank you for setting up the foundations so clearly.
I share a similar feeling with @shawntabrizi and that's why I haven't voted. I can see this establishes a general direction for relay chain and system parachain work, which is indeed useful to have, and I agree with it. However, there are not enough technical details to follow, so we need additional RFCs to discuss the exact actions. And we are likely not to follow this RFC if some unexpected obstacle is discovered in the future. Quoting myself from a comment on another proposal
and Gav's one
So my real question is, how much technical detail is required for an RFC? Are we OK with a high-level proposal like this one, or something more generic, like "we should optimize the performance of the relay chain by trying to do X, Y, and Z"?
Perhaps we should have different kinds of RFCs like: In any case, since this is a "direction" based RFC, and I agree with the direction, I will vote AYE.
/rfc process
Please provide a block hash where the referendum confirmation event is to be found.
/rfc process 0xa29d08c3030bf527de0fadef3f658ea1e1d3200c4a7530038d693cbe97cab7f3
The on-chain referendum has approved the RFC.
This RFC proposes a direction and prioritization for migrating core functionality off the Relay Chain. The focus is on Identity, Staking, and Governance. The RFC includes a brief discussion on the challenges associated with each one and the most probable migration/implementation path.