Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TOP pool synchronization #2211

Merged
merged 36 commits into from
Nov 24, 2023
Merged

Conversation

kziemianek
Copy link
Member

@kziemianek kziemianek commented Oct 24, 2023

Context

resolves: #1935
resolves: P-100

This PR attempts to add TOP pool synchronization through direct trusted rpc server. Trusted calls should be eventually broadcasted to other registered workers and optimistically picked and executed by next block producer.

Changes:

  • author_submitAndWatchExtrinsic JSON-RPC method broadcasts trusted call to other peers
  • adds author_submitAndWatchBroadcastedExtrinsic JSON-RPC method which acts as author_submitAndWatchExtrinsic but doesn't broadcast trusted calls
  • adds TrustedOperationStatus::TopExecuted which is fired upon TOP execution (breaking change)
  • adds ocall get_trusted_peers_urls in order pull trusted peers urls from worker service to enclave runtime

animated

@kziemianek kziemianek added the C0-breaking Breaking change that will cause existing client to break label Oct 24, 2023
@linear
Copy link

linear bot commented Oct 24, 2023

P-100 Linear processing of DI requests

Context

Following up on the discussion in #1920 (comment)

Imagine there are 100 workers and we send a request to worker0, worker0 happens to just finish the its block production so the newly submitted request won't be included in the block for now. Since we have an aura authorship mechanism, the request can be processed after 100 blocks at the earliest, this sounds terrifying.

The root cause is the tx pool of sidechain is not synchronised among nodes, so a DI request can only be stored and processed by the node, to which the request was submitted.

It's a critical issue for multi-worker setup, but I don't think we have time to solve it soon, hence the medium prior.


✔️ Please set appropriate labels and assignees if applicable.

@kziemianek kziemianek requested a review from a team October 28, 2023 01:23
Copy link
Collaborator

@Kailai-Wang Kailai-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't completed the review yet - only took a rough look. Will go into more details later

The animation is well made btw!

tee-worker/core/peer-top-broadcaster/src/lib.rs Outdated Show resolved Hide resolved
tee-worker/core-primitives/top-pool/src/watcher.rs Outdated Show resolved Hide resolved
tee-worker/core/direct-rpc-client/src/lib.rs Show resolved Hide resolved
@Kailai-Wang Kailai-Wang requested a review from a team November 6, 2023 00:05
Copy link
Collaborator

@Kailai-Wang Kailai-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, nicely done!

tee-worker/core/direct-rpc-client/src/lib.rs Outdated Show resolved Hide resolved
tee-worker/core/direct-rpc-client/src/lib.rs Outdated Show resolved Hide resolved
tee-worker/core/direct-rpc-client/src/lib.rs Show resolved Hide resolved
tee-worker/core/direct-rpc-client/src/lib.rs Show resolved Hide resolved
Copy link

@clangenb clangenb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, we definitely want to gossip the top pool. I had a superficial look and this PR looks very nice and clean!

I noticed that you gossip the executed TOP to peers, but this is probably not the mechanism to remove it from the synchronized pool? It is done via the sidechain block import, isn't it?

Maybe @brenzi also wants to chime in here.

@kziemianek
Copy link
Member Author

Hey, we definitely want to gossip the top pool. I had a superficial look and this PR looks very nice and clean!

I noticed that you gossip the executed TOP to peers, but this is probably not the mechanism to remove it from the synchronized pool? It is done via the sidechain block import, isn't it?

Maybe @brenzi also wants to chime in here.

Thanks @clangenb for your time and taking a look at this 👍🏼 .

Yes, this change doesn't affect how TOP pool is managed (when TOPs are added or removed). It basically only gossips trusted calls to other known peers and routes back selected responses to the caller. All the rest remains the same.

RPC client could submit trusted call to all peers in the network separately and the efect would be pretty the same.

@brenzi
Copy link

brenzi commented Nov 16, 2023

I think you should take care not to gossip getters or inherents.

Top pool holds tops (TrustedOperation) which can be getters, indirect or direct TrustedCalls

Also, a trusted call can be an inherent/intrinsic(signed by enclave itself) i.e. shield_funds(). These could be gossipped in principle, but we need to think about edge cases

@kziemianek
Copy link
Member Author

kziemianek commented Nov 22, 2023

I think you should take care not to gossip getters or inherents.

Top pool holds tops (TrustedOperation) which can be getters, indirect or direct TrustedCalls

Also, a trusted call can be an inherent/intrinsic(signed by enclave itself) i.e. shield_funds(). These could be gossipped in principle, but we need to think about edge cases

Hi @brenzi, thanks for your advices, here are my thoughts:

  • Only trusted operations submitted through json-rpc method author_submitAndWatch* are gossiped - this means that indirect calls submitted by indirect executor directly to AuthorApi wouldn't be gossiped
  • Only succesfully submitted trusted operations are gossiped. This means that as long as we have correct Author's filter (CallsOnlyFilter) applied getters wouldn't be gossiped.

We can add filtering logic to DirectRpcBroadcaster similar to the one in Author but that would require going through the whole ceremony of decrypting/decoding or move gossiping to Author - might be worth to make sure only direct calls are broadcasted. What do you think @Kailai-Wang ?

@kziemianek
Copy link
Member Author

kziemianek commented Nov 24, 2023

I've moved broadcasting triggering point from json-rpc method to Author, it gives following benefits:

  • broadcasting is encapsulated in Author so other components don't need to be aware about how broadcasting should be performed
  • possibility to filter TrustedCalls so only direct calls are broadcasted

@Kailai-Wang
Copy link
Collaborator

I've moved broadcasting triggering point from json-rpc method to Author, it gives following benefits:

  • broadcasting is encapsulated in Author so other components don't need to be aware about how broadcasting should be performed
  • possibility to filter TrustedCalls so only direct calls are broadcasted

Thank you - I like the incorporation of BroadcastedRequest

And yes I don't think we need to broadcast all kinds of trusted operations:

  • direct trusted call => sure
  • trusted getters => theoretically we should be able to submit getters via author_submitAndWatch* but practically we don't do that. Maybe we should simply disallow submitting getters from this rpc method - or apply filtering. Anyways it makes little sense to gossip it
  • indirect calls => we don't need to gossip it, all workers will get them by syncing the parachain

So I'd say for now we only broadcast direct trusted calls.

Copy link
Collaborator

@Kailai-Wang Kailai-Wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, let's merge it.

BroadcastedTopFilter is passed in as DirectCallsOnlyFilter to initialisation AFAIU, isn't it

@kziemianek
Copy link
Member Author

Looks good, let's merge it.

BroadcastedTopFilter is passed in as DirectCallsOnlyFilter to initialisation AFAIU, isn't it

Yes, this will ensure only direct calls are broadcasted to peers.

@kziemianek
Copy link
Member Author

I've moved broadcasting triggering point from json-rpc method to Author, it gives following benefits:

  • broadcasting is encapsulated in Author so other components don't need to be aware about how broadcasting should be performed
  • possibility to filter TrustedCalls so only direct calls are broadcasted

Thank you - I like the incorporation of BroadcastedRequest

And yes I don't think we need to broadcast all kinds of trusted operations:

  • direct trusted call => sure
  • trusted getters => theoretically we should be able to submit getters via author_submitAndWatch* but practically we don't do that. Maybe we should simply disallow submitting getters from this rpc method - or apply filtering. Anyways it makes little sense to gossip it
  • indirect calls => we don't need to gossip it, all workers will get them by syncing the parachain

So I'd say for now we only broadcast direct trusted calls.

Trusted getters are not added to TOP pool due to filters applied :

#[cfg(feature = "sidechain")]
pub type AuthorTopFilter = crate::top_filter::CallsOnlyFilter;

@kziemianek kziemianek merged commit a486e78 into dev Nov 24, 2023
25 of 26 checks passed
@kziemianek kziemianek deleted the p-100-linear-processing-of-di-requests branch December 4, 2023 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C0-breaking Breaking change that will cause existing client to break
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Linear processing of DI requests
5 participants