Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(engine): proof fetching on state update for StateRootTask #12458

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

fgimenez
Copy link
Member

@fgimenez fgimenez commented Nov 11, 2024

Towards #12053
Requires #12378 #12467 #12510

@fgimenez fgimenez added C-perf A change motivated by improving speed, memory usage or disk footprint A-blockchain-tree Related to sidechains, reorgs and pending blocks A-trie Related to Merkle Patricia Trie implementation labels Nov 11, 2024
@fgimenez fgimenez force-pushed the fgimenez/srt-sync-async-impls branch from d0f48d5 to fdb55e9 Compare November 12, 2024 15:34
@fgimenez fgimenez force-pushed the fgimenez/srt-sync-async-impls branch from fdb55e9 to 5d2048f Compare November 13, 2024 13:34
Base automatically changed from fgimenez/srt-sync-async-impls to main November 18, 2024 14:19
@fgimenez fgimenez marked this pull request as ready for review November 18, 2024 15:50
@@ -237,6 +241,7 @@ impl From<ParallelStateRootError> for ProviderError {
ParallelStateRootError::StorageRoot(StorageRootError::Database(error)) => {
Self::Database(error)
}
ParallelStateRootError::Other(other) => Self::Database(DatabaseError::Other(other)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be casted to database error. out of scope for this PR, but we have to fix this at some point ^^

@fgimenez fgimenez force-pushed the fgimenez/srt-proof-fetching branch 4 times, most recently from 52e1366 to 2ffe7c8 Compare November 21, 2024 09:43
crates/engine/tree/src/tree/root.rs Outdated Show resolved Hide resolved

pending_proofs.push_back(rx);

state.extend(hashed_state_update);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this extends the state immediately and that's what is used to perform state root updates, but this might result in invalid cached trie nodes. not sure yet, but worth double checking @fgimenez

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few questions

Comment on lines 222 to 223
// try to receive state updates without blocking
match self.state_stream.rx.try_recv() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is now a busy loop, right?
ideally we can use just recv here because then this can yield

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, if we recv here we would be blocking until all the state updates are received, ie. until the block is executed, and the benefits of the task would be diminished? the proof calculation for each state root change would be triggered, but we would not trigger the state root calculation from proofs.

Comment on lines 305 to 314
rayon::spawn(move || {
let result = calculate_state_root_from_proofs(
view,
&input_nodes_sorted,
&input_state_sorted,
multiproof,
state,
);
let _ = tx.send(result);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this spawns a new rayon job that then also branches out via par_iter
all of this looks very IO heavy

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that's the case

}

fn run(mut self) -> StateRootResult {
let mut task_state = StateRootTaskState::Idle(MultiProof::default(), B256::default());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not super easy to follow how state management is done here

looks like we're sending a few things around and need to await things from different event variants from different sources, hence the try_recv to not block indefinitely on one kind of receiver?

I assume we can make this work with only one channel if we unify the message variants into one single enum?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like we're sending a few things around and need to await things from different event variants from different sources, hence the try_recv to not block indefinitely on one kind of receiver?

yes, exactly

I assume we can make this work with only one channel if we unify the message variants into one single enum?

yep sounds good, will prepare changes for this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok done, the main changes are:

  • we have now a StateRootMessage enum with all the message variants for state change inside the task.
  • the task creates an internal channel for sending and receive messages of type StateRootMessage
  • the task also creates a background thread that just receives items from the input state stream, wraps them into the correspondent StateRootMessage variant and forwards them to the internal StateRootMessage channel. It calls recv in the external channel receiver, so that it yields
  • run now calls recv in the internal StateRootMessage receiver (yielding) and processes each message
  • there's a separate ProofSequencer type to manage proofs in order.

@mattsse ptal, still not tested on the mainnet machine, but local checks are looking good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-blockchain-tree Related to sidechains, reorgs and pending blocks A-trie Related to Merkle Patricia Trie implementation C-perf A change motivated by improving speed, memory usage or disk footprint
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants