Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send and Aggregate RestartHeaviestFork. #699

Conversation

wen-coding
Copy link

Push and Aggregate RestartHeaviestFork.

@wen-coding wen-coding self-assigned this Apr 9, 2024
@codecov-commenter
Copy link

codecov-commenter commented Apr 10, 2024

Codecov Report

Attention: Patch coverage is 94.14141% with 29 lines in your changes are missing coverage. Please review.

Project coverage is 82.1%. Comparing base (8e331e1) to head (e10467f).
Report is 9 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #699     +/-   ##
=========================================
+ Coverage    82.0%    82.1%   +0.1%     
=========================================
  Files         860      868      +8     
  Lines      232898   234084   +1186     
=========================================
+ Hits       191071   192365   +1294     
+ Misses      41827    41719    -108     

wen-restart/proto/wen_restart.proto Outdated Show resolved Hide resolved
wen-restart/proto/wen_restart.proto Outdated Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Outdated Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Outdated Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Outdated Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Outdated Show resolved Hide resolved
);
return None;
}
self.active_peers.insert(*from);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we should move this as close to the insertion into the block_stake_map as possible, ideally a separate function where these two are done atomically I think so that we can't do one without the other

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved update of active_peers to later in the function. Right now all operations of heaviest_fork_aggregate.rs are done in a single thread loop in wen_restart, same applies to last_voted_fork_slots_aggregate.rs. That's why we didn't need any locks in here. Otherwise I think moving them into a separate function is not enough, we also need to add locks around them.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why we didn't need any locks in here. Otherwise I think moving them into a separate function is not enough, we also need to add locks around them.

Oh I was less concerned about locking/other threads modifying it, and more concerned about code maintainability.

I could easily imagine we might introduce a change that causes an early exit after the active_peers.insert and before we update the block_stake_map, which would lead to an inconsistency

wen-restart/src/heaviest_fork_aggregate.rs Show resolved Hide resolved
progress_changed = true;
}
if progress_changed {
write_wen_restart_records(wen_restart_path, progress)?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessarily expensive to write the entire record every time instead of just the diff.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this content in the file is a snapshot of our progress, it is a bit expensive, but if we change to Gossip changes every 30 minutes maybe the cost is okay?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the code to write snapshot and send out RestartHeaviestFork every 30 minutes. The downside of writing snapshots every 30 minutes only is that your advertised observed_stake might go back, if somehow some Gossip messages are lost in between restarts.

Copy link

@carllin carllin Apr 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it written on every update from every potential validator update? So potentially could be hundreds of rewrites in the worst case within a 30 minute period

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I see, it was wrapped around in the 30 minute update change

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this avoids a bunch of writes, my gut feeling is it would be useful to log or write out every individual validator's heaviest fork for debugging purposes in case we ever get stuck

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added info log.

@wen-coding wen-coding requested a review from carllin April 18, 2024 17:34
if total_active_stake_agreed_with_me as f64 * 100.0 / total_stake as f64
> wait_for_supermajority_threshold_percent as f64
> (wait_for_supermajority_threshold_percent as f64 - 5.0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5.0 Should be a constant we define

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

wen-restart/src/wen_restart.rs Outdated Show resolved Hide resolved
wen-restart/src/wen_restart.rs Outdated Show resolved Hide resolved
wen-restart/src/heaviest_fork_aggregate.rs Outdated Show resolved Hide resolved
Comment on lines 484 to 486
* (wait_for_supermajority_threshold_percent as f64
- HEAVIEST_FORK_DISAGREE_THRESHOLD_PERCENT))
.round() as u64;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just reuse adjusted_threshold_percent here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@wen-coding wen-coding requested a review from carllin April 25, 2024 16:00
carllin
carllin previously approved these changes Apr 25, 2024
@wen-coding wen-coding merged commit 01c4b03 into anza-xyz:master Apr 26, 2024
38 checks passed
@wen-coding wen-coding deleted the wen_restart_push_and_aggregate_heaviest_fork branch April 26, 2024 06:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants