-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send and Aggregate RestartHeaviestFork. #699
Send and Aggregate RestartHeaviestFork. #699
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #699 +/- ##
=========================================
+ Coverage 82.0% 82.1% +0.1%
=========================================
Files 860 868 +8
Lines 232898 234084 +1186
=========================================
+ Hits 191071 192365 +1294
+ Misses 41827 41719 -108 |
stake of all the validators which sent me HeaviestFork.
); | ||
return None; | ||
} | ||
self.active_peers.insert(*from); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we should move this as close to the insertion into the block_stake_map
as possible, ideally a separate function where these two are done atomically I think so that we can't do one without the other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved update of active_peers to later in the function. Right now all operations of heaviest_fork_aggregate.rs are done in a single thread loop in wen_restart, same applies to last_voted_fork_slots_aggregate.rs. That's why we didn't need any locks in here. Otherwise I think moving them into a separate function is not enough, we also need to add locks around them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why we didn't need any locks in here. Otherwise I think moving them into a separate function is not enough, we also need to add locks around them.
Oh I was less concerned about locking/other threads modifying it, and more concerned about code maintainability.
I could easily imagine we might introduce a change that causes an early exit after the active_peers.insert
and before we update the block_stake_map
, which would lead to an inconsistency
wen-restart/src/wen_restart.rs
Outdated
progress_changed = true; | ||
} | ||
if progress_changed { | ||
write_wen_restart_records(wen_restart_path, progress)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessarily expensive to write the entire record every time instead of just the diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this content in the file is a snapshot of our progress, it is a bit expensive, but if we change to Gossip changes every 30 minutes maybe the cost is okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the code to write snapshot and send out RestartHeaviestFork every 30 minutes. The downside of writing snapshots every 30 minutes only is that your advertised observed_stake might go back, if somehow some Gossip messages are lost in between restarts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it written on every update from every potential validator update? So potentially could be hundreds of rewrites in the worst case within a 30 minute period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, I see, it was wrapped around in the 30 minute update change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this avoids a bunch of writes, my gut feeling is it would be useful to log or write out every individual validator's heaviest fork for debugging purposes in case we ever get stuck
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added info log.
(slot, hash) chosen by us is not the majority choice.
wen-restart/src/wen_restart.rs
Outdated
if total_active_stake_agreed_with_me as f64 * 100.0 / total_stake as f64 | ||
> wait_for_supermajority_threshold_percent as f64 | ||
> (wait_for_supermajority_threshold_percent as f64 - 5.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5.0
Should be a constant we define
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
wen-restart/src/wen_restart.rs
Outdated
* (wait_for_supermajority_threshold_percent as f64 | ||
- HEAVIEST_FORK_DISAGREE_THRESHOLD_PERCENT)) | ||
.round() as u64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just reuse adjusted_threshold_percent
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Push and Aggregate RestartHeaviestFork.