Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIMD-0046: Optimistic cluster restart automation #46
base: main
Are you sure you want to change the base?
SIMD-0046: Optimistic cluster restart automation #46
Changes from 16 commits
2e99b5b
03277f1
7b71cad
996d1ab
de72626
8f32e6c
4feeb64
0aff4cd
e29d83c
b0b2d47
838429d
1049463
6e4a5cd
8cb6ef6
df5932d
5050a7c
90134b5
85f62b4
eafd745
ecccadf
e143136
57b3b16
4b99230
198e742
7bd9b74
a44eeff
6147af1
bffcd1d
ebc0cec
dc9209f
eefa087
4145325
6aeda83
78c0e5b
a938546
63d3252
ebd1935
deee8ec
9f013a0
5420a17
323ee80
3b3ef2d
87813b8
5b69c78
fac8526
351e675
b130ee3
3234699
76f63bb
18b3d87
813c2cf
2c21911
3fd02f1
a9447b4
e913703
eb359ac
82fcd22
192b01c
c4d3e3e
8a9990d
3167a08
21878c8
879e92d
5b58b8a
d817520
fdc7534
6b2a0b2
6fcc5cf
7614587
ba1c9d4
2dffa19
5635ad3
7adb22b
16e4ec8
3fa3b9a
b6fc273
005caae
5fcbcd1
8f7f752
50d5bcb
397e98b
6190f35
e4e8d84
51e81d9
ac0940f
3805d7a
0466a12
6613293
dd60570
b415733
acb041f
d85ce34
07cbb0c
eb99eac
30a4d38
28373b0
9558fcb
1e9ea74
4ceebbb
f0d933c
821456d
891d5ea
613384d
75306bd
bafaac5
f65c6aa
31c358e
bf5529f
6fb1cf7
9baa635
4f5469c
22dfc57
560dd4d
683a43a
a722bc6
4e531bb
4ac10f2
cfbcb5b
eb81566
a9ac86c
b8185eb
fac38b6
7c9d098
79b3be7
ab33197
331fe01
d22ba4a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80%
is an implementation detail. it would be better presented as a variable with numerical value declared once, in case there is discussion to be had about the chosen value. this is an issue throughout the documentThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I added RESTART_STAKE_THRESHOLD in the terminologies section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the static "80%" strings throughout some other variable or do they just need to be replaced still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I changed what should be changed, the rest are concrete numbers in examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this heading makes me think the section should be a numbered list with a more general heading along the lines "repair ledgers up to the restart slot"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bah! looks like md formatting isn't helping us much here. just renders as a giant wall of text. gonna have to switch to manual mode i think...
something like this will be a little easier on the eyes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this calculation.
What if the other
100% - 42% = 58%
pick some other block?Why should the minority
42%
block be optimistically confirmed?Why should ever a block with less than
67%
vote be optimistically confirmed?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to prevent false negative (if a slot was oc'ed before the restart, you must pick it here), not to prevent false positive (it's okay if we pick a slot here which isn't oc'ed). Because when we select Heaviest later we should see the competing fork and count the votes accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prolly add the motivation and justification for these values to the document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly here, I am not sure why it is safe to go below 67%?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to prevent false negative at all costs and it's okay to have false positive. Let's say X is the first block having only 62% but not 67%, we know if 75% of the validators decide to pick this fork, it will be instantly oc'ed and we won't kick another oc'ed slot out. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly, add the motivation and justification in the doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's give this message a more descriptive name.
all of these messages should probably have a common prefix as well.
so maybe something like
ClusterRestartHeaviestCandidateFork
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems really long, and I'm wondering whether we can later use these messages outside restart. For now I'm just renaming Heaviest to HeaviestFork
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo this needs to have cluster-restart context in the name. remember the gossip is general and used for much more than this special purpose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to RestartHeaviestFork.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't see what this ten minute wait is gaining us? if there is sufficient gossip message propagation delay as to destabilize the restart, we're just going to have the same instability, only ten minutes later.
assuming the intention is to elide any gossip propagation delay, perhaps it would be more effective to synchronize on wall time by breaking it into five minute intervals and waiting for the next full interval to expire? that way we're waiting somewhere between
9m59s
and5m00s
, starting within the max skew of the restart set's system clocks.i believe we already have some excessive wallclock skew warnings built into gossip (or elsewhere in the startup procedure?), so hopefully it's been corrected for all nodes and not actually worse than any gossip propagation delay 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid the situation where 80saw80 isn't propagated to all validators in restart before Gossip peers all go into restart mode. Because currently CRDS_GOSSIP_PUSH_MSG_TIMEOUT_MS is 30 seconds, so after 80% received Heaviest, the message takes at most 30 seconds to propagate to everyone. I think waiting for 2 minutes should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my point was that using a static delay doesn't do anything to improve synchronization because the nodes will all start and end that sleep out of sync by the gossip network delay. the suggestion to delay until fixed wall clock intervals instead of a static delay will improve the situation so long as wall clock skew across the cluster is tighter than the gossip network delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's good point, changed. Also the interval is 2 minutes now.