Fix sequencer drain in challenging situations #2673
Conversation
Test Results: 7 files ±0, 7 suites ±0, 3m 25s ⏱️ (-52s). Results for commit 92a6966, compared against base commit bb09c3d. This pull request removes 2 tests.
♻️ This comment has been updated with latest results.
Thanks a lot for fixing this problem so quickly @AhmedSoliman. It would have cost me quite some sleep not knowing what was causing the stuck situation. The changes look good to me. +1 for merging after fixing the failing bifrost_append_and_seal_concurrent test. From quickly skimming over it, it seems that the test assumes that Bifrost::append calls cannot fail (they should probably tolerate Error::Shutdown now).
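To make the point concrete, here is a minimal, self-contained sketch of how an append site in such a test could tolerate a shutdown error instead of unwrapping it. The types and names are hypothetical, not the real Bifrost API.

```rust
// Hypothetical sketch: AppendError and handle_append_result are illustrative only.
#[derive(Debug)]
enum AppendError {
    // Returned when the node/sequencer is shutting down or draining.
    Shutdown,
    Other(String),
}

// Treat a shutdown racing with the seal as a benign outcome rather than a test failure.
fn handle_append_result(result: Result<u64, AppendError>) -> Result<Option<u64>, AppendError> {
    match result {
        Ok(lsn) => Ok(Some(lsn)),
        Err(AppendError::Shutdown) => Ok(None),
        Err(other) => Err(other),
    }
}
```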
crates/bifrost/src/providers/replicated_loglet/sequencer/appender.rs
```diff
@@ -279,6 +284,8 @@ impl<T: TransportConnect> SequencerAppender<T> {
        });
    }

    // NOTE: It's very important to keep this loop cancellation safe. If the appender future
    // was cancelled, we don't want to move the global commit offset.
```
Unrelated to this PR: when seeing a StoreTaskStatus::Sealed, could we exit the store_tasks.join_next() loop early if we have reached an f-majority?
It only saves us, at best, ~2s (the store timeout), but it adds an f-majority check on every append, which I'm not sure is worth the extra cost, tbh. In the next PRs there will be some changes to this anyway to fix a few other issues.
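For illustration, here is a hedged sketch of the early-exit idea being discussed: break out of the join_next() loop as soon as sealed responses reach an f-majority instead of waiting up to the store timeout for the remaining tasks. The types are hypothetical, not the actual appender code.

```rust
use tokio::task::JoinSet;

// Illustrative status type; the real appender tracks more than this.
enum StoreTaskStatus {
    Stored,
    Sealed,
}

// Returns true as soon as sealed responses form an f-majority, without waiting
// for the remaining store tasks to complete or time out.
async fn sealed_f_majority_reached(
    mut store_tasks: JoinSet<StoreTaskStatus>,
    f_majority: usize,
) -> bool {
    let mut sealed = 0;
    while let Some(result) = store_tasks.join_next().await {
        if let Ok(StoreTaskStatus::Sealed) = result {
            sealed += 1;
            if sealed >= f_majority {
                return true; // early exit: the outcome can no longer change
            }
        }
    }
    false
}
```

The trade-off mentioned above is exactly this: the early return shaves off the store-timeout wait in the sealed case, at the cost of an extra check on every append.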
crates/bifrost/src/providers/replicated_loglet/sequencer/appender.rs
```diff
@@ -149,7 +147,14 @@ impl<T: TransportConnect> SequencerAppender<T> {
            State::Done | State::Cancelled | State::Sealed => break state,
            State::Wave { graylist } => {
                self.current_wave += 1;
                let Some(next_state) = cancellation
                // # Why is this cancellation safe?
                // Because we don't await any futures inside the join_next() loop, so we are
```
Why would it be a problem if we awaited futures inside the join_next() loop? If I understand things correctly, then the invariant that must hold is that we only acknowledge the write after we have replicated and updated the tail.
I'll try to explain it more in the comment. The invariant you mentioned is correct, but theoretically, if we had an await after updating the global tail and this append were still marked as cancelled, then other appenders after this one might be unblocked, finish their quorum write, and therefore report success. This would mean that the writer has a hole in the log. This will not happen with the current one-by-one design, hence the "theoretical" bit.
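A small sketch of the hazard being described, with hypothetical types rather than the actual sequencer code: a future can only be dropped at an .await point, so as long as there is no await between moving the global tail and acknowledging the append, cancellation cannot separate the two; an extra await in between would reopen that window.

```rust
use tokio::sync::{oneshot, watch};

// Cancellation-safe commit: no .await between advancing the global tail and
// sending the acknowledgement, so a cancelled appender can never have moved
// the tail (unblocking later appenders) while still reporting itself cancelled.
async fn commit_and_ack(tail: &watch::Sender<u64>, new_tail: u64, ack: oneshot::Sender<u64>) {
    tail.send_replace(new_tail);
    let _ = ack.send(new_tail);

    // If an .await were inserted between send_replace and the ack above, the
    // future could be dropped at that point: the tail has advanced and later
    // appenders may commit on top of it, yet this append is reported as
    // cancelled, which is the "hole in the log" scenario described above.
}
```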
@tillrohrmann thanks for the review. Yes, your conclusion on the failing test is spot-on. The next PR will fix the test issue(s).
Stack created with Sapling. Best reviewed with ReviewStack.