-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Retrieve Key Log IDs via RaftLogReader::get_key_log_ids()
#1264
Conversation
f778ec6
to
2e02fc1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Some comments and questions below, nothing serious.
Reviewed 2 of 9 files at r1, 6 of 8 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer)
openraft/src/engine/leader_log_ids.rs
line 9 at r4 (raw file):
#[derive(Debug, Clone, PartialEq, Eq)] pub(crate) struct LeaderLogIds<C: RaftTypeConfig> { first_last: Option<(LogIdOf<C>, LogIdOf<C>)>,
Isn't this a good use case for a RangeInclusive
?
Suggestion:
range: Option<RangeInclusive<LogIdOf<C>>>,
openraft/src/engine/leader_log_ids.rs
line 26 at r4 (raw file):
where C: RaftTypeConfig { pub(crate) fn new(log_ids: Option<(LogIdOf<C>, LogIdOf<C>)>) -> Self {
Maybe pass Option<RangeInclusive<_>>
?
openraft/src/engine/leader_log_ids.rs
line 32 at r4 (raw file):
/// Used only in tests #[allow(dead_code)] pub(crate) fn new1(log_id: LogIdOf<C>) -> Self {
new1
is a bit strange :-)
Maybe new_single
?
openraft/src/engine/leader_log_ids.rs
line 40 at r4 (raw file):
/// Used only in tests #[allow(dead_code)] pub(crate) fn new2(first: LogIdOf<C>, last: LogIdOf<C>) -> Self {
Similar here, maybe new_range
or with_range
and pass RangeInclusive
?
openraft/src/engine/log_id_list.rs
line 51 at r4 (raw file):
pub(crate) async fn get_key_log_ids<LR>( first: LogId<C::NodeId>, last: LogId<C::NodeId>,
Similarly, here I'd suggest Range
or RangeInclusive
, depending whether last
should be considered.
Code quote:
first: LogId<C::NodeId>,
last: LogId<C::NodeId>,
openraft/src/proposer/candidate.rs
line 114 at r4 (raw file):
}; // TODO: tricky: the new LeaderId is different from the last log id
I don't quite get the reasoning here. Is this about partial range, i.e., where we don't have the log for the previous leader change?
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
/// /// Given: /// Log entries: `[(2,2), (2,3), (5,4), (5,5)]` (format: `(term, index)`)
BTW, what if the log still contains a partial log from the previous leader, which was purged and then re-instantiated after restart?
Maybe we have a design bug, but our log purging is segment-wise and the segment can start at any point in time. So we might have partial log from the term 1 in the log.
OTOH, it should be irrelevant, since that part of log was officially purged, so any lagging follower will be either reinitialized from the log we still have or from the snapshot, right?
If not, we'll have to somehow explicitly store purged log position, which I'd like to prevent.
openraft/src/storage/v2/raft_log_reader.rs
line 110 at r4 (raw file):
&mut self, first: LogId<C::NodeId>, last: LogId<C::NodeId>,
Here I'd also suggest Range
/RangeInclusive
to make it more readable.
Code quote:
first: LogId<C::NodeId>,
last: LogId<C::NodeId>,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 9 files at r1, 7 of 8 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: 4 of 10 files reviewed, 1 unresolved discussion (waiting on @schreter)
openraft/src/proposer/candidate.rs
line 114 at r4 (raw file):
Previously, schreter wrote…
I don't quite get the reasoning here. Is this about partial range, i.e., where we don't have the log for the previous leader change?
The Candidate state only maintains the last log ID, not the complete set of log IDs from the last Leader term. This is sufficient because:
- When converting Candidate to Leader via
Leader::new()
, only the last log ID is needed. - Only the Leader restoration during startup, it uses the first log ID of the last Leader.
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
Previously, schreter wrote…
BTW, what if the log still contains a partial log from the previous leader, which was purged and then re-instantiated after restart?
Maybe we have a design bug, but our log purging is segment-wise and the segment can start at any point in time. So we might have partial log from the term 1 in the log.
OTOH, it should be irrelevant, since that part of log was officially purged, so any lagging follower will be either reinitialized from the log we still have or from the snapshot, right?
If not, we'll have to somehow explicitly store purged log position, which I'd like to prevent.
Could you clarify what you mean by "partial log from the previous leader"?
Regarding log purging: Once Openraft issues a purge request to storage, these logs are logically discarded. Whether the storage implementation physically removes them, keeps them, or if they reappear after restart, does not affect Openraft's behavior. Openraft treats purged logs as non-existent and will never access them again.
89a752d
to
90162d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r5, 3 of 3 files at r6, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @drmingdrmer)
openraft/src/proposer/candidate.rs
line 114 at r4 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
The Candidate state only maintains the last log ID, not the complete set of log IDs from the last Leader term. This is sufficient because:
- When converting Candidate to Leader via
Leader::new()
, only the last log ID is needed.- Only the Leader restoration during startup, it uses the first log ID of the last Leader.
Ack, thanks.
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Could you clarify what you mean by "partial log from the previous leader"?
Regarding log purging: Once Openraft issues a purge request to storage, these logs are logically discarded. Whether the storage implementation physically removes them, keeps them, or if they reappear after restart, does not affect Openraft's behavior. Openraft treats purged logs as non-existent and will never access them again.
Yes, it will never access them again during this invocation. But, say the node crashes and restarts. The logs have been marked as purged in the previous invocation, but suddenly re-appear, because they haven't yet been purged from the persistent storage. Of course, the next snapshot+purge will purge them again.
My only concern is that we have, say, term change 1=>2 mid of log segment 1 and then term change 2=>3 mid of log segment 2. The purge request comes to mid of log segment 2, which only deletes the log segment 1, but the log segment 2 is still in use. So after the restart, half of the logs from the previous term 2 are still present in the log and thus the log ID range for term 2 is effectively wrong. The ranges for term 3 and newer will be correct.
I think this is not a problem, since the lagging follower has either:
- all the logs up to at least start of segment 2, in which case it will be either served the remaining log entries, which are correct and agreed-upon, even though they were logically purged in the previous invocation, or if there is a conflict, it will be correctly detected within this range
- some logs before log segment 2 are missing, in wich case we'll anyway send a snapshot
Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @drmingdrmer)
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
So after the restart, half of the logs from the previous term 2 are still present in the log and thus the log ID range for term 2 is effectively wrong.
I still do not get why there is anything wrong.
I assume the term change 1=>2
you mentioned means the term in a log-id, not a save_vote()
operation that updates the servers' term
.
I do not know the internal structure of the log storage implementation. But I assume any log entries that are left in the storage should have a correct log id that can be read through the RaftLogReader
API.
It looks like there is a segment in which the first half is log entries in term-1, and the other half is log entries in term-2? And what's going wrong?
Key log IDs represent the first log IDs proposed by each Leader. These IDs enable Openraft to efficiently access log IDs at each index with a succinct storage. Previously, key log IDs were obtained using a binary-search-like algorithm through `RaftLogReader`. This commit introduces the `RaftLogReader::get_key_log_ids()` method, allowing implementations to directly return a list of key log IDs if the `RaftLogStorage` can provide them. For backward compatibility, a default implementation using the original binary-search method is provided. No application changes are required when upgrading to this version. Tests verifying the implementation are included in `openraft::testing::log::suite::Suite`. - Fixes: databendlabs#1261
90162d0
to
9178ef8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
So after the restart, half of the logs from the previous term 2 are still present in the log and thus the log ID range for term 2 is effectively wrong.
I still do not get why there is anything wrong.
I assume the
term change 1=>2
you mentioned means the term in a log-id, not asave_vote()
operation that updates the servers'term
.I do not know the internal structure of the log storage implementation. But I assume any log entries that are left in the storage should have a correct log id that can be read through the
RaftLogReader
API.It looks like there is a segment in which the first half is log entries in term-1, and the other half is log entries in term-2? And what's going wrong?
There is nothing going wrong and all logs do have proper log ID persistently stored. My expectation is that everything is just fine as it is (we don't see any issues).
We store the log in segments and at purge time, we throw away entire segments which end before purge position. I.e., if a purge is requested mid of segment, it's kept untouched.
It was just a question to better understand log purging requirements. Basically, is it fine that after purging the log and restart of the node an arbitrary part of the log before the purge position "reappears"?
I.e., the last_purged as returned in initial state of the state machine/log reader goes backward after the node restart compared to previously-requested purge position. The log in the range that reappears is correctly persisted with correct log IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
Previously, schreter wrote…
There is nothing going wrong and all logs do have proper log ID persistently stored. My expectation is that everything is just fine as it is (we don't see any issues).
We store the log in segments and at purge time, we throw away entire segments which end before purge position. I.e., if a purge is requested mid of segment, it's kept untouched.
It was just a question to better understand log purging requirements. Basically, is it fine that after purging the log and restart of the node an arbitrary part of the log before the purge position "reappears"?
I.e., the last_purged as returned in initial state of the state machine/log reader goes backward after the node restart compared to previously-requested purge position. The log in the range that reappears is correctly persisted with correct log IDs.
Purged logs have been fully committed and applied to a persisted snapshot. If they reappear, it won't impact the system's state, and no issues will occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved
openraft/src/storage/v2/raft_log_reader.rs
line 79 at r4 (raw file):
Purged logs have been fully committed and applied to a persisted snapshot. If they reappear, it won't impact the system's state, and no issues will occur.
Yes, that's exactly also my understanding and expectation, thanks for the confirmation :-).
Changelog
Feature: Retrieve Key Log IDs via
RaftLogReader::get_key_log_ids()
Key log IDs represent the first log IDs proposed by each Leader. These
IDs enable Openraft to efficiently access log IDs at each index with
a succinct storage.
Previously, key log IDs were obtained using a binary-search-like
algorithm through
RaftLogReader
. This commit introduces theRaftLogReader::get_key_log_ids()
method, allowing implementations todirectly return a list of key log IDs if the
RaftLogStorage
canprovide them.
For backward compatibility, a default implementation using the original
binary-search method is provided. No application changes are required
when upgrading to this version.
Tests verifying the implementation are included in
openraft::testing::log::suite::Suite
.LogIdList::load_log_ids
by providing the information directly from the state machine or log reader #1261This change is