Skip to content

Commit

Permalink
Change: Replace loosen-follower-log-revert feature flag with `Confi…
Browse files Browse the repository at this point in the history
…g::allow_log_reversion`

The `loosen-follower-log-revert` feature flag is removed in favor of a
runtime-configurable option, `Config::allow_log_reversion`. This change
allows log reversion to be controlled dynamically via application
configuration rather than at compile time.

When `Config::allow_log_reversion` is enabled, log reversion on a
Follower is always permitted. Upon detecting a reversion, the Leader
will reset replication from the beginning instead of panicking.

### Breaking Change:

Since this commit, compiling with `--features
loosen-follower-log-revert` will now result in the following error:

```text
error: The feature flag `loosen-follower-log-revert` is removed since version `0.10.0`.
       Use `Config::allow_log_reversion` instead.
```

Upgrade tip:

For applications relying on `loosen-follower-log-revert`:

1. **Remove this feature flag** from `Cargo.toml`.
2. **Enable log reversion in your configuration** as follows:

```rust
let config = Config {
    allow_log_reversion: Some(true),
    ..Default::default()
};
```
  • Loading branch information
drmingdrmer committed Dec 2, 2024
1 parent 9178ef8 commit c8813d8
Show file tree
Hide file tree
Showing 19 changed files with 206 additions and 105 deletions.
3 changes: 0 additions & 3 deletions .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -313,9 +313,6 @@ jobs:
- toolchain: "nightly"
features: "single-term-leader"

- toolchain: "nightly"
features: "loosen-follower-log-revert"


steps:
- name: Setup | Checkout
Expand Down
3 changes: 2 additions & 1 deletion openraft/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,8 @@ singlethreaded = ["openraft-macros/singlethreaded"]
# useful for testing or other special scenarios.
# For instance, in an even number nodes cluster, erasing a node's data and then
# rebooting it(log reverts to empty) will not result in data loss.
#
# This feature is removed since `0.10.0`. Use `Config::allow_log_reversion` instead.
loosen-follower-log-revert = []


Expand All @@ -110,7 +112,6 @@ tracing-log = [ "tracing/log" ]
features = [
"bt",
"compat",
"loosen-follower-log-revert",
"serde",
"tracing-log",
]
Expand Down
28 changes: 28 additions & 0 deletions openraft/src/config/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -228,6 +228,26 @@ pub struct Config {
default_missing_value = "true"
)]
pub enable_elect: bool,

/// Whether to allow to reset the replication progress to `None`, when the
/// follower's log is found reverted to an early state. **Do not enable this in production**
/// unless you know what you are doing.
///
/// Although log state reversion is typically seen as a bug, enabling it can be
/// useful for testing or other special scenarios.
/// For instance, in an even number nodes cluster, erasing a node's data and then
/// rebooting it(log reverts to empty) will not result in data loss.
///
/// For one-shot log reversion, use
/// [`Raft::trigger().allow_next_revert()`](crate::raft::trigger::Trigger::allow_next_revert).
///
/// Since: 0.10.0
#[clap(long,
action = clap::ArgAction::Set,
num_args = 0..=1,
default_missing_value = "true"
)]
pub allow_log_reversion: Option<bool>,
}

/// Updatable config for a raft runtime.
Expand Down Expand Up @@ -276,6 +296,14 @@ impl Config {
}
}

/// Whether to allow the replication to reset the state to `None` when a log state reversion is
/// detected.
///
/// By default, it does not allow log reversion, because it might indicate a bug in the system.
pub(crate) fn get_allow_log_reversion(&self) -> bool {
self.allow_log_reversion.unwrap_or(false)
}

/// Build a `Config` instance from a series of command line arguments.
///
/// The first element in `args` must be the application name.
Expand Down
28 changes: 28 additions & 0 deletions openraft/src/config/config_test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -162,3 +162,31 @@ fn test_config_enable_elect() -> anyhow::Result<()> {

Ok(())
}

#[test]
fn test_config_allow_log_reversion() -> anyhow::Result<()> {
let config = Config::build(&["foo", "--allow-log-reversion=false"])?;
assert_eq!(Some(false), config.allow_log_reversion);

let config = Config::build(&["foo", "--allow-log-reversion=true"])?;
assert_eq!(Some(true), config.allow_log_reversion);

let config = Config::build(&["foo", "--allow-log-reversion"])?;
assert_eq!(Some(true), config.allow_log_reversion);

let mut config = Config::build(&["foo"])?;
assert_eq!(None, config.allow_log_reversion);

// test allow_log_reversion method

config.allow_log_reversion = None;
assert_eq!(false, config.get_allow_log_reversion());

config.allow_log_reversion = Some(true);
assert_eq!(true, config.get_allow_log_reversion());

config.allow_log_reversion = Some(false);
assert_eq!(false, config.get_allow_log_reversion());

Ok(())
}
7 changes: 4 additions & 3 deletions openraft/src/docs/faq/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -184,7 +184,7 @@ behavior is **undefined**.
### Can I wipe out the data of ONE node and wait for the leader to replicate all data to it again?

Avoid doing this. Doing so will panic the leader. But it is permitted
if [`loosen-follower-log-revert`] feature flag is enabled.
with config [`Config::allow_log_reversion`] enabled.

In a raft cluster, although logs are replicated to multiple nodes,
wiping out a node and restarting it is still possible to cause data loss.
Expand All @@ -211,7 +211,7 @@ N5 | elect L2

But for even number nodes cluster, Erasing **exactly one** node won't cause data loss.
Thus, in a special scenario like this, or for testing purpose, you can use
`--feature loosen-follower-log-revert` to permit erasing a node.
[`Config::allow_log_reversion`] to permit erasing a node.


### Is Openraft resilient to incorrectly configured clusters?
Expand All @@ -237,7 +237,6 @@ pub(crate) fn following_handler(&mut self) -> FollowingHandler<C> {
```


[`loosen-follower-log-revert`]: `crate::docs::feature_flags#loosen_follower_log_revert`
[`single-term-leader`]: `crate::docs::feature_flags#single_term_leader`

[`Linearizable Read`]: `crate::docs::protocol::read`
Expand All @@ -246,6 +245,8 @@ pub(crate) fn following_handler(&mut self) -> FollowingHandler<C> {
[`BasicNode`]: `crate::node::BasicNode`
[`RaftTypeConfig`]: `crate::RaftTypeConfig`

[`Config::allow_log_reversion`]: `crate::config::Config::allow_log_reversion`

[`RaftLogStorage::save_committed()`]: `crate::storage::RaftLogStorage::save_committed`

[`RaftNetwork`]: `crate::network::RaftNetwork`
Expand Down
1 change: 0 additions & 1 deletion openraft/src/docs/feature_flags/feature-flags-toc.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
- [feature-flag `bench`](#feature-flag-bench)
- [feature-flag `bt`](#feature-flag-bt)
- [feature-flag `compat`](#feature-flag-compat)
- [feature-flag `loosen-follower-log-revert`](#feature-flag-loosen-follower-log-revert)
- [feature-flag `serde`](#feature-flag-serde)
- [feature-flag `single-term-leader`](#feature-flag-single-term-leader)
- [feature-flag `singlethreaded`](#feature-flag-singlethreaded)
Expand Down
9 changes: 0 additions & 9 deletions openraft/src/docs/feature_flags/feature-flags.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,15 +17,6 @@ This feature works ONLY with nightly rust, because it requires unstable feature

Enables compatibility supporting types.

## feature-flag `loosen-follower-log-revert`

Permit the follower's log to roll back to an earlier state without causing the leader to panic.
Although log state reversion is typically seen as a bug, enabling it can be useful for testing or other special scenarios.
For instance, in an even number nodes cluster,
erasing a node's data and then rebooting it(log reverts to empty) will not result in data loss.

**Do not use it unless you know what you are doing**.

## feature-flag `serde`

Derives `serde::Serialize, serde::Deserialize` for type that are used
Expand Down
5 changes: 5 additions & 0 deletions openraft/src/engine/engine_config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ pub(crate) struct EngineConfig<C: RaftTypeConfig> {
/// The maximum number of entries per payload allowed to be transmitted during replication
pub(crate) max_payload_entries: u64,

pub(crate) allow_log_reversion: bool,

pub(crate) timer_config: time_state::Config,
}

Expand All @@ -39,6 +41,8 @@ where C: RaftTypeConfig
max_in_snapshot_log_to_keep: config.max_in_snapshot_log_to_keep,
purge_batch_size: config.purge_batch_size,
max_payload_entries: config.max_payload_entries,
allow_log_reversion: config.get_allow_log_reversion(),

timer_config: time_state::Config {
election_timeout,
smaller_log_timeout: Duration::from_millis(config.election_timeout_max * 2),
Expand All @@ -55,6 +59,7 @@ where C: RaftTypeConfig
max_in_snapshot_log_to_keep: 1000,
purge_batch_size: 256,
max_payload_entries: 300,
allow_log_reversion: false,
timer_config: time_state::Config::default(),
}
}
Expand Down
11 changes: 8 additions & 3 deletions openraft/src/engine/handler/replication_handler/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ use crate::engine::EngineOutput;
use crate::engine::ReplicationProgress;
use crate::error::NodeNotFound;
use crate::error::Operation;
use crate::progress;
use crate::progress::entry::ProgressEntry;
use crate::progress::Inflight;
use crate::progress::Progress;
Expand Down Expand Up @@ -163,7 +164,9 @@ where C: RaftTypeConfig
let quorum_accepted = self
.leader
.progress
.update_with(&node_id, |prog_entry| prog_entry.update_matching(log_id))
.update_with(&node_id, |prog_entry| {
prog_entry.new_updater(&*self.config).update_matching(log_id)
})
.expect("it should always update existing progress")
.clone();

Expand Down Expand Up @@ -215,7 +218,9 @@ where C: RaftTypeConfig

let prog_entry = self.leader.progress.get_mut(&target).unwrap();

prog_entry.update_conflicting(conflict.index);
let mut updater = progress::entry::update::Updater::new(self.config, prog_entry);

updater.update_conflicting(conflict.index);
}

/// Enable one-time replication reset for a specific node upon log reversion detection.
Expand All @@ -241,7 +246,7 @@ where C: RaftTypeConfig
return Err(NodeNotFound::new(target, Operation::AllowNextRevert));
};

prog_entry.reset_on_reversion = allow;
prog_entry.allow_log_reversion = allow;

Ok(())
}
Expand Down
4 changes: 2 additions & 2 deletions openraft/src/engine/tests/startup_test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ fn test_startup_as_leader_without_logs() -> anyhow::Result<()> {
matching: None,
inflight: Inflight::None,
searching_end: 4,
reset_on_reversion: false,
allow_log_reversion: false,
})]
},
Command::AppendInputEntries {
Expand Down Expand Up @@ -130,7 +130,7 @@ fn test_startup_as_leader_with_proposed_logs() -> anyhow::Result<()> {
matching: None,
inflight: Inflight::None,
searching_end: 7,
reset_on_reversion: false,
allow_log_reversion: false,
})]
},
Command::Replicate {
Expand Down
6 changes: 6 additions & 0 deletions openraft/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,12 @@ macro_rules! func_name {
}};
}

#[cfg(feature = "loosen-follower-log-revert")]
compile_error!(
"The feature flag `loosen-follower-log-revert` is removed since `0.10.0`. \
Use `Config::allow_log_reversion` instead."
);

pub extern crate openraft_macros;

mod change_members;
Expand Down
84 changes: 11 additions & 73 deletions openraft/src/progress/entry/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
pub(crate) mod update;

use std::borrow::Borrow;
use std::error::Error;
use std::fmt::Debug;
Expand All @@ -7,6 +9,8 @@ use std::fmt::Formatter;
use validit::Validate;

use crate::display_ext::DisplayOptionExt;
use crate::engine::EngineConfig;
use crate::progress::entry::update::Updater;
use crate::progress::inflight::Inflight;
use crate::raft_state::LogStateReader;
use crate::LogId;
Expand Down Expand Up @@ -37,7 +41,7 @@ where C: RaftTypeConfig
/// to it.
///
/// This flag will be cleared after the progress entry is reset.
pub(crate) reset_on_reversion: bool,
pub(crate) allow_log_reversion: bool,
}

impl<C> ProgressEntry<C>
Expand All @@ -49,7 +53,7 @@ where C: RaftTypeConfig
matching: matching.clone(),
inflight: Inflight::None,
searching_end: matching.next_index(),
reset_on_reversion: false,
allow_log_reversion: false,
}
}

Expand All @@ -61,7 +65,7 @@ where C: RaftTypeConfig
matching: None,
inflight: Inflight::None,
searching_end: end,
reset_on_reversion: false,
allow_log_reversion: false,
}
}

Expand All @@ -74,6 +78,10 @@ where C: RaftTypeConfig
self
}

pub(crate) fn new_updater<'a>(&'a mut self, engine_config: &'a EngineConfig<C>) -> Updater<'a, C> {
Updater::new(engine_config, self)
}

/// Return if a range of log id `..=log_id` is inflight sending.
///
/// `prev_log_id` is never inflight.
Expand All @@ -88,76 +96,6 @@ where C: RaftTypeConfig
}
}

pub(crate) fn update_matching(&mut self, matching: Option<LogId<C::NodeId>>) {
tracing::debug!(
self = display(&self),
matching = display(matching.display()),
"update_matching"
);

self.inflight.ack(matching.clone());

debug_assert!(matching >= self.matching);
self.matching = matching;

let matching_next = self.matching.next_index();
self.searching_end = std::cmp::max(self.searching_end, matching_next);
}

/// Update conflicting log index.
///
/// Conflicting log index is the last found log index on a follower that is not matching the
/// leader log.
///
/// Usually if follower's data is lost, `conflict` is always greater than or equal `matching`.
/// But for testing purpose, a follower is allowed to clean its data and wait for leader to
/// replicate all data to it.
///
/// To allow a follower to clean its data, enable feature flag [`loosen-follower-log-revert`] .
///
/// [`loosen-follower-log-revert`]: crate::docs::feature_flags#feature_flag_loosen_follower_log_revert
pub(crate) fn update_conflicting(&mut self, conflict: u64) {
tracing::debug!(self = debug(&self), conflict = display(conflict), "update_conflict");

self.inflight.conflict(conflict);

debug_assert!(conflict < self.searching_end);
self.searching_end = conflict;

// An already matching log id is found lost:
//
// - If log reversion is allowed, just restart the binary search from the beginning.
// - Otherwise, panic it.

let allow_reset = if cfg!(feature = "loosen-follower-log-revert") {
true
} else {
self.reset_on_reversion
};

if allow_reset {
if conflict < self.matching.next_index() {
tracing::warn!(
"conflict {} < last matching {}: follower log is reverted; with 'loosen-follower-log-revert' enabled, this is allowed.",
conflict,
self.matching.display(),
);

self.matching = None;
self.reset_on_reversion = false;
}
} else {
debug_assert!(
conflict >= self.matching.next_index(),
"follower log reversion is not allowed \
without `--features loosen-follower-log-revert`; \
matching: {}; conflict: {}",
self.matching.display(),
conflict
);
}
}

/// Initialize a replication action: sending log entries or sending snapshot.
///
/// If there is an action in progress, i.e., `inflight` is not None, it returns an `Err`
Expand Down
Loading

0 comments on commit c8813d8

Please sign in to comment.