-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: store heartbeat metric in RaftMetrics and RaftDataMetrics #1177
Feature: store heartbeat metric in RaftMetrics and RaftDataMetrics #1177
Conversation
It looks like you can not use The metrics may have to store the interval since the last acked heartbeat, just like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @SteveLauC)
openraft/src/progress/entry/mod.rs
line 26 at r1 (raw file):
/// The id of the last matching log on the target following node. pub(crate) matching: Option<LogId<C::NodeId>>, pub(crate) time: Option<InstantOf<C>>,
You do not need to store the ack time. there is already a field for it:
https://github.com/datafuselabs/openraft/blob/55d009c897d689534cb5cc3b6321b1f42d9b2066/openraft/src/proposer/leader.rs#L61
clock_progress
stores the last timestamp which is acked by each follower/learner.
It can be transform to a BTreeMap as it does with progress -> replications
:
Thanks for the review and guiding me!
Done
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 7 files at r2, all commit messages.
Reviewable status: 6 of 7 files reviewed, 2 unresolved discussions (waiting on @drmingdrmer and @SteveLauC)
openraft/src/progress/entry/mod.rs
line 26 at r1 (raw file):
/// The id of the last matching log on the target following node. pub(crate) matching: Option<LogId<C::NodeId>>, pub(crate) time: Option<InstantOf<C>>,
BTW, independent of the solution, naming a member time
and especially w/o documentation is very confusing. Always add a documentation comment telling what this is and name members and methods appropriately.
For instance, this might be last_known_alive_timestamp
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to create a test to verify the functionality of these features.
Please include the heartbeat tests in this module:
https://github.com/datafuselabs/openraft/blob/ca156ead4af0624e990d1e1e069aab929a5379a3/tests/tests/metrics/main.rs#L10
Reviewed 7 of 7 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @SteveLauC)
openraft/src/metrics/raft_metrics.rs
line 77 at r2 (raw file):
/// The heartbeats states. It is Some() only when this node is leader. pub heartbeat: Option<HeartbeatMetrics<C>>,
Please enhance the documentation to clarify the meaning of the value. For example: "The value represents the interval since the timestamp of the last heartbeat acknowledged by a follower."
Yeah, I will handle them, just want to ensure this PR is basically ok before doing these things.
|
Hi, I just added the code documents.
Should I add tests in this module or the module $ rg "replication"
t30_leader_metrics.rs
52: if let Some(ref q) = x.replication {
61: "no replication with 1 node cluster",
95: if let Some(ref q) = x.replication {
102: "replication metrics to 4 nodes",
128: "--- replication metrics should reflect the replication state"
137: if let Some(ref q) = x.replication {
144: "replication metrics to 3 nodes",
159: n1.wait(timeout()).metrics(|x| x.replication.is_some(), "node-1 starts replication").await?;
161: n0.wait(timeout()).metrics(|x| x.replication.is_none(), "node-0 stopped replication").await?; Or maybe it should be added in |
let mut data_metrics = leader.data_metrics(); | ||
let mut prev_node1 = 0; | ||
let mut prev_node2 = 0; | ||
for _ in 0..30 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, since the heartbeat is disabled, the last ack time of node 1 and node 2 should never change, so I expect the time since the last ack should increase during the loop, but it won't sometimes, I am not sure about the reason:<
running 1 test
panicked at tests/tests/metrics/t10_server_metrics_and_data_metrics.rs:109:13:
interval should increase since last ack won't change
tests/tests/metrics/t10_server_metrics_and_data_metrics.rs:109:13
backtrace is disabled without --features 'bt'
test t10_server_metrics_and_data_metrics::heartbeat_metrics ... FAILED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any change to RaftMetrics
would result in changed()
to be ready, it is possible other fields changed but the heartbeat
did not change.
I dumped the node-0 metrics from the log (at the bottom of the page):
cat ut.2024-07-19-02 | grep heartbeat_metrics | grep report_metrics:.* -o | grep id:0
But it looks lite impl Display for RaftMetrics
is not updated I can not see the heartbeat
field:
report_metrics: Metrics{id:0, Learner, term:0, vote:<T0-N0:->, last_log:None, last_applied:None, leader:None(since_last_ack:None ms), membership:{log_id:None, {voters:[], learners:[]}}, snapshot:None, purged:None, replication:{}}
report_metrics: Metrics{id:0, Learner, term:0, vote:<T0-N0:->, last_log:None, last_applied:None, leader:None(since_last_ack:None ms), membership:{log_id:None, {voters:[], learners:[]}}, snapshot:None, purged:None, replication:{}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:1, last_applied:None, leader:0(since_last_ack:0 ms), membership:{log_id:T0-N0.0, {voters:[{0:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.1}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:1, last_applied:T1-N0.1, leader:0(since_last_ack:0 ms), membership:{log_id:T0-N0.0, {voters:[{0:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.1}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:2, last_applied:T1-N0.1, leader:0(since_last_ack:0 ms), membership:{log_id:T1-N0.2, {voters:[{0:()}], learners:[1:()]}}, snapshot:None, purged:None, replication:{0:T1-N0.2,1:None}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:2, last_applied:T1-N0.2, leader:0(since_last_ack:0 ms), membership:{log_id:T1-N0.2, {voters:[{0:()}], learners:[1:()]}}, snapshot:None, purged:None, replication:{0:T1-N0.2,1:None}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:3, last_applied:T1-N0.2, leader:0(since_last_ack:0 ms), membership:{log_id:T1-N0.3, {voters:[{0:()}], learners:[1:(),2:()]}}, snapshot:None, purged:None, replication:{0:T1-N0.3,1:None,2:None}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:3, last_applied:T1-N0.3, leader:0(since_last_ack:0 ms), membership:{log_id:T1-N0.3, {voters:[{0:()}], learners:[1:(),2:()]}}, snapshot:None, purged:None, replication:{0:T1-N0.3,1:None,2:None}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:3, last_applied:T1-N0.3, leader:0(since_last_ack:0 ms), membership:{log_id:T1-N0.3, {voters:[{0:()}], learners:[1:(),2:()]}}, snapshot:None, purged:None, replication:{0:T1-N0.3,1:T1-N0.3,2:T1-N0.3}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:4, last_applied:T1-N0.3, leader:0(since_last_ack:14 ms), membership:{log_id:T1-N0.4, {voters:[{0:()},{0:(),1:(),2:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.4,1:T1-N0.3,2:T1-N0.3}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:4, last_applied:T1-N0.4, leader:0(since_last_ack:10 ms), membership:{log_id:T1-N0.4, {voters:[{0:()},{0:(),1:(),2:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.4,1:T1-N0.3,2:T1-N0.4}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:5, last_applied:T1-N0.4, leader:0(since_last_ack:19 ms), membership:{log_id:T1-N0.5, {voters:[{0:(),1:(),2:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.5,1:T1-N0.4,2:T1-N0.4}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:5, last_applied:T1-N0.5, leader:0(since_last_ack:9 ms), membership:{log_id:T1-N0.5, {voters:[{0:(),1:(),2:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.5,1:T1-N0.4,2:T1-N0.5}}
report_metrics: Metrics{id:0, Leader, term:1, vote:<T1-N0:Q>, last_log:5, last_applied:T1-N0.5, leader:0(since_last_ack:8 ms), membership:{log_id:T1-N0.5, {voters:[{0:(),1:(),2:()}], learners:[]}}, snapshot:None, purged:None, replication:{0:T1-N0.5,1:T1-N0.5,2:T1-N0.5}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any change to RaftMetrics would result in changed() to be ready, it is possible other fields changed but the heartbeat did not change.
Actually and surprisingly, the interval value in the next loop can even be smaller than the previous one:
$ cargo t t10_server_metrics_and_data_metrics::heartbeat_metrics -- --nocapture
running 1 test
heartbeat: {0: Some(2), 1: Some(9), 2: Some(6)}
heartbeat: {0: Some(2), 1: Some(6), 2: Some(5)}
panicked at tests/tests/metrics/t10_server_metrics_and_data_metrics.rs:111:13:
interval should increase since last ack won't change
tests/tests/metrics/t10_server_metrics_and_data_metrics.rs:111:13
backtrace is disabled without --features 'bt'
test t10_server_metrics_and_data_metrics::heartbeat_metrics ... FAILED
But it looks lite impl Display for RaftMetrics is not updated I can not see the heartbeat field:
Sorry for missing that, it has just been added:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, if another AppendEntries or heartbeat response is received, this interval will become smaller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heartbeat is disabled in this test cluster, so there shouldn't be heartbeat response. Then I guess it is made by a replication during the cluster initialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this test, do we need more tests or the existing 2 tests are sufficient, let me know what you think:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r3, 1 of 1 files at r5, all commit messages.
Reviewable status: 7 of 8 files reviewed, 5 unresolved discussions (waiting on @schreter and @SteveLauC)
openraft/src/metrics/raft_metrics.rs
line 114 at r5 (raw file):
write!( f, "membership:{}, heartbeat:{{{}}}, snapshot:{}, purged:{}, replication:{{{}}}",
The Display implementations should be consistent: always put heartbeat
after replication
.
Sure |
https://github.com/datafuselabs/openraft/actions/runs/10005367869/job/27655934359?pr=1177
This test failure seems irrelevant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 7 files at r2, 1 of 2 files at r3, 1 of 1 files at r4, 1 of 1 files at r5, 1 of 1 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer and @SteveLauC)
openraft/src/metrics/raft_metrics.rs
line 78 at r5 (raw file):
/// Heartbeat metrics. It is Some() only when this node is leader. /// /// This field records a mapping between a node's ID and milliseconds since
I already opened an issue for this and similar topics. Using "millisecond since" or other relative times is unreliable. We should in general use just timestamps via standard Instant
(well, the one from AsyncRuntime
) and Duration
.
There was just a question on how to serialize it for users who need serde
. Possibly, we need a specialized serde
serializer to serialize it properly. See also this issue on serde
.
Code quote:
milliseconds since
openraft/src/metrics/raft_metrics.rs
line 119 at r5 (raw file):
.as_ref() .map(|x| { x.iter().map(|(k, v)| format!("{}:{}", k, DisplayOption(v))).collect::<Vec<_>>().join(",") }) .unwrap_or_default(),
Ouch! Allocations in Display
and similar traits are a very bad idea! Yes, I know there is some legacy stuff with summary()
and similar, but continuing it is not good.
Please rewrite allocation-free, e.g., by adding properly-implemented Display
trait to HeartbeatMetrics
.
Suggestion:
DisplayOption(&self.heartbeat),
openraft/src/metrics/raft_metrics.rs
line 214 at r5 (raw file):
self.heartbeat .as_ref() .map(|x| { x.iter().map(|(k, v)| format!("{}:{}", k, DisplayOption(v))).collect::<Vec<_>>().join(",") })
Same here, though it was broken also for replication
. Feel free to fix it.
tests/tests/metrics/t10_server_metrics_and_data_metrics.rs
line 96 at r4 (raw file):
Hmmm, if another AppendEntries or heartbeat response is received, this interval will become smaller.
That's why use Instant
and not some sort of Duration
...
This would make |
@schreter And GPT-4o has given me a working example of serializing
|
And I am actually not sure if that allocation refactor should be done in this PR, it deserves a separate PR if we want a PR to only does one thing 🤪 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we consider upgrading
Duration
in milliseconds toInstant
in a separate PR to keep the changes in this PR more manageable?
Yes, that's OK. And yes, the suggestion is a similar to one in the serde
issue I linked. But not in seconds, better microseconds :-).
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer and @SteveLauC)
openraft/src/metrics/raft_metrics.rs
line 119 at r5 (raw file):
Previously, schreter wrote…
Ouch! Allocations in
Display
and similar traits are a very bad idea! Yes, I know there is some legacy stuff withsummary()
and similar, but continuing it is not good.Please rewrite allocation-free, e.g., by adding properly-implemented
Display
trait toHeartbeatMetrics
.
@SteveLauC I'd strongly suggest to keep discussions pertaining to code locations in Reviewable - it's much simpler to follow and you can easily track "done" items.
Regarding your question about wrappers, there is no need to make HearbeatMetrics
a wrapper.
But, you can do the same as for DisplayOption
- introduce a helper which wraps the struct only for Display
purposes (DisplayBTreeMap
, for example).
Suggestion:
DisplayOption(self.heartbeat.as_ref.map(DisplayBTreeMap)),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I am actually not sure if that allocation refactor should be done in this PR, it deserves a separate PR if we want a PR to only does one thing 🤪
Since it's broken already for the other metrics, that's OK, but you know, technical debt...
My problem is that openraft
is crashing all our malfunction tests which simulate OOMs (i.e., it is panicking instead of "proper" error handling, bubbling up the error to the caller and correctly shutting down). There will be some serious clean up needed :-). Unfortunately, I can't yet open-source our malfunction testing framework (which would be VERY handy for many library writers).
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer and @SteveLauC)
I just finished that display refactor:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r7, 3 of 3 files at r8, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @SteveLauC)
openraft/src/display_ext/display_btreemap_opt_value.rs
line 28 at r8 (raw file):
} None => { /* do nothing, as map length is 0 */ } }
This snippet could be simplified to:
Suggestion:
let len = self.0.len();
for (idx, (key, value)) in self.0.iter().enumerate() {
write!(f, "{}:{}", key, DisplayOption(value))?;
if idx + 1 != len {
write!(f, ",")?;
}
}
openraft/src/metrics/raft_metrics.rs
line 119 at r8 (raw file):
DisplayOption(&self.snapshot), DisplayOption(&self.purged), DisplayOption(&self.replication.as_ref().map(DisplayBtreeMapOptValueExt::display)),
Given that DisplayBTreeMapOptValue
is already available, you can simplify the expression like this:
DisplayOption(&self.replication.as_ref().map(DisplayBTreeMapOptValue)),
I wrote it in that way so that there won't be a allow unused attribute in display_btreemap_opt_value.rs, only one in display_ext.rs, to be consistent with other display_xxx.rs files, I agree it's kinda weird, can change to approach you suggested if you don't mind another unused attribute. |
I do not quite mind about the unused import :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r8, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @drmingdrmer and @SteveLauC)
openraft/src/display_ext.rs
line 9 at r8 (raw file):
pub(crate) mod display_slice; #[allow(unused_imports)]
I don't quite get it. Why are these imports unused? Are they just written for future use and not really used in the crate (yet)?
A possible path to handle such things might be a prelude
module. Such as, using all these traits via use crate::display_ext::prelude::*;
where all the use statements are pub
-exports of the helpers. If the prelude
itself is pub(crate) mod prelude;
, it's still hidden for the rest of the world, but hopefully you won't get any warnings for unused imports.
openraft/src/display_ext/display_btreemap_opt_value.rs
line 28 at r8 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
This snippet could be simplified to:
Or even further to:
for (idx, (key, value)) in self.0.iter().enumerate() {
if idx > 0 {
write!(f, ",")?;
}
write!(f, "{}:{}", key, DisplayOption(value))?;
}
Done
Done
I am not sure about this as well, I just follow the convention:
They are basically same, so I just use the one suggested by @drmingdrmer. |
When you guys this PR is in its last review round, ping me so that I will squash my commits:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r9, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @SteveLauC)
Gentle ping on @schreter, can I trouble you for another review? |
It's been good enough to me. Please squash the commits and let's merge it. @schreter sometimes got delays for giving feedback. Do not worry. If any refinements are needed, he will provide feedback later soon, and you can update it in the next PR. This has become a routine working pattern:) |
b3a23a8
to
ca1fcfe
Compare
Rust nightly 1.82 renamed Looks like we have to use something like rustversion to handle it? 🤔 Or we can |
It looks like a single |
Let me do this in a separate PR. |
I'm gonna include this fixup in this: |
Feel free to close #1188 if you need to 😁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gentle ping on @schreter, can I trouble you for another review?
I actually already did, but since I didn't add any additional comments you probably didn't get any mail. I just removed the blocker/marked the Display
formatting as "done" in Reviewable.
The refactoring with Instant
instead of some sort of duration at multiple places is a separate thing, I just added some random comments for that (no blockers).
Reviewed 3 of 3 files at r9, 4 of 4 files at r10, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions
Ahh, sorry for missing that, looks like I really need to learn how to use Reviewable. 😶🌫️ |
Checklist
Will check the above stuff when this PR is no longer a draft:)
What does this PR do
This PR attempts to implement the metric discussed in this thread,
adding the
heartbeat
information (time of the last replication or heartbeat) toRaftDataMetrics
andRaftMetrics
so that users can subscribe to it.It is a draft PR as I want to ensure it looks basically correct, then I will
try to add tests, code documents, and polish interfaces.
How it is implemented
For the time type, I used
RaftTypeConfig::AsyncRuntime::Instant
as it is usedwidely.
And I found the last applied log ID is stored in
struct ProcessEntry
, so Istored this time information there as well. When a response to a replication
request is received, the replication handler will update it in
ReplicationHandler::update_matching()
using thesending_time
of the replicationresult.
And these metrics will be sent to the tokio watch channel in
RaftCore::report_metrics()
.Unsolved problems/Questions
There are several unsolved problems with this PR, which I want to discuss with
the maintainer before taking any further actions.
The
serde::{Deserialize, Serialize}
boundsRaftDataMetris
andRaftMetrics
need to satisfy these bounds when theserde
feature of Openraft is enabled, but neither
std::time::Instant
nortokio::time::Instant
does this.Should we use the
result.sending_time
, or we should let the follower passthe time when it receives the replication request? The later seems more
accurate.
Tests
Looks like most unit tests (related to replication) use special human-made
log ID to emulate specific cases, should this time info be done similarly?
I guess no as it is not that important compared to the log ID.
This change is