-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor snapshot #896
Refactor snapshot #896
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for sharing this draft design; it provides a clear picture.
The traits SnapshotManifest
and SnapshotData
seem to exhibit similar or
overlapping functionalities. IMHO, merging these two into a
single trait could streamline the code and eliminate redundancy.
Additionally, RaftTypeConfig
appears to include three unnecessary type
definitions. A more efficient approach could be to associate these types with
the SnapshotData
trait instead:
// Remove extraneous type declarations from RaftTypeConfig
trait RaftTypeConfig {
// Other types...
type SnapshotData: SnapshotData;
}
trait SnapshotData {
// Other associated types...
type SnapshotChunkId;
type SnapshotChunk;
type SnapshotManifest;
}
Opinions?
Reviewed 1 of 17 files at r1.
Reviewable status: 1 of 17 files reviewed, all discussions resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinions?
I'd also prefer to have associated types with SnapshotData
to prevent type explosion in RaftTypeConfig
:-).
Regarding manifest vs. data types, I don't know whether it's better to have them same or different types.
Reviewed 13 of 17 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @zach-schoenberger)
openraft/src/raft_types.rs
line 1 at r2 (raw file):
use std::fmt::Debug;
Seems unused
openraft/src/type_config.rs
line 60 at r2 (raw file):
type Entry: RaftEntry<Self::NodeId, Self::Node> + FromAppData<Self::D>; type SnapshotChunkId: Eq + PartialEq + Send + Sync + Display + Debug + OptionalSerde + 'static;
As suggested above, use these as associated types from SnapshotData
...
openraft/src/type_config.rs
line 69 at r2 (raw file):
/// See the [storage chapter of the guide](https://datafuselabs.github.io/openraft/getting-started.html#implement-raftstorage) /// for details on where and how this is used. type SnapshotData: SnapshotData<ChunkId = Self::SnapshotChunkId, Manifest = Self::SnapshotManifest, Chunk = Self::SnapshotChunk>
...then you also don't need to explicitly reference them here.
Instead, you'd reference C::SnapshotData::ChunkId
and the like elsewhere (associated types are already introduced, so why not use them directly?).
openraft/src/core/snapshot_state.rs
line 8 at r2 (raw file):
#[derive(Debug, Clone)] #[derive(PartialEq, Eq)] pub(crate) struct SnapshotRequestId<NID: NodeId, C> {
C
is a bit non-descriptive and normally used for type config. I suggest to use the same name (or at least a more descriptive one) for generics as in the associated type, i.e., ChunkId
.
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
None => { tracing::error!("should never happen"); false
Better error handling?
It should probably return an unexpected
error or so. Also the message is too non-descriptive.
Code quote:
tracing::error!("should never happen");
false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @schreter)
openraft/src/raft_types.rs
line 1 at r2 (raw file):
Previously, schreter wrote…
Seems unused
It wasn't showing as an unused import, but will double check.
openraft/src/type_config.rs
line 69 at r2 (raw file):
Previously, schreter wrote…
...then you also don't need to explicitly reference them here.
Instead, you'd reference
C::SnapshotData::ChunkId
and the like elsewhere (associated types are already introduced, so why not use them directly?).
Yep. I like the idea. I'll update this setup.
openraft/src/core/snapshot_state.rs
line 8 at r2 (raw file):
Previously, schreter wrote…
C
is a bit non-descriptive and normally used for type config. I suggest to use the same name (or at least a more descriptive one) for generics as in the associated type, i.e.,ChunkId
.
Good point. It should really be the RaftTypeConfig and replace both generics.
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, schreter wrote…
Better error handling?
It should probably return an
unexpected
error or so. Also the message is too non-descriptive.
This was just a place holder. The error system is pretty rigid so I hadn't had a chance to figure out what should be used. I'll look into what makes sense..
@drmingdrmer I originally tried to keep the This setup views the manifest as a light weight entity that contains only the minimal amount of data to know if a snapshot is complete or not. If we combined the traits, it would be the same as combining these structs:
So what happens is that the Of course I could be completely missing something. So any ideas on how to address that issue are welcome :) |
I agree. And I found that there is no need to introduce The only thing an application needs is pub trait SnapshotData: Send + Sync {
type Chunk: SnapshotChunk;
type ChunkIter: Iterator<Item=Self::Chunk>;
// For leader to get all chunks to send
async fn chunk_iter(&self) -> Result<Self::ChunkIter, AnyError>;
// For follower to receive a chunk and return `true` when finished.
async fn receive(&mut self, c: Self::Chunk) -> Result<bool, AnyError>;
} The application uses an enum of enum AppChunk {
Data(D),
Manifest(M),
} Make sense? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense?
From my PoV, that should be sufficient.
The only issue I see that this needs an async iterator for ChunkIter
, not a synchronous one - you don't want to force reading all the data needed for the snapshot at the time of chunk_iter()
call (OTOH, the Chunk
could be also just some metadata object which will read the data at the time of sending them, which is async).
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @zach-schoenberger)
The suggestion above is essentially what I did the first time when using a Of course I would still love to make this super simple and put all this logic on the API user. Just have the network interface have a simple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we just return the iterator, then it really complicates the retry.
You are right, retry is an important topic.
Of course I would still love to make this super simple
Mhm... What about initializing the iterator with chunk ID? I.e., something on these lines (not syntactically correct):
pub trait SnapshotData: Send + Sync {
type Chunk: SnapshotChunk;
type ChunkIter: Iterator<Item=impl Future<Output = Self::Chunk>>;
async fn chunks_to_send(&self, restart_marker: Option<Self::Chunk::ChunkId>) -> Result<Self::ChunkIter, AnyError>;
async fn receive_chunk(&mut self, c: Self::Chunk) -> Result<bool, AnyError>;
}
where
trait SnapshotChunk {
type ChunkId: Sized + Send;
fn id(&self) -> Self::ChunkId;
}
When the sending fails, you can request all chunks starting from the chunk ID of the last chunk ID confirmed by the receiver.
BTW, I don't particularly like having chunk receive on SnapshotData
. I'd suggest to move it to a dedicated SnapshotReceiver
trait and make it convertible into SnapshotData
after all chunks are received. I.e., you'll start receiving into a new SnapshotReceiver
and convert to (immutable) SnapshotData
afterwards, keeping SnapshotData
immutable at all times.
Otherwise, what would happen if by mistake someone calls SnapshotData::receive_chunk
?
Opinions?
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @zach-schoenberger)
@zach-schoenberger @schreter I was expecting a // Pseudo code for the leader to send snapshot.
fn leader_sending_snapshot() {
let snapshot = self.get_current_snapshot()
let manifest: MyManifest = snapshot.manifest();
// Use a Manifest for tracking progress.
// The leader expects the receiver to respond with its already received Manifest.
let received: MyManifest = self.send_install_snapshot(InstallSnapshotRequest::Manifest(manifest.clone())).await?;
// When retrying sending chunks, pass in a partially finished Manifest:
let chunks = manifest.chunks_to_send(received.clone());
for c in chunks {
self.send_install_snapshot(InstallSnapshotRequest::Chunk(c)).await?;
// Track the progress if needed
// received.add(c.chunk_id()).await;
}
}
|
OK for me. |
@drmingdrmer Awesome, that's exactly how I implemented the manifest in the PR. I'm not seeing any differences API wise above from the PR besides adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 17 files at r1, 1 of 7 files at r3, all commit messages.
Reviewable status: 12 of 18 files reviewed, 6 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/streaming_state.rs
line 24 at r3 (raw file):
pub(crate) streaming_data: Box<RTCSnapshotData<C>>, pub(crate) manifest: RTCSnapshotManifest<C>,
Is there an assumption that Manifest
can be retrieved from SnapshotData
?
IMHO, Manifest
is part of SnapshotData
.
If it is, there is no need to store a separate Manifest
.
Code quote:
pub(crate) streaming_data: Box<RTCSnapshotData<C>>,
pub(crate) manifest: RTCSnapshotManifest<C>,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r3, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @zach-schoenberger)
openraft/src/type_config.rs
line 56 at r3 (raw file):
type Entry: RaftEntry<Self::NodeId, Self::Node> + FromAppData<Self::D>; // type SnapshotChunkId: Eq + PartialEq + Send + Sync + Display + Debug + OptionalSerde + 'static;
Don't forget to remove commented-out code in the final change.
openraft/src/type_config.rs
line 75 at r3 (raw file):
pub type RTCSnapshotChunkId<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::ChunkId; pub type RTCSnapshotChunk<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::Chunk; pub type RTCSnapshotManifest<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::Manifest;
Are these aliases relevant also for the implementor of the code? Probably less so.
Maybe make crate-private?
If they indeed make sense for public use (which I doubt), then at least they need appropriate doc comments.
BTW, you used these aliases in a few API functions. I'd suggest to use the full type instead, since it will be confusing to the user, I think. The full type is clear.
Suggestion:
pub(crate) type RTCSnapshotData<C> = <C as RaftTypeConfig>::SnapshotData;
pub(crate) type RTCSnapshotChunkId<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::ChunkId;
pub(crate) type RTCSnapshotChunk<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::Chunk;
pub(crate) type RTCSnapshotManifest<C> = <<C as RaftTypeConfig>::SnapshotData as SnapshotData>::Manifest;
openraft/src/raft/message/client_write.rs
line 3 at r3 (raw file):
use std::fmt::Debug; #[cfg(feature = "serde")] use crate::AppDataResponse;
This change doesn't seem right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @schreter)
openraft/src/type_config.rs
line 75 at r3 (raw file):
Previously, schreter wrote…
Are these aliases relevant also for the implementor of the code? Probably less so.
Maybe make crate-private?
If they indeed make sense for public use (which I doubt), then at least they need appropriate doc comments.
BTW, you used these aliases in a few API functions. I'd suggest to use the full type instead, since it will be confusing to the user, I think. The full type is clear.
You are right, they likely do not need to be exposed outside of the crate. Although i would argue it is a lot easier to read/understand what the API functions are receiving using them than the expanded form. The full type is just extremely verbose.
openraft/src/raft/message/client_write.rs
line 3 at r3 (raw file):
Previously, schreter wrote…
This change doesn't seem right.
when looking at the rest of the code, this import is only used when serde
is configured. it generates an unused import warning otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zach-schoenberger You might want to close comments you don't want to address and address open comments you want to address. GitHub doesn't give you the visibility - I'd suggest you use Reviewable to see immediately what's still open.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer and @zach-schoenberger)
openraft/src/type_config.rs
line 75 at r3 (raw file):
Previously, zach-schoenberger wrote…
You are right, they likely do not need to be exposed outside of the crate. Although i would argue it is a lot easier to read/understand what the API functions are receiving using them than the expanded form. The full type is just extremely verbose.
Well, in the end, up to you. I'd still go with verbose type for APIs :-).
openraft/src/core/streaming_state.rs
line 24 at r3 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Is there an assumption that
Manifest
can be retrieved fromSnapshotData
?
IMHO,Manifest
is part ofSnapshotData
.If it is, there is no need to store a separate
Manifest
.
I don't know whether this is an almost-done PR, but if it is, it would be nice to have also doc comments on newly-added items explaining their purpose (and fix comments on the updated ones - like above on snapshot_meta
).
openraft/src/raft/message/client_write.rs
line 3 at r3 (raw file):
Previously, zach-schoenberger wrote…
when looking at the rest of the code, this import is only used when
serde
is configured. it generates an unused import warning otherwise.
Ah, OK, that's another story.
But, a) it's unrelated, so you might want to push a separate single-line PR and b) the formatting is likely off (I suppose, the use
statement should be on a separate line - but not 100% sure about current rustfmt
settings). Do you have rustfmt
on save activated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/raft/message/client_write.rs
line 3 at r3 (raw file):
Previously, schreter wrote…
Ah, OK, that's another story.
But, a) it's unrelated, so you might want to push a separate single-line PR and b) the formatting is likely off (I suppose, the
use
statement should be on a separate line - but not 100% sure about currentrustfmt
settings). Do you haverustfmt
on save activated?
Because I added inline_attribute_width = 80
to rustfmt.toml
. In order to reduce number of lines. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @drmingdrmer and @zach-schoenberger)
openraft/src/raft/message/client_write.rs
line 3 at r3 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Because I added
inline_attribute_width = 80
torustfmt.toml
. In order to reduce number of lines. 🤔
OK, it just looks strange to me, since I've never seen it like that in another project 🙂. I wouldn't be worried about number of lines, since whether it's space or linefeed, the compiler has exactly the same work to do. The human, however, has somewhat less cognitive load if it's split. But, it's up to you 🙂.
Since there are a lot of tests and example crates in this repo that use |
Yes, it would be great if the changes required to an application is minimized. 👍 And from my perspective, the |
I wouldn't want to do it for |
Any update on this one? |
@zach-schoenberger |
Sorry just saw this. I got pretty busy and didn't have a chance to work on this. I'll try to finish up now. |
Hi @drmingdrmer, i figure it might be quicker to just ask first about a test failure I'm seeing. One of the tests for snapshot install is failing from what looks like a race condition between the assert and the snapshot installation. Here are the logs:
I'm not sure why the changes would cause the snapshot install to take longer... or if it some other aspect of how the test is setup to look for the changes. If it's something that jumps out at you please let me know, otherwise I'll keep poking around to see what I can find. EDIT: |
@zach-schoenberger 👉🏻1 Update in-memory state when installing snapshot: 👉🏻2 Metrics is updated to the in-memory state. At this time point, the install-snapshot command may not have reached The solution should be split the snapshot state into two: the expected state when all io-commands are executed and the actual state of the Let me fix this first |
002d6fa
to
4fd6452
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 33 files reviewed, 3 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/type_config.rs
line 75 at r3 (raw file):
Previously, zach-schoenberger wrote…
I've used other crates that follow this kind of pattern to make it easier to use the crate.
rkyv
being the main one I can think of now. Otherwise users will generally get it wrong at first and then just guess and check what the compiler wants (at least thats what even I had to do when making these changes). I find them a pretty big quality of life improvement. But I'll use whatever you and @drmingdrmer would prefer.
Yes these aliases are life changer. I use such aliases in my repo too. And my alias does not have type parameter C
because in my repo C
is fixed.
Therefore I'd prefer having them private too. Application developers may have their own simpler way defining alias.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you man!
I have thoroughly reviewed the changes and overall, they are quite good.
However, before we go into a detailed discussion, it would be beneficial to first resolve the ongoing discussions that were started during previous reviews.
Reviewed 3 of 15 files at r4, 3 of 12 files at r6, 5 of 11 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: 18 of 33 files reviewed, 3 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
This was just a place holder. The error system is pretty rigid so I hadn't had a chance to figure out what should be used. I'll look into what makes sense..
If it receives a chunk before receiving a Manifest, it is a protocol error, in which it should notify the sender about this error. And in order to receive such an error, in the receive-chunk response: sm::Response::ReceiveSnapshotChunk(meta)
, meta
should be a Result<Option<_>, E>
.
Previously, schreter wrote…
I'm really not sure what approach makes the most sense here. It seems really cumbersome to have the user impl return the snapshotdata for population that would also then have its own manifest (since manifest is part of snapshotdata), that we have to overwrite with the one sent over the wire. Also if manifest were a part of the snapshot, it would make more sense to let the calls to snapshotdata::receive call the manifest::receive but that lives in user land. I could see people not reading the docs and using it incorrectly. If you really want to combine them I can, just for my mental model I like having them separate. |
Previously, drmingdrmer (张炎泼) wrote…
These are private (like in the suggestion) in the current state of the PR. |
Previously, drmingdrmer (张炎泼) wrote…
Sorry, I'm not sure I follow. The return type is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/streaming_state.rs
line 24 at r3 (raw file):
Previously, zach-schoenberger wrote…
I'm really not sure what approach makes the most sense here. It seems really cumbersome to have the user impl return the snapshotdata for population that would also then have its own manifest (since manifest is part of snapshotdata), that we have to overwrite with the one sent over the wire.
Also if manifest were a part of the snapshot, it would make more sense to let the calls to snapshotdata::receive call the manifest::receive but that lives in user land. I could see people not reading the docs and using it incorrectly. If you really want to combine them I can, just for my mental model I like having them separate.
As far as I see in your design, if Manifest
and SnapshotData
are separated, when all chunks are received and it is about to install the snapshot, Openraft must pass the Manifest
to RaftStateMachine::install_snapshot()
.
Correctness relying on user implementation to call Manifest::receive()
in SnapshotData::receive()
is unacceptable to me either.
If there is an example implementation of the new snapshot API, it would be more clear.
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
Sorry, I'm not sure I follow. The return type is
Result<Option<SnapshotMeta<C::NodeId, C::Node>>, StorageError<C::NodeId>>
. None of theStorageError
Violation
options available make much sense for this state (assuming using a defensive error makes sense). Should I be adding a new one?
My apology. It should have already checked whether a manifest is already received, in FollowerHandler::receive_snapshot_chunk
.
Then if self.streaming
is None, it's an severe bug, and a panic()
would be more appropriate than just an error logging.
Previously, drmingdrmer (张炎泼) wrote…
Sorry I think this was an old comment thread. I don’t have any updates in mind. Sorry for the confusion. |
Previously, drmingdrmer (张炎泼) wrote…
Awesome. Panic makes sense to me. |
Previously, zach-schoenberger wrote…
I updated and ran locally having this panic, but it broke the testing setup in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
I updated and ran locally having this panic, but it broke the testing setup in
t10_api_install_snapshot::snapshot_argument
. This test seems to be testing exactly this scenario: sending chunks before a manifest. I think it's still a valid test, but not sure how we would want to structure it with a panic.
Receiving a chunk before manifest is a protocol error and the client(leader) should receive an error response.
Before this PR, such checking is done in Engine
, i.e., by FollowingHandler::receive_snapshot_chunk()
.
At the IO level, i.e., in this method: sm::Worker::receive_snapshot_chunk()
, it did not handle protocol error.
I think there are 2 options for this situation.
- You can just put the manifest to the engine level and let the FollowingHandler handle such error.
- Or just let this method to be able to return a protocol error.
But before this, I still don't know if such a design is suitable to implement a snapshot data that has the leveldb structure.
Previously, drmingdrmer (张炎泼) wrote…
Which aspect of the design? I'm definitely planning to use this for rocksdb. My high level plan is to have the manifest be the list of a checkpoing/backup's files and then each chunk just be a file name which triggers the follower to download the file from the snapshot provider. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
Which aspect of the design? I'm definitely planning to use this for rocksdb. My high level plan is to have the manifest be the list of a checkpoing/backup's files and then each chunk just be a file name which triggers the follower to download the file from the snapshot provider.
If you are using Rocksdb checkpoint files, is there a manifest file stored on disk which maintains the list of checkpoint files?
- If such a file exists, in this design, there is no step that presist the manifest file to disk.
- If no such a file exists, ths snapshot can not be consistently reconstructed in the presence of dirty files, such as unfinished downloaded temporary files.
If the manifest is part of the snapshot data. It's necessary to incorporate a
step that allows the application to persist manifest.
On the other hand, the application doesn't have a snapshot manifest,
this would constitute a "no manifest" design, implying that the manifest is
constructed on the fly using snapshot data.
Previously, drmingdrmer (张炎泼) wrote…
The manifest is not part of the snapshot data here. The "on the fly" would be closer. Checkpoints hard link or copy the files to create a full representation of the state at time of snapshot. Either that succeeds at which point you have a directory and list of files, or it doesn't and thats an error state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
- no such file exists. there is no need for there to be one
- the assumption it cannot be consistently reconstructed is not accurate. files in the directory are all that is needed. the manifest aka list of files can be generated cheaply on the fly.
The manifest is not part of the snapshot data here. The "on the fly" would be closer. Checkpoints hard link or copy the files to create a full representation of the state at time of snapshot. Either that succeeds at which point you have a directory and list of files, or it doesn't and thats an error state.
In the context you've provided, a snapshot is essentially a collection of data files within the directory.
Snapshot gets updated when logs are applied to the state machine, which involves the addition of new files into this directory.
However, without a manifest file, it outlines the file list of the snapshot. It's impossible to perform an atomic update.
Consider a situation where you are adding to files. If the system crashes after
the first file is added and before the second one gets added, then snapshot
directory will end up in an inconsistent state.
Previously, drmingdrmer (张炎泼) wrote…
Sorry, but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
Sorry, but
Snapshot gets updated when logs are applied to the state machine, which involves the addition of new files into this
doesn't make any sense. That's not what a snapshot is by definition. Snapshots should be immutable. The statemachine can be replaced/overwritten with a snapshot. The statemachine can have logs applied to it. But that is not the purpose of a snapshot. The situation that you bring up should never be the case because of this.
I mean, when new logs are applied, and by some policy, a new snapshot should be build from the latest state machine.
Snapshot is immutable but it can be built upon a previous snapshot.
Previously, drmingdrmer (张炎泼) wrote…
I view this
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 33 files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, zach-schoenberger wrote…
I view this
Snapshot is immutable but it can be built upon a previous snapshot.
as:
- follower receives snapshot
- follower applies snapshot to statemachine (in most cases replacing current statemachine)
- now there are 2 distinct sets of data - statemachine and snapshot
- new logs applied to statemachine
Is this not how the raft system views these entities?
When a follower receives a snapshot, it should replace both its state machine and its local snapshot with the received one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zach-schoenberger Thanks for the good work and sorry for the delay in reviewing. I currently don't have any further comments except the one below, so from my POV it's good to go. We'll see later, how it works in real life.
Reviewed 5 of 15 files at r4, 11 of 12 files at r6, 10 of 11 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @drmingdrmer and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
When a follower receives a snapshot, it should replace both its state machine and its local snapshot with the received one.
On this one, I still don't know what's the best course of actions. Probably just fail installing the snapshot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drmingdrmer You are blocking the change. What is still open from your POV? I'd like to try to integrate it with our codebase "soon" (in a week or two, hopefully), so it would be nice to have the change in.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @drmingdrmer and @zach-schoenberger)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm know, a typical design for snapshot implementation mirrors that of leveldb, which involves several data files on the disk along with a manifest
file. Modifications to the database occur through the addition of new data files followed by an atomic update of the manifest
.
However, in this pull request, there is no provided API that allows the application to modify the manifest
. It operates under the assumption that the manifest
is transient and does not need to be stored persistently.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @schreter and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
Probably just fail installing the snapshot?
@schreter
I do not get it. Why does it have to fail? When a snapshot is received, replacing state machine with the snapshot is just the course of action.
- now there are 2 distinct sets of data - statemachine and snapshot
- new logs applied to statemachine
Is this not how the raft system views these entities?
@zach-schoenberger
Yes they are different, but the data in state machine is always a superset of the data snapshot: snapshot ⊆ state_machine
.
@zach-schoenberger I will continue working on this based on your PR, in another PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @drmingdrmer and @zach-schoenberger)
openraft/src/core/sm/mod.rs
line 285 at r2 (raw file):
I do not get it. Why does it have to fail? When a snapshot is received, replacing state machine with the snapshot is just the course of action.
Sorry for the late reply. What I have in mind, what should happen, for example, on resource shortage (memory, disk, etc.). But from the other discussion I take that this will simply result in the I/O error propagating to the raft core and shutting down the engine.
Closing this PR for this issue being resolved in:
Openraft does not need to understand Manifest or participate in the receiving process. These parts of job are application specific. |
Checklist
This change is