-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
encryption: support renaming encrypted directory #14095
Conversation
[REVIEW NOTIFICATION] This pull request has not been approved. To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
Still a draft, opinions welcome. cc @BusyJay @tonyxuqqi |
5a89211
to
c0d3818
Compare
self.dir_dict | ||
.lock() | ||
.unwrap() | ||
.get((dir_id, relative_path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(dir_id, relative_path) does not match dir_dict's definition HashMap<u64, ..>
@@ -347,6 +404,29 @@ impl Dicts { | |||
}; | |||
self.rotate_key(key_id, data_key, master_key) | |||
} | |||
|
|||
fn rename_dir(&self, src_name: &str, dst_name: &str) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the dir_dict needs to be updated as well.
For example, when a snapshot is applied, the tablet is renamed from tablet_snapshot/123_5_tmp to tablets/123_5.
All the keys in the dir_dict contains the prefix 123_5_tmp.
Now after the rename, we're still keeping the 123_5_tmp in the dir_dict.
But when the caller try to decrypt a SST under 123_5, the path will be 123_5/xxx.sst and it won't find the match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because "tablet_snapshot/123_5_tmp" is a first level directory under the whitelist, it will be encoded using dir ID. So the data keys are all stored using (dir_id, xx.sst). This way the dir renaming only needs to change the dir_id mapping.
Signed-off-by: tabokie <[email protected]>
c0d3818
to
c3eb782
Compare
// makes it possible to rename directory without modifying records of every file. | ||
// | ||
// The reasons for having a whitelist instead of treating all directories this way: | ||
// (1) keep v1 data keys file backward compatible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to consider compatibility for now (may not even in the future). Simple is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about v1 data, because v1 and v2 share the implementation of DataKeyManager
. To implement a completely new DataKeyManagerV2
is not economical because not all files should be encoded with dir id (for the reason (2)), they should fallback to the old format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because not all files should be encoded with dir id
Can you give some examples that will not be encoded with dir id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Local backup export file to an arbitrary user-specified folder; nested folder such as titandb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tabokie DataKeyManager is referenced in many codes , so I agree it's better to keep DataKeyManager as an entry struct for encryption/decryption. However, I also think we can simply copy all the current implementation for v1, just wrap under DataKeyManager. And the new implementation can only take effect on v2. It may simplify the logic.
Otherwise it may introduce regression.
Maybe in the future once the new implementation is battle proved, then we unify the two implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tonyxuqqi Missed your comment yesterday. Implementing two managers looks really bad, you have to trait them because there're so many places use them. But I think it's doable to separate the v2 functionality into a new plugin struct, i.e. if the whitelist is empty (for v1), the plugin is None and we don't create the new key file etc.
…rypt-rename Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
enum LogRecord { | ||
Insert(FileInfo), | ||
Remove, | ||
pub trait DictionaryItem: Sized + Clone { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add comments for 'DictionaryItem' and 'ProtobufDictionary'.
@@ -32,6 +32,7 @@ pub fn new_file_security_config(dir: &tempfile::TempDir) -> EncryptionConfig { | |||
file_dictionary_rewrite_threshold: 100000, | |||
master_key: master_key_cfg.clone(), | |||
previous_master_key: master_key_cfg, | |||
v2_directory_whitelist: Vec::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if there's one v2's folder missing from this list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) The keys cannot be found, and (2) this directory cannot be renamed (the rename will raise an error). (1) shouldn't really happen unless you change the whitelist half way. The plan is to hardcode it and forbid configuration.
…rypt-rename Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
How about making rename as a call of |
How? In specific how are |
Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
Signed-off-by: tabokie <[email protected]>
/release |
@tabokie: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…h importing data keys (#14583) ref #12842, ref #14095, ref #14097 support renaming encrypted dir (inefficiently) and batch importing data keys Signed-off-by: tabokie <[email protected]>
…h importing data keys (tikv#14583) ref tikv#12842, ref tikv#14095, ref tikv#14097 support renaming encrypted dir (inefficiently) and batch importing data keys Signed-off-by: tabokie <[email protected]>
…h importing data keys (tikv#14583) ref tikv#12842, ref tikv#14095, ref tikv#14097 support renaming encrypted dir (inefficiently) and batch importing data keys Signed-off-by: tabokie <[email protected]> Signed-off-by: CalvinNeo <[email protected]>
…h importing data keys (tikv#14583) ref tikv#12842, ref tikv#14095, ref tikv#14097 support renaming encrypted dir (inefficiently) and batch importing data keys Signed-off-by: tabokie <[email protected]> Signed-off-by: lidezhu <[email protected]>
/close |
@tonyxuqqi: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Signed-off-by: tabokie [email protected]
What is changed and how it works?
Issue Number: Ref #12842
What's Changed:
Introduce a new format to store data keys for encrypted files. The old format stores mapping from file path to encryption key. The new format stores two mappings: directory path to directory ID, (dir ID + relative file path) to encryption key.
Only file path matching certain patterns will be encoded using the new format. So the change should be backward compatible for v1 architecture.
Compatibility with RocksDB:
(1) RocksDB uses
Env::RenameFile
to rename directory. We usesLinkFile
+DeleteFile
under the hood for atomicity. To be compatible,link_file
anddelete_file
needs to detect if the path is a file or directory.(2) We don't manage
Env::CreateDir
call from RocksDB. Instead the directory ID is only allocated when a file is created under the dir.(3) We don't manage
Env::DeleteDir
call from RocksDB. Needs to callKeyManager::delete_dir
manually.Other options:
Related changes
pingcap/kvproto#1098
Check List
Tests
Release note