Showing 2 changed files with 84 additions and 4 deletions.
@@ -1,2 +1,78 @@
* [readme 中文](./README_cn.md)
* [readme EN](./README.md)

## Overview
bdchecker (**B**ackup **D**ata Checker) is a tool for checking personal cold-backup data, helping you discover data corruption in time.

## Why use it
Imagine that we have some data that needs a cold backup: the raw market data of some financial market, compressed every day; electronic copies of classic movies owned by an individual; or keys that are rarely used all year round. Let's first list some storage options:

| Storage plan | Storage life span |
| ---- | ---- |
| SSD | several years to more than ten years |
| HDD | 10+ years |
| tape drive | 30+ years |
| punched paper | thousands of years |
| Carved in stone (Luo Ji raised his crutch above his head and shouted solemnly) | millions of years |

There is no doubt that if you have the financial resources to engrave the information in stone and store it properly, it should be very safe unless you are hit by a dual-vector foil attack; but for individuals, the cost of reading information back from stone is likely far greater than the value of the data we need to save.
When ease of reading and writing is taken into account, the hard disk is clearly the most convenient option. That convenience brings an additional requirement: we need to check regularly whether the data has been corrupted. This is the reason to use **bdchecker**.

## Install
* Use pip:
```
pip install bdchecker
```
* Or download it from the project's [Releases](https://github.com/MuggleWei/bdchecker/releases) and decompress it

## Usage
**bdchecker** includes 3 sub-commands:
* gen: scan the directory, traversing it recursively, generate hash information for all **new** files, and place it in the `.bdchecker.meta` folder.
* clean: scan the directory and remove deleted files from the hash information.
* check: scan the directory and find corrupted files (note that this operation calculates the hash value of all files, so it is more time-consuming).

### Example directory
Assume that we currently have the following directory structure:
```
data
├──── a.txt
├──── b.txt
└──── c
      ├──── c1.txt
      └──── c2.txt
```

### Command: gen
Generate hash information:
```
bdchecker gen -d data -v 1
```
* `-d`: the directory for which hash information is generated
* `-v`: verbose level

After the run completes, the console output includes: `dump meta info to data/.bdchecker.meta/sha256.csv`
When there are no new files in the directory, running the `gen` command again does not recompute any hash information.

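To make the "only new files are hashed" behaviour concrete, here is a minimal sketch of the idea in Python. It is not bdchecker's actual implementation: the function names `sha256_of` and `gen`, the in-memory `known` mapping, and the use of relative POSIX paths as keys are all assumptions made for illustration. Only the SHA-256 algorithm and the `.bdchecker.meta` folder name come from the text above.

```python
import hashlib
from pathlib import Path

# Folder name taken from the README; its internal layout is not assumed here.
META_DIR = ".bdchecker.meta"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gen(root: Path, known: dict) -> dict:
    """Return a copy of `known` extended with hashes for files not yet recorded.

    Hypothetical helper: `known` maps relative paths to hex digests.
    """
    result = dict(known)
    for path in sorted(root.rglob("*")):
        rel_parts = path.relative_to(root).parts
        # Skip directories and anything stored inside the metadata folder.
        if not path.is_file() or META_DIR in rel_parts:
            continue
        rel = path.relative_to(root).as_posix()
        if rel not in result:  # files already recorded are not hashed again
            result[rel] = sha256_of(path)
    return result
```

Calling `gen(Path("data"), {})` on the example directory would return one entry per file; calling it again with that result as `known` would hash nothing, which mirrors the behaviour described for repeated `gen` runs.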
### Command: clean
Remove `data/c/c2.txt`, then run:
```
bdchecker clean -d data -v 1
```
The last few lines of the log include `clean missing file's meta info: c/c2.txt`, which means that the hash information for that file has been cleaned successfully.

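Conceptually, cleaning amounts to dropping every recorded entry whose file no longer exists. The sketch below illustrates that idea only; the function name `clean` and the `known` mapping of relative paths to digests are hypothetical and are not taken from bdchecker's source.

```python
from pathlib import Path


def clean(root: Path, known: dict) -> tuple:
    """Split `known` into entries whose file still exists and paths that vanished.

    Hypothetical helper: returns (kept_entries, sorted_removed_paths).
    """
    kept = {rel: digest for rel, digest in known.items() if (root / rel).is_file()}
    removed = sorted(set(known) - set(kept))
    return kept, removed
```

With the example directory, removing `c/c2.txt` and calling `clean` would leave `c/c2.txt` as the only entry in the removed list.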
### Command: check
Run:
```
bdchecker check -d data -v 1
```
The last line of the log reads `all check pass`, which means that there are no new or deleted files and that no file is corrupted.

Now, let's modify `a.txt` by writing some random content into it, and then run the command again:
```
bdchecker check -d data -v 1
```
This time, an error message appears in the log: `check failed: a.txt, old hash: ..., cur hash: ...`, indicating that the content of `a.txt` has changed.

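The corruption check boils down to recomputing each file's hash and comparing it with the recorded one. The following sketch shows only that comparison, under stated assumptions: the function name `check` and the `known` mapping are hypothetical, and detection of new or deleted files, which the real command also reports, is left out.

```python
import hashlib
from pathlib import Path


def check(root: Path, known: dict) -> list:
    """Return sorted relative paths whose current SHA-256 differs from the recorded one.

    Hypothetical helper: `known` maps relative paths to hex digests.
    """
    failed = []
    for rel, old_hash in known.items():
        path = root / rel
        if not path.is_file():
            continue  # missing files are outside the scope of this sketch
        # Reads the whole file at once for brevity; chunked reads suit large files better.
        cur_hash = hashlib.sha256(path.read_bytes()).hexdigest()
        if cur_hash != old_hash:
            failed.append(rel)
    return sorted(failed)
```

An empty result corresponds to the `all check pass` case; after modifying `a.txt`, the result would contain `a.txt`. Because every recorded file is read in full, the cost grows with the total size of the backup, which is why this step is the slow one.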
### Migration and comparison
The hash information generated by `bdchecker` is saved in the `.bdchecker.meta` folder inside the directory, so you can migrate the entire folder directly.
When you already have multiple backup copies for which no hash values have been generated, you can use the `bdchecker gen` command to generate hash values for each copy and then compare the resulting files. Since the lines of the generated files are already sorted, you can compare them directly with commands such as `diff`.
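
If `diff` is not available, the same comparison can be sketched with Python's standard `difflib` module. The row format used below (`path,hash`) is a hypothetical stand-in, since the actual column layout of `sha256.csv` is not described here; `diff_listings`, `backup_a`, and `backup_b` are likewise illustrative names.

```python
import difflib


def diff_listings(left: list, right: list) -> list:
    """Return unified-diff lines between two sorted hash listings.

    Each listing is a list of text rows; an empty result means they match.
    """
    return list(
        difflib.unified_diff(
            left, right, fromfile="backup_a", tofile="backup_b", lineterm=""
        )
    )
```

Rows that differ appear prefixed with `-` (first copy) and `+` (second copy), so a changed hash for the same path shows up as an adjacent pair.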