Chunkyard is a fast backup application for Windows and Linux which stores files in a content addressable storage with support for dynamic chunking and encryption.
The FastCDC chunking algorithm is a C# port of the Rust crate fastcdc-rs.
I am building Chunkyard for myself. You might want to consider more sophisticated tools. Here’s a list of options.
You can install Chunkyard by:
- Downloading a release (no dependencies required)
- Installing it as a dotnet tool
- Cross platform support. Chunkyard is shipped as two binaries
chunkyard
(Linux) andchunkyard.exe
(Windows) and they work without having to install .NET on your computer - Favor simplicity and readability over features and elaborate performance tricks
- Strong symmetric encryption (AES Galois/Counter Mode using a 256 bit key)
- Ability to copy from/to other repositories
- Verifiable backups
- No third-party dependencies
- Key management
- Asymmetric encryption
- Compression
- Extended file system features such as OS specific flags or links
- Extended “version control” features such as branching or tagging
- Hiding chunk sizes to prevent CDC fingerprint attacks
- Concurrent operations on a single repository using more than one Chunkyard process (e.g. creating a new backup while garbage collecting unused data)
- Install the .NET SDK
- Run
dotnet build src
anddotnet test src
to build and test Chunkyard
- Optional: Create an annotated tag to define a new version number
- Run
./publish
to create Linux and Windows binaries in theartifacts
directory
Type chunkyard help
to see a list of all available commands. You can add
--help
to any command to get more information about what parameters it
expects.
Example:
# List all available commands
chunkyard
# Learn more about the store command
chunkyard store --help
# See which files chunkyard would backup
chunkyard store --repository 'MyBackup' --path 'Music' 'Pictures' 'Videos' --dry-run
# Store a backup
chunkyard store --repository 'MyBackup' --path 'Music' 'Pictures' 'Videos'
# Check if the latest backup is valid
chunkyard check --repository 'MyBackup'
# Restore parts of the latest backup
chunkyard restore --repository 'MyBackup' --directory '.' --include 'mp3$'
# Keep the latest four backups
chunkyard keep --repository 'MyBackup' --latest '4'
You can find examples of how I use Chunkyard in my dotfiles.
- Blob: Binary data (e.g. the content of a file) with some meta data
- Snapshot: A set of BlobReferences. It describes the current state of a set of Blobs at a specific point in time
- Repository: A store which Chunkyard uses to persist data
- Chunk: An encrypted piece of a Blob or a Snapshot
- Chunk ID: A hash address which can be used to retrieve Chunks
- BlobReference: Contains Chunk IDs and meta data which can be used to restore a Blob
- SnapshotReference: Contains Chunk IDs and meta data which can be used to restore a Snapshot
These classes contain the most important logic:
- IRepository.cs: Defines the underlying backup storage
- IBlobSystem.cs: Provides an abstraction to read and write Blobs
- SnapshotStore.cs: Chunks, encrypts, deduplicates and stores Blobs in an IRepository
- Take a set of files
- Split files into encrypted chunks, store them in a repository and return a list of BlobReferences
- Bundle all BlobReferences into a Snapshot, store this Snapshot as encrypted chunks and return a SnapshotReference
- Retrieve a Snapshot using a SnapshotReference
- Retrieve, decrypt and reassemble all files using their BlobReferences of the given Snapshot