repository - going away from transactions, log, refcounts? #7377
alternative implementation
We'll need a key/value store, of course. But maybe we could move away from transactions and LOG-like behaviour and instead use garbage collection (gc) to clean up anything that is not referenced (because it is no longer used, because all references to it got deleted, or because it was never used since something crashed before the action completed). We could delegate a lot of what borg used to do with its own code to the filesystem:
For speed reasons, the borg client could still have a chunks index, e.g.
Why/how we could live without transactions
borg create
borg delete / borg prune
borg gc (maybe as part of borg check?)
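To make the gc idea concrete, here is a minimal sketch, not borg's actual implementation. It assumes a hypothetical on-disk layout where each chunk is a file under `data/` named by its id, and each archive is a file under `archives/` listing one referenced chunk id per line. The function name `gc` and both directory names are made up for illustration.

```python
import os


def gc(repo_dir):
    """Delete every object under data/ that no archive references.

    Assumed (hypothetical) layout:
      repo_dir/archives/<name>  one referenced chunk id per line
      repo_dir/data/<chunk id>  the stored object
    """
    # Mark: collect every chunk id that any archive still references.
    referenced = set()
    archives_dir = os.path.join(repo_dir, "archives")
    for name in os.listdir(archives_dir):
        with open(os.path.join(archives_dir, name)) as f:
            referenced.update(line.strip() for line in f if line.strip())

    # Sweep: remove objects nobody references (deleted archives, crashed runs).
    data_dir = os.path.join(repo_dir, "data")
    removed = []
    for obj in sorted(os.listdir(data_dir)):
        if obj not in referenced:
            os.remove(os.path.join(data_dir, obj))
            removed.append(obj)
    return removed
```

The sketch does no locking, so a concurrent borg create that has written chunks but not yet written its archive would lose them. A real implementation would have to handle that case.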
advantages
disadvantages
refcounts discussion moved to #7379.
Extra consideration could be given to something like lmdb or git-style packs.
I think duplicacy uses this filesystem-based approach. They managed to implement the GC part to be lock-free too, relying on atomic properties of the filesystem in use.
Indeed. What's missing from duplicacy is the ability to FUSE-mount backups. There have been a couple of PRs, but apparently their performance is poor. It's the only reason I haven't been using it.
Updates: In my experimental branch, I implemented a new Repository class:
Hmm, thinking about whether we still need checkpoint archives...
borg 1.x rolled back the repo if there was no commit. To save progress in case borg create got interrupted, we therefore saved and committed checkpoint archives: the checkpoint archive referenced the chunks written so far, and an intermediate COMMIT in the repo made sure that a rollback did not roll back all progress.
borg2 (as in #8332) does not do repo transactions any more and only "cleans up" when the user invokes borg compact (which ideally is NOT run after each borg create, but less frequently). So we don't need checkpoint archives any more? Removing archive checkpointing would make the code quite a bit simpler.
Update: removed checkpoint archives and part files.
current state (current master branch, borg 1.x, borg 0.x, attic)
A borg repository is primarily a key/value store (with some aux functions).
The key is the chunk id (== MAC(plaintext)), the value is the compressed/encrypted/authenticated data.
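As an illustration of such a key/value store, here is a minimal in-memory sketch. The class name `ChunkStore` is hypothetical, HMAC-SHA256 is only an example choice of MAC, and the encryption/authentication step is left out (only compression is shown).

```python
import hashlib
import hmac
import zlib


class ChunkStore:
    """Toy in-memory key/value store keyed by a MAC of the plaintext."""

    def __init__(self, id_key: bytes):
        self.id_key = id_key  # secret key for the id MAC
        self.objects = {}     # chunk id -> stored value

    def chunk_id(self, plaintext: bytes) -> bytes:
        # id == MAC(plaintext); HMAC-SHA256 is an assumption for this sketch.
        return hmac.new(self.id_key, plaintext, hashlib.sha256).digest()

    def put(self, plaintext: bytes) -> bytes:
        cid = self.chunk_id(plaintext)
        if cid not in self.objects:
            # Identical plaintext yields an identical id, so it is stored once.
            # A real repo would also encrypt and authenticate the value.
            self.objects[cid] = zlib.compress(plaintext)
        return cid

    def get(self, cid: bytes) -> bytes:
        plaintext = zlib.decompress(self.objects[cid])
        # Recompute the id to check that the value matches its key.
        if not hmac.compare_digest(self.chunk_id(plaintext), cid):
            raise ValueError("chunk id mismatch")
        return plaintext
```

Because the id depends only on the plaintext and the key, storing the same chunk twice is a no-op, which is what makes deduplication fall out of the store's design.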
borg uses transactions and a LOG when writing to the repo:
LOG means that new stuff is always appended at the end of the last/current segment file. In general, old segment files are never modified in place.
borg compact defrags non-compact segment files:
advantages of this approach
disadvantages of this approach
id -> (segment, offset, flags)
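The append-only LOG and the index mapping above can be sketched as follows. This is a simplified illustration, not borg's segment format: the entry layout (`HEADER`), the class name `SegmentLog`, and the segment file naming are assumptions, and commits, flags, and checksums are left out.

```python
import os
import struct

# Hypothetical entry layout: 4-byte payload size, then a 32-byte id.
HEADER = struct.Struct("<I32s")


class SegmentLog:
    """Append-only segment files plus an in-memory index id -> (segment, offset)."""

    def __init__(self, path, max_segment_size=1024):
        self.path = path
        self.max_segment_size = max_segment_size
        self.segment = 0
        self.index = {}

    def _segment_file(self, n):
        return os.path.join(self.path, f"{n:08d}")

    def put(self, cid, data):
        name = self._segment_file(self.segment)
        # Start a new segment once the current one is full; old ones stay untouched.
        if os.path.exists(name) and os.path.getsize(name) >= self.max_segment_size:
            self.segment += 1
            name = self._segment_file(self.segment)
        with open(name, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # new entries always go at the end
            f.write(HEADER.pack(len(data), cid) + data)
        # A newer entry for the same id supersedes the old one, which becomes
        # dead space that only a compaction pass could reclaim.
        self.index[cid] = (self.segment, offset)

    def get(self, cid):
        segment, offset = self.index[cid]
        with open(self._segment_file(segment), "rb") as f:
            f.seek(offset)
            size, stored_id = HEADER.unpack(f.read(HEADER.size))
            if stored_id != cid:
                raise ValueError("index points at the wrong entry")
            return f.read(size)
```

Note that ids are expected to be exactly 32 bytes here. The sketch also shows why segment files become non-compact over time: superseded entries are never removed in place.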