This repository has been archived by the owner on Jun 7, 2022. It is now read-only.
Darren Ldl edited this page Apr 13, 2019 · 7 revisions

Can this replace parchive2?

No. Parchive2 is much more powerful: its parity files can repair blocks anywhere in the data files rather than just blocks in the same block set. With EC-SeqBox, if the number of damaged blocks in a block set is greater than the number of parity blocks, there is no way to recover that block set. The burst error resistance scheme alleviates this slightly by decreasing the chance of a burst error damaging an entire block set, but it is still nowhere near what parchive2 is capable of.

However, blkar is much simpler to use in most cases as only one archive is generated - you don't have to store the parity files separately.
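To make the block set limit described above concrete, here is a minimal sketch in Python. It is not blkar's actual code or on-disk layout: the function name `recoverable`, the `interleave` parameter, and the simple round-robin interleaving are all assumptions made for illustration, and metadata blocks are ignored. It only checks the rule stated above, namely that a block set is lost once it has more damaged blocks than parity blocks.

```python
from collections import Counter


def recoverable(damaged, data, parity, interleave=1):
    """Return True if no block set has more than `parity` damaged blocks.

    Hypothetical layout (an assumption, not blkar's real one): each block
    set holds `data + parity` blocks, and `interleave` block sets are
    written round-robin, so neighbouring blocks on the medium belong to
    different block sets. interleave=1 means no interleaving.
    """
    if interleave < 1:
        raise ValueError("interleave must be at least 1")
    set_size = data + parity
    span = set_size * interleave  # blocks covered by one interleaved group
    damaged_per_set = Counter()
    for pos in damaged:
        group, offset = divmod(pos, span)
        set_id = group * interleave + offset % interleave
        damaged_per_set[set_id] += 1
    return all(count <= parity for count in damaged_per_set.values())


# A burst of 3 consecutive damaged blocks, 10 data + 2 parity blocks per set
burst = range(3)
print(recoverable(burst, data=10, parity=2))                # False
print(recoverable(burst, data=10, parity=2, interleave=3))  # True
```

Without interleaving, the burst lands in a single block set and exceeds its two parity blocks. With three block sets interleaved, each set takes only one hit. A long enough burst still defeats the scheme, which is why this is an alleviation rather than a replacement for what parchive2 does.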

Should I use this or parchive2?

It depends on whether you care more about sector level data recovery or error correction. EC-SeqBox gives you some level of error correction capability, but still nowhere near what parchive2 can provide (see above). The strength of SeqBox (and EC-SeqBox) lies in its simplicity (encoding and decoding operations are fast), and it is designed with sector level data recovery in mind.

With a sufficiently high parity shard count and burst error resistance level, blkar might suffice in the specific scenario you have in mind. But you should plan carefully regardless of the tool you pick.

I have something really important, what should I use?

Ideally you would use both, as parchive2 does not facilitate sector level data recovery, but the SBX format does. By using parchive2 and plain SeqBox, you get both easy sector level recovery and powerful error correction capability for your file. See extremely resilient archive for more info on this.

When should I use EC-SeqBox then?

Personally I added EC-SeqBox since I wanted a single archive with sector level recovery capability as well as error correction, while keeping the scheme relatively simple. Parchive2 requires storing multiple files (data file + some parity files) which always annoyed me slightly.

Ultimately it depends on what error pattern you're expecting on your storage/transfer medium. If it's a network transfer, then par2 is probably better. If it's stored on disk, then you'll have to evaluate what kind of disk damage you want to guard against. If it's a CD/DVD, software specialised in that area (e.g. dvdrescue) is probably better, and so on.

You said this project prioritises security/robustness above all else. What do you mean specifically?

A lot of effort has been put in to ensure things work and are done right. The following items illustrate the point:

  • Code refactoring is done throughout development, so the code should be fairly readable. There are some unfortunate messes in places to deal with exceptional cases or to smooth user interaction, but ideally they should not impede your understanding of the core code.

  • A fairly heavy test suite is in place. If you have a combination of options in mind which you think might break blkar, do give it a try, though it's likely already covered in the test suite (do open an issue if you find breakage). New tests are added whenever a new feature is added, and each updated version of blkar is tested for regressions using the test suite as well.

  • Bugs are dealt with seriously. When a bug is discovered, the impact of the bug is investigated to sufficient depth and documented in the changelog, allowing users to assess whether the bug would have impacted their archives. The thorough changelog also allows navigating bug history easily should the introduced fixes be inadequate.

Why did you fit so many options and features into blkar?

The main reason is simply that I feel it's really good to have a trusty tool for data archiving with high recoverability as a sysadmin or a person in a similar role.

As a result, the design is geared toward making blkar very capable and versatile, while being robust. The design mindset was essentially to picture a sysadmin using blkar in a very tired state to recover something really, really important: blkar should be the final stop in most cases, and should absolutely be the last thing that falls apart in this stressful situation.

Unfortunately this means a lot of testing needs to be done to be somewhat confident that blkar doesn't fall apart the moment something doesn't look right. Fortunately said tests are in place, hence the amount of bash code you see in this repository.