This repository has been archived by the owner on May 31, 2023. It is now read-only.

Treat decryption failure as a read error? #13

Open
Roman2K opened this issue Mar 17, 2017 · 1 comment

Roman2K commented Mar 17, 2017

Question by @lavalamp originally posted in the defunct GitLab repo (old issue):

I noticed while reading https://github.com/klauspost/reedsolomon:

The final (and important) part is to be able to reconstruct missing shards. For this to work, you need to know which parts of your data is missing. The encoder does not know which parts are invalid, so if data corruption is a likely scenario, you need to implement a hash check for each shard. If a byte has changed in your set, and you don't know which it is, there is no way to reconstruct the data set.

So I thought I'd give it a try and deliberately changed a bit in a test backup. Unfortunately, this resulted in the restore not working (removing the file entirely allowed the restore to succeed, as expected). It seems like the problem is that if the decrypt step fails, the entire restore is aborted. I guess ideally, the decryption failure ought to be treated the same as if the remote shard was missing. Maybe there's a way to fix my restore script?

    uindex | backlog 8 {
      backlog 4 multireader(
        a=cp(/path/to/a)
        b=cp(/path/to/b)
        c=cp(/path/to/c)
      ) |
      cmd gpg --args --to --decode |
      uchecksum |
      group 3 |
      uparity 2 1 |
      cmd unxz
    } |
    join -

(sorry I keep filing issues, I think the concept is pretty cool and I'm attempting to use scat to backup my own files...)
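The reedsolomon README's point about needing a per-shard hash check can be illustrated with a toy sketch (plain XOR parity over two data shards, standing in for the real Reed-Solomon coding that scat's `uparity` uses): a per-shard checksum turns silent corruption into a *known-missing* shard, which erasure coding can then repair.

```python
import hashlib

def make_shards(a: bytes, b: bytes):
    # Two equal-length data shards plus one XOR parity shard,
    # with a SHA-256 digest recorded for each shard.
    parity = bytes(x ^ y for x, y in zip(a, b))
    shards = [a, b, parity]
    digests = [hashlib.sha256(s).hexdigest() for s in shards]
    return shards, digests

def restore(shards, digests):
    # Per-shard hash check: a corrupt shard becomes a known-missing shard.
    ok = [s if hashlib.sha256(s).hexdigest() == d else None
          for s, d in zip(shards, digests)]
    missing = [i for i, s in enumerate(ok) if s is None]
    if len(missing) > 1:
        raise ValueError("too many corrupt shards to reconstruct")
    if missing:
        # XOR the two surviving shards to rebuild the missing one.
        i = missing[0]
        others = [s for j, s in enumerate(ok) if j != i]
        ok[i] = bytes(x ^ y for x, y in zip(*others))
    return ok[0] + ok[1]

shards, digests = make_shards(b"hello", b"world")
shards[1] = b"w0rld"             # corrupt one shard in place
print(restore(shards, digests))  # b'helloworld'
```

Without the digests, the corrupt shard is indistinguishable from a valid one and reconstruction has no way to know which shard to rebuild, which is exactly the failure mode described in the question.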


Roman2K commented Mar 17, 2017

My response:

Ah, yes... So, in fact, uparity supports both missing files and integrity check failures.

However, in this case, with gpg running before uchecksum, you don't get an integrity check failure: gpg fails first, on the invalid encrypted data.

Indeed, we have to checksum before gpg (and, in the restore script, run uchecksum after gpg) because gpg produces different ciphertext every time for identical input data.

The consequence is that changing data on remotes results in a failed decryption rather than a failed integrity check (what I wanted initially, before realizing gpg generates different output every time).

So that's a very good point. I'm not sure whether uparity should treat any error as potentially recoverable, not just integrity check failures or missing data 🤔 I'll leave the ticket open until I decide.
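The non-determinism that forces this ordering is easy to demonstrate with a toy sketch (a fresh random nonce standing in for gpg's random per-run session key, not real cryptography): identical plaintext yields different ciphertext on every run, so any checksum meant for integrity verification has to be taken over the plaintext.

```python
import hashlib
import os

def encrypt(plaintext: bytes) -> bytes:
    # Toy stand-in for gpg: a fresh random nonce per call makes the output
    # differ on every run, much like gpg's random session key.
    # (Keystream is a single SHA-256 block, so plaintext must be <= 32 bytes.)
    nonce = os.urandom(16)
    keystream = hashlib.sha256(nonce).digest()
    return nonce + bytes(p ^ k for p, k in zip(plaintext, keystream))

def decrypt(ciphertext: bytes) -> bytes:
    nonce, body = ciphertext[:16], ciphertext[16:]
    keystream = hashlib.sha256(nonce).digest()
    return bytes(c ^ k for c, k in zip(body, keystream))

data = b"backup chunk"
c1, c2 = encrypt(data), encrypt(data)

# Same plaintext, different ciphertext: a digest of the ciphertext is
# useless for comparing against a stored checksum of the original data.
print(c1 != c2)                          # True
print(decrypt(c1) == decrypt(c2) == data)  # True
```

This is why the backup pipeline checksums before gpg and the restore pipeline runs uchecksum after gpg: the plaintext digest is the only stable reference point.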
