Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update BEFP creation and validation to accomodate more types of validator fraud #2377

Open
evan-forbes opened this issue Jun 16, 2023 · 17 comments

Comments

@evan-forbes
Copy link
Member

Per this comment in nmt defining the attack scenario:

Malicious consensus nodes could produce and commit a block, denoted as B, in which the shares are not ordered lexicographically based on their namespace IDs. A DA (data availability) full node obtains the block header, including the data root denoted as dataRoot, and the underlying shares, and will attempt to construct the block. However, if the node detects that the shares used to construct dataRoot are unordered, it must provide a proof of bad encoding to the light clients. Specifically, it must indicate that there are out-of-order leaves in the tree represented by dataRoot.

per the LL white paper section 5.2

An adversarial consensus node may attempt to produce a block that contains a Merkle tree with children
that are not ordered correctly. To prevent this, we can set a condition in nsHash such that there is no valid hash when leftMaxNs ≥ rightMinNs, and thus there would be no valid Merkle root for incorrectly ordered children. Therefore blockValid(hi) would return false in the simplistic and probabilistic validity rules as there is no possible Mi where root(Mi) = mRooti

addionally, 2/3s of the voting power could commit to shares that not sized properly and would result in an inability to decode the row or col.

We need to have a fraud proof that can handle out of order namespaces and arbitary things that would cause the erasure decoding to fail. BEFPs (perhaps need to be renamed after this), can and should handle these types of fraud.

During the reconstruction process, if there are two shares that are out of order, then we should still be able to create an inclusion proof to half of the shares in that row or col using either the row or column roots. If the square is available, then this will always be the case.

For the second new type of fraud, if there is something wrong with the shares that would cause the row or col to not be decodable, then if the shares are committed to (step 1 below), then that means that the BEFP is valid.

To accomodate this, the verification logic must be altered to encode the following logic:

  1. Check inclusion: if inclusion fails, proof is invalid
  2. Reconstruction of row of col: if fails, proof is valid
  3. Create merkel root: if creation of root fails, proof is valid
  4. Compare existing root: if different, the proof is valid

currently, we are not following steps two and three of the validation process:

// rebuild a row or col.
rebuiltShares, err := codec.Decode(shares)
if err != nil {
return err
}
rebuiltExtendedShares, err := codec.Encode(rebuiltShares[0:odsWidth])
if err != nil {
return err
}
copy(rebuiltShares[odsWidth:], rebuiltExtendedShares)
tree := wrapper.NewErasuredNamespacedMerkleTree(odsWidth, uint(p.Index))
for _, share := range rebuiltShares {
err = tree.Push(share)
if err != nil {
return err
}
}
expectedRoot, err := tree.Root()
if err != nil {
return err
}

cc @renaynay @Wondertan please feel free to add issues, edit, or convert this to an EPIC as needed 🙂

@github-actions github-actions bot added the external Issues created by non node team members label Jun 16, 2023
@evan-forbes evan-forbes removed the external Issues created by non node team members label Jun 16, 2023
@vgonkivs
Copy link
Member

vgonkivs commented Jun 19, 2023

🤔
IIRC, nmt tree panics in the previous versions if shares were not lexicographically ordered.

@Wondertan
Copy link
Member

@evan-forbes, thank you for the detailed issue! Do I understand correctly that the actionable here is to ensure that steps 2 and 3 are followed(and tested) in the implementation?

@evan-forbes
Copy link
Member Author

Do I understand correctly that the actionable here is to ensure that steps 2 and 3 are followed(and tested) in the implementation?

yeah in celestia-node, I believe so yeah. I'm still unsure exactly on what (if anything) needs to be changed on the reconstruction side.

@Wondertan
Copy link
Member

Likely relevant celestiaorg/rsmt2d#178

@vgonkivs
Copy link
Member

vgonkivs commented Jun 21, 2023

During the reconstruction process, if there are two shares that are out of order, then we should still be able to create an inclusion proof for half of the shares in that row or col using either the row or column roots. If the square is available, then this will always be the case.

The problem here is that if two shares of the row will be out of order, then we can't compute the Merkle root of this row, because of this. This means that computeSharesRoot will also return an error that is different from ErrByzantineData.
To handle this case, we need:

  • Create an inclusion proof for half of the shares against the col;
  • Verify the inclusion against Merkle root of the columns;
  • Try to reconstruct and build a tree using shares that were provided - try to build a row.

I guess for this case we will need a different type of fraud proof because validation will defer from BEFP validation.

@adlerjohn could you please confirm that my understanding is correct? Also, I think we will need additional implementation from the rsmt2d side.

@vgonkivs
Copy link
Member

In case we can't build Merkle roots for both row and col, then the entire data is not available

@evan-forbes
Copy link
Member Author

I guess for this case we will need a different type of fraud proof because validation will defer from BEFP validation

is there any way that the BEFP struct that light clients get can be the same? or perhaps we can modify the existing one to handle both cases? In theory, a BEFP just needs any inclusion proof (to row or col roots) for half the shares in a row, correct?

@vgonkivs
Copy link
Member

vgonkivs commented Jun 21, 2023

Yes. But with the current implementation, we are verifying inclusion against a single row/col root. New validation requires multiple roots(each share will have its own root to verify from).

@vgonkivs
Copy link
Member

What we can do is to have a slice of BadEncodingProof structs with a single share and proof inside, iterate through this slice, and verify inclusion.

@adlerjohn
Copy link
Member

The problem here is that if two shares of the row will be out of order, then we can't compute the Merkle root of this row, because of this. This means that computeSharesRoot will also return an error that is different from ErrByzantineData.
To handle this case, we need:

Create an inclusion proof for half of the shares against the col;
Verify the inclusion against Merkle root of the columns;
Try to reconstruct and build a tree using shares that were provided - try to build a row.

Yes, that's correct.

I guess for this case we will need a different type of fraud proof because validation will defer from BEFP validation.

We don't need a different data structure, we just need additional codepaths in the verification logic to handle the two additional cases described in OP. The data structure of the proof can remain the same.

@musalbas
Copy link
Member

Regarding the above proposed way to handle that case, what if both the col and row has shares out of order?

Isn't the solution simply for the BEFP verifier to try and recompute the row/col root, which will error because one of the shares is out of order, and then fraud proof passes?

@evan-forbes
Copy link
Member Author

if both the col and row has shares out of order

Isn't the solution simply for the BEFP verifier to try and recompute the row/col root

I believe that since both the row and col are out of order, then when the verifier attempts to first prove inclusion to those shares, it will fail since both are out of order.

@vgonkivs
Copy link
Member

vgonkivs commented Jun 26, 2023

I believe that since both the row and col are out of order, then when the verifier attempts to first prove inclusion to those shares, it will fail since both are out of order.

But it will mean that the BEFP is invalid.

@evan-forbes
Copy link
Member Author

But it will mean that the BEFP is invalid.

yeah currently that is the case and we will fix to close this PR, correct?

I believe that since both the row and col are out of order, then when the verifier attempts to first prove inclusion to those shares, it will fail since both are out of order.

to clarify this with the above statement in the OP

During the reconstruction process, if there are two shares that are out of order, then we should still be able to create an inclusion proof to half of the shares in that row or col using either the row or column roots. If the square is available, then this will always be the case.

afaiu, we don't have to do anything for the case where both the row and col are out of order, since either the square is available and there exists row/col where the shares can be proven using the opposite col/row roots, or the entire square is not available (sample verification cannot occur because inclusion proofs fail due to nodes in the tree being out of order)

@musalbas
Copy link
Member

musalbas commented Jun 27, 2023

During the reconstruction process, if there are two shares that are out of order, then we should still be able to create an inclusion proof to half of the shares in that row or col using either the row or column roots. If the square is available, then this will always be the case.

Let's think about the end-to-end flow of sampling and reconstruction. Note that shares being out of order is not the only case where you cannot recompute a row or column root; this would also be the case if any reconstructed shares do not match the row or column root. We can conceptually treat an out-of-order share as the same as a recomputed share that does not match the row or column root.

Firstly, it should not be possible to sample a share that has any sibling nodes that are out of order, because the proof NMT verification will fail when it tries to reconstruct the root (due to a mismatch when calculating min and max nid).

Next, we should consider this rule:

Basically, I believe the rule for choosing what proof should be used for a BEFP share should be:

  1. If a share is inside a fully and validly reconstructed row or column, then you can generate an inclusion proof from that row or column.

  2. Otherwise, you need to use Merkle inclusion proofs that the full node received from other nodes via sampling. Example: consider a square where all erasure coding quadrants (quadrants 2, 3, 4) contain garbage data that does not match the actual row or column roots.

From the above, it doesn't seem to me like we need to create inclusion proofs ourselves in rows/cols with shares out of order, due to (2). Does that seem right @adlerjohn @evan-forbes?

@adlerjohn
Copy link
Member

adlerjohn commented Jun 29, 2023

it should not be possible to sample a share that has any sibling nodes that are out of order, because the proof NMT verification will fail when it tries to reconstruct the root

I'm thinking it might be possible if an attacker modifies the NMT code to simply process out-of-order leaves, and the Merkle proof is for a leaf whose neighbor isn't out of order. But this is fine because the case will be caught when the full row/col is reconstructed anyways.

it doesn't seem to me like we need to create inclusion proofs ourselves in rows/cols with shares out of order, due to (2).

That's correct, yes.

@musalbas
Copy link
Member

I'm thinking it might be possible if an attacker modifies the NMT code to simply process out-of-order leaves, and the Merkle proof is for a leaf whose neighbor isn't out of order. But this is fine because the case will be caught when the full row/col is reconstructed anyways.

Exactly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants