---
simd: '0050'
title: Duplicate Block Prevention
authors:
- Carl Lin
- Ashwin Sekar
category: Standard
type: Core
status: Draft
created: 2023-09-06
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Duplicate block handling is slow and error-prone when different validators see
different versions of the block.

## Motivation

In a situation where a leader generates two different blocks for a slot, ideally
either all validators receive the same version of the block, or they all see a mix
of the different versions and mark the block dead during replay. This removes the
complicated process of reaching consensus on which version of the block needs to
be stored.
## Alternatives Considered

1. Storing all or some 'n' versions of the block - this can be DoS'd if a malicious
   leader generates many different versions of a block, or selectively sends some
   versions to specific validators.

2. Running a separate consensus mechanism on each duplicate block - this is very
   complicated and relies on detecting the duplicate block in the first place.

## New Terminology

None
## Detailed Design

With the introduction of Merkle shreds, each shred is now uniquely attributable to the FEC
set to which it belongs. This means that given an FEC set of at least 32 shreds, a leader
cannot create an entirely new FEC set by modifying just the last shred, because the
`witness` in that last shred disambiguates which FEC set it belongs to.
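To make the role of the `witness` concrete, here is a minimal sketch of Merkle-proof
verification against an FEC-set root. It is illustrative only: the `Shred` struct, its field
names, and the use of `DefaultHasher` are assumptions for brevity and do not match the real
Merkle shred format, which uses a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified shred: a payload plus the Merkle proof ("witness")
/// that ties it to the root of its FEC set.
struct Shred {
    index_in_fec_set: usize,
    payload: Vec<u8>,
    witness: Vec<u64>, // sibling hashes from leaf to root
}

fn hash_leaf(payload: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    payload.hash(&mut h);
    h.finish()
}

fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Recompute the FEC-set root from the shred's payload and witness. A shred
/// from a different FEC set (or with a modified payload) yields a different root.
fn verify_witness(shred: &Shred, expected_root: u64) -> bool {
    let mut node = hash_leaf(&shred.payload);
    let mut index = shred.index_in_fec_set;
    for sibling in &shred.witness {
        node = if index % 2 == 0 {
            hash_pair(node, *sibling)
        } else {
            hash_pair(*sibling, node)
        };
        index /= 2;
    }
    node == expected_root
}

fn main() {
    // A toy two-shred FEC "set": the root commits to both payloads.
    let leaf0 = hash_leaf(b"shred 0");
    let leaf1 = hash_leaf(b"shred 1");
    let root = hash_pair(leaf0, leaf1);

    let good = Shred { index_in_fec_set: 0, payload: b"shred 0".to_vec(), witness: vec![leaf1] };
    assert!(verify_witness(&good, root));

    // A modified payload (a different "version" of the shred) fails verification.
    let forged = Shred { index_in_fec_set: 0, payload: b"forged".to_vec(), witness: vec![leaf1] };
    assert!(!verify_witness(&forged, root));
}
```

The point is simply that a shred whose payload or position is altered no longer verifies
against the FEC set's root, so it cannot be swapped between two versions of the set unnoticed.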
This means that in order for a leader to force validators `A` and `B` to ingest separate
versions `N` and `N'` of a block, they must at a minimum create and propagate two completely
different versions of an FEC set. Given the smallest FEC set of 32 shreds, this means that 32
shreds from one version must arrive at validator `A`, and 32 completely different shreds from
the other version must arrive at validator `B`.

We aim to make this process as hard as possible by leveraging the randomness of each shred's
traversal through turbine via the following set of changes:
1. Lock down shred propagation so that validators only accept shred `X` if it arrives from the
   correct ancestor in the turbine tree for that shred `X` (see the first sketch after this
   list). There are a few downstream effects of this:
   - In repair, a validator `V` can no longer repair shred `X` from anybody other than the
     singular ancestor `Y` that was responsible for delivering shred `X` to `V` in the turbine
     tree.
   - Validators need to be able to repair erasure shreds, whereas they can only repair data
     shreds today. Because the set of repair peers is now locked, if validator `V`'s ancestor
     `Y` for shred `X` is down, then shred `X` is unrecoverable. Without being able to repair a
     backup erasure shred, validator `V` could never recover this block.

2. If a validator receives shred `S` for a block, and then another version `S'` of that shred
   for the same block, it will propagate the witness of both of those shreds so that everyone
   in the turbine tree sees the duplicate proof (see the second sketch after this list). This
   makes it harder for leaders to split the network into groups that see a block as duplicate
   and groups that don't.

   Note these duplicate proofs still need to be gossiped, because it is not guaranteed that
   duplicate shreds will propagate to everyone if there is a network partition, or a colluding
   malicious root node in turbine. For instance, assuming one malicious root node `X`, `X` can
   forward one version of the shred to one specific validator `Y` only, and then only
   descendants of validator `Y` would possibly see a duplicate proof when the other canonical
   version of the shred is broadcast.

3. In order to account for the last FEC set potentially having a 1:32 split of data to coding
   shreds, we enforce that validators must see at least half the block before voting on it,
   *even if they received all the data shreds for that block* (see the third sketch after this
   list). This guarantees leaders cannot change just the one data shred to generate two
   completely different, yet playable, versions of the block.
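The following is a rough sketch of the propagation lock-down from item 1. All names
(`turbine_order`, `accept_shred`, the fanout constant) are hypothetical, and the plain hash
shuffle stands in for the real stake-weighted turbine shuffle; the point is only that the
expected parent for each shred is deterministically computable, so a receiving validator can
check the sender against it.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a real validator identity.
type Pubkey = u64;

const FANOUT: usize = 200;

/// Deterministically order the validator set for a given (slot, shred index).
/// Real turbine uses a stake-weighted shuffle; a plain hash sort is used here
/// purely for illustration.
fn turbine_order(validators: &[Pubkey], slot: u64, shred_index: u32) -> Vec<Pubkey> {
    let mut ordered = validators.to_vec();
    ordered.sort_by_key(|v| {
        let mut h = DefaultHasher::new();
        (slot, shred_index, *v).hash(&mut h);
        h.finish()
    });
    ordered
}

/// Parent position in a tree with the given fanout; position 0 is the root.
fn parent_position(pos: usize) -> Option<usize> {
    if pos == 0 { None } else { Some((pos - 1) / FANOUT) }
}

/// Accept shred (slot, shred_index) only if `sender` is exactly the turbine
/// ancestor responsible for delivering that shred to `me`.
fn accept_shred(validators: &[Pubkey], me: Pubkey, sender: Pubkey, slot: u64, shred_index: u32) -> bool {
    let ordered = turbine_order(validators, slot, shred_index);
    let my_pos = match ordered.iter().position(|v| *v == me) {
        Some(p) => p,
        None => return false,
    };
    match parent_position(my_pos) {
        // The root of the tree receives the shred directly from the leader.
        None => true,
        Some(p) => ordered[p] == sender,
    }
}
```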
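Next, a sketch of the duplicate detection from item 2, with hypothetical types. In practice a
duplicate proof would carry both complete, leader-signed shreds so that any recipient can verify
the conflict independently before forwarding it through turbine and gossip.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Illustrative stand-in for a full shred payload.
#[derive(Clone, PartialEq)]
struct ShredPayload(Vec<u8>);

/// Evidence that the leader produced two different shreds for the same
/// (slot, index). A real proof would include both complete shreds so the
/// leader's conflicting signatures can be verified by anyone.
struct DuplicateProof {
    slot: Slot,
    index: u32,
    first: ShredPayload,
    second: ShredPayload,
}

#[derive(Default)]
struct ShredTracker {
    seen: HashMap<(Slot, u32), ShredPayload>,
}

impl ShredTracker {
    /// Record a shred. If a different payload was already seen for the same
    /// (slot, index), return a proof to forward to turbine children and gossip.
    fn observe(&mut self, slot: Slot, index: u32, payload: ShredPayload) -> Option<DuplicateProof> {
        match self.seen.get(&(slot, index)) {
            None => {
                self.seen.insert((slot, index), payload);
                None
            }
            Some(existing) if *existing == payload => None,
            Some(existing) => Some(DuplicateProof {
                slot,
                index,
                first: existing.clone(),
                second: payload,
            }),
        }
    }
}
```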
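Finally, a sketch of the voting gate from item 3 (field and function names are illustrative):
even if all data shreds have arrived and the block replays, voting is withheld until at least
half of all shreds the leader produced for the block, data plus coding, have been received.

```rust
/// Illustrative per-block shred accounting.
struct BlockShredCounts {
    data_expected: usize,
    coding_expected: usize,
    data_received: usize,
    coding_received: usize,
}

/// A block is only votable once at least half of all shreds the leader produced
/// for it (data and coding) have been received, even if every data shred is
/// present and the block replayed successfully.
fn may_vote(counts: &BlockShredCounts) -> bool {
    let total_expected = counts.data_expected + counts.coding_expected;
    let total_received = counts.data_received + counts.coding_received;
    2 * total_received >= total_expected
}
```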
### Duplicate block resolution

Against a powerful adversary, the preventative measures outlined above can be circumvented.
Namely, an adversary that controls a large percentage of stake and has the ability to create
and straddle network partitions can circumvent the measures by isolating honest nodes in
partitions. Within a partition the adversary can propagate a single version of the block,
nullifying the effects of the duplicate witness proof.

In the worst case we can assume that the adversary controls 33% of the network stake. By
utilizing this stake, they can attack honest nodes by creating network partitions. In a turbine
setup with offline nodes and malicious stake communicating through a side channel, simulations
show a 1% propagation to honest nodes given that at least 15% of honest nodes are in the
partition. [1]
Median stake recovered with 33% malicious stake, 10,000 trials:

| Percentage online | Equal stake | Mainnet stake |
| ----------------- | ----------- | ------------- |
| 33%               | 33%         | 33%           |
| 40%               | 33%         | 33%           |
| 45%               | 33.3%       | 33.09%        |
| 46%               | 33.4%       | 33.46%        |
| 47%               | 33.54%      | 33.58%        |
| 48%               | 33.71%      | 34.78%        |
| 49%               | 33.97%      | 36.21%        |
| 50%               | 34.28%      | 39.93%        |
| 51%               | 34.70%      | 42.13%        |
| 52%               | 35.09%      | 43.42%        |
| 53%               | 35.85%      | 45.23%        |
| 54%               | 36.88%      | 46.42%        |
| 55%               | 37.96%      | 47.95%        |
| 60%               | 48.95%      | 55.51%        |
| 66%               | 64.05%      | 64.08%        |
| 75%               | 74.98%      | 74.59%        |
Given this, we can conclude that there will be at most 5 versions of a block that can reach a
34% vote threshold, even against the most powerful adversaries, as there needs to be a
non-overlapping 15% of honest nodes in each partition. [2]
To solve this case we can store up to 5 duplicate forks as normal forks and perform normal fork
choice on them:

* Allow blockstore to store up to 5 versions of a block.
* Only one of these versions can be populated by turbine. The remaining 4 versions are only for
  repair.
* If a version of this slot reaches the 34% vote threshold, attempt to repair that block. This
  inherently cannot be from a turbine parent, so it must relax the constraint from the
  prevention design.
* From this point on, we treat the fork as normal in fork choice. This requires that the
  remaining parts of consensus operate on (Slot, Hash) ids, and that switching proofs allow
  stake on the same slot, but different hashes.
* Include the same duplicate witness proofs from the prevention design, and only vote on blocks
  for which we have not received a proof, or that have reached the threshold.
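Below is a rough sketch of how blockstore and repair might track duplicate versions under this
design. All names, including the 5-version cap and the `VOTE_THRESHOLD` constant, are
illustrative rather than existing interfaces.

```rust
use std::collections::HashMap;

type Slot = u64;
type BlockHash = [u8; 32];

const MAX_DUPLICATE_VERSIONS: usize = 5;
/// Fraction of total stake that must have voted on a version before it is repaired.
const VOTE_THRESHOLD: f64 = 0.34;

#[derive(Default)]
struct DuplicateVersions {
    /// Versions tracked per slot. At most one version is populated by turbine;
    /// the rest can only be filled in through repair.
    versions: HashMap<Slot, Vec<BlockHash>>,
}

impl DuplicateVersions {
    /// Track a newly observed version of `slot`, up to the 5-version cap.
    /// Returns false if the version is ignored because the cap was reached.
    fn track(&mut self, slot: Slot, hash: BlockHash) -> bool {
        let versions = self.versions.entry(slot).or_default();
        if versions.contains(&hash) {
            return true;
        }
        if versions.len() >= MAX_DUPLICATE_VERSIONS {
            return false;
        }
        versions.push(hash);
        true
    }

    /// A tracked version is worth repairing once at least 34% of stake has voted
    /// on it. From then on it is treated as an ordinary (Slot, Hash) fork in
    /// fork choice.
    fn should_repair(&self, slot: Slot, hash: BlockHash, voted_stake: u64, total_stake: u64) -> bool {
        let tracked = self
            .versions
            .get(&slot)
            .map_or(false, |versions| versions.contains(&hash));
        tracked && (voted_stake as f64) >= VOTE_THRESHOLD * (total_stake as f64)
    }
}
```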
In order to accurately track the threshold, it might be prudent to also tally vote transactions
from dead blocks, in case gossip is experiencing problems. Alternatively or additionally,
consider some form of ancestry information in votes [3] to ensure that the vote threshold is
observed. This might be a necessity in double duplicate split situations where the initial
duplicate block is not voted on.
## Impact

The network will be more resilient against duplicate blocks.

## Security Considerations

Not applicable

## Backwards Compatibility

Not applicable
## References

[1] Equal stake weight simulation: `https://github.com/AshwinSekar/turbine-simulation/blob/master/src/main.rs`
uses a 10,000 node network with equal stake and shred recovery.
Mainnet stake weight simulation: `https://github.com/AshwinSekar/solana/commits/turbine-simulation`
mimics the exact node count and stake distribution of mainnet and does not perform shred recovery.

[2] Section 4 of `https://github.com/AshwinSekar/turbine-simulation/blob/master/Turbine_Merkle_Shred_analysis.pdf`

[3] Block Ancestors Proposal: `https://github.com/solana-labs/solana/pull/19194/files`