diff --git a/proposals/0057-duplicate-block-prevention.md b/proposals/0057-duplicate-block-prevention.md
index 250caa7a2..f4d0473f0 100644
--- a/proposals/0057-duplicate-block-prevention.md
+++ b/proposals/0057-duplicate-block-prevention.md
@@ -18,11 +18,11 @@ different versions of the block

 ## Motivation

-In a situation where a leader generates two different blocks for a slot,
-ideally either all the validators get the same version of the block, or they
-all see a mix of the different versions of the block and mark it dead during
-replay. This removes the complicated process of reaching consensus on which
-version of the block needs to be stored.
+In a situation where a leader generates two different blocks for a slot, ideally
+either all the validators get the same version of the block, or they all see a
+mix of the different versions of the block and mark it dead during replay. This
+removes the complicated process of reaching consensus on which version of the
+block needs to be stored.

 ## Alternatives Considered

@@ -42,12 +42,13 @@ None
 With the introduction of Merkle shreds, each shred is now uniquely attributable
 to the FEC set to which it belongs. This means that given an FEC set of minimum
 32 shreds, a leader cannot create an entirely new FEC set by just modifying the
-last shred, because the `witness` in that last shred disambiguates which FEC
-set it belongs to.
+last shred, because the `witness` in that last shred disambiguates which FEC set
+it belongs to.

 This means that in order for a leader to force validators `A` and `B` to ingest
-a separate version `N` and `N'` of a block, they must at a minimum create and
-propagate two completely different versions of an FEC set. Given the smallest FEC set of 32 shreds, this means that 32 shreds from one version must arrive to
+a separate version `N` and `N'` of a block, they must at a minimum create and
+propagate two completely different versions of an FEC set. Given the smallest
+FEC set of 32 shreds, this means that 32 shreds from one version must arrive to
 validator `A`, and 32 completely different shreds from the other version must
 arrive to validator `B`.

@@ -57,29 +58,36 @@ each shred's traversal through turbine via the following set of changes:
 1. Lock down shred propagation so that validators only accept shred `X` if it
 arrives from the correct ancestor in the turbine tree for that shred `X`.
 There are a few downstream effects of this:
+
 - In repair, a validator `V` can no longer repair shred `X` from anybody other
-than the singular
-ancestor `Y` that was responsible for delivering shred `X` to `V` in the
-turbine tree.
+than the singular ancestor `Y` that was responsible for delivering shred `X` to
+`V` in the turbine tree.
 - Validators need to be able to repair erasure shreds, whereas they can only
-repair data shreds today. This is because now the set of repair peers is locked, then if validator `V`'s ancestor `Y` for shred `X` is down, then shred `X` is unrecoverable. Without being able to repair a backup erasure shred, this would mean validator `X` could never recover this block
+repair data shreds today. This is because the set of repair peers is now
+locked: if validator `V`'s ancestor `Y` for shred `X` is down, then shred `X`
+is unrecoverable. Without being able to repair a backup erasure shred, this
+would mean validator `V` could never recover this block.

 2. If a validator received shred `S` for a block, and then another version of
 that shred `S`' for the same block, it will propagate the witness of both of
-those shreds so that everyone in the turbine tree sees the duplicate proof.
-This makes it harder for leaders to split the network into groups that see a
-block is duplicate and groups that don't.
+those shreds so that everyone in the turbine tree sees the duplicate proof. This
+makes it harder for leaders to split the network into groups that see a block is
+duplicate and groups that don't.

 Note these duplicate proofs still need to be gossiped because it's not guaranteed
 duplicate shreds will propagate to everyone if there's a network partition, or
-a colluding malicious root node in turbine. For instance, assuming 1 malicious root node `X`, `X` can forward one version of the shred to one specific
+a colluding malicious root node in turbine. For instance, assuming 1 malicious
+root node `X`, `X` can forward one version of the shred to one specific
 validator `Y` only, and then only descendants of validator `Y` would possibly
 see a duplicate proof when the other canonical version of the shred is
 broadcasted.

 3. In order to account for the last FEC set potentially having a 1:32 split of
-data to coding shreds, we enforce that validators must see at least half the block before voting on the block, *even if they received all the data shreds for that block*. This guarantees leaders cannot just change the one data shred
-to generate two completely different, yet playable versions of the block
+data to coding shreds, we enforce that validators must see at least half the
+block before voting on the block, *even if they received all the data shreds for
+that block*. This guarantees leaders cannot just change the one data shred to
+generate two completely different, yet playable versions of the block.
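
As a rough illustration of the duplicate-witness handling in point 2 above, here
is a minimal sketch of how a validator might notice two conflicting versions of
the same shred and surface a proof to forward and gossip. The types and names
(`ShredId`, `ShredTracker`, `DuplicateProof`) are hypothetical stand-ins, not
the actual validator implementation.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Identifies a shred within a block: (slot, index, data vs. coding).
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ShredId {
    slot: Slot,
    index: u32,
    is_coding: bool,
}

/// Minimal stand-in for a Merkle shred: a payload plus the `witness`
/// (Merkle proof) that binds it to a single FEC set.
#[derive(Clone, PartialEq)]
struct Shred {
    id: ShredId,
    payload: Vec<u8>,
    witness: Vec<[u8; 32]>,
}

/// Two distinct shreds claiming the same identity form a duplicate proof.
struct DuplicateProof {
    first: Shred,
    second: Shred,
}

#[derive(Default)]
struct ShredTracker {
    seen: HashMap<ShredId, Shred>,
}

impl ShredTracker {
    /// Record an incoming shred. If a different shred was already seen for the
    /// same identity, return a duplicate proof that the caller should both
    /// forward down the turbine tree and push into gossip (gossip is still
    /// needed in case of partitions or a malicious turbine root).
    fn track(&mut self, shred: Shred) -> Option<DuplicateProof> {
        if let Some(existing) = self.seen.get(&shred.id) {
            if *existing == shred {
                return None; // benign retransmit of the same shred
            }
            return Some(DuplicateProof {
                first: existing.clone(),
                second: shred,
            });
        }
        self.seen.insert(shred.id, shred);
        None
    }
}

fn main() {
    let mut tracker = ShredTracker::default();
    let id = ShredId { slot: 5, index: 0, is_coding: false };
    let a = Shred { id, payload: vec![1], witness: vec![] };
    let b = Shred { id, payload: vec![2], witness: vec![] };
    assert!(tracker.track(a).is_none()); // first version seen, no proof
    let proof = tracker.track(b).expect("conflicting shred yields a proof");
    assert_ne!(proof.first.payload, proof.second.payload);
}
```

In practice the check would also verify the shred's witness against its FEC set
root and confirm the shred arrived from the expected turbine ancestor (point 1)
before storing or forwarding it.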

 ### Duplicate block resolution

@@ -87,10 +95,14 @@ Against a powerful adversary, the preventative measures outlined above can be
 circumvented. Namely an adversary that controls a large percentage of stake and
 has the ability to create and straddle network partitions can circumvent the
 measures by isolating honest nodes in partitions. Within the partition the
-adversaries can propagate a single version of the block, nullifying the effects of the duplicate witness proof.
+adversaries can propagate a single version of the block, nullifying the effects
+of the duplicate witness proof.

 In the worst case we can assume that the adversary controls 33% of the network
-stake. By utilizing this stake, they can attack honest nodes by creating network partitions. In a turbine setup with offline nodes and malicious stake communicating through side channel, simulations show a 1% propagation to honest nodes given at least 15% honest nodes are in the partition.
+stake. By utilizing this stake, they can attack honest nodes by creating network
+partitions. In a turbine setup with offline nodes and malicious stake
+communicating through side channel, simulations show a 1% propagation to honest
+nodes given at least 15% honest nodes are in the partition. [1]

 Median stake recovered with 33% malicious, 10K trials
 | Percentage online | Equal stake | Mainnet stake |
@@ -112,33 +124,47 @@ Median stake recovered with 33% malicious, 10K trials
 | 66% | 64.05% | 64.08% |
 | 75% | 74.98% | 74.59% |

-Given this we can conclude that there will be at most 5 versions of a block that can reach a 34% vote threshold, even against the most powerful adversaries, as there needs to be a non overlapping 15% honest nodes in each partition. [2]
+Given this we can conclude that there will be at most 5 versions of a block that
+can reach a 34% vote threshold, even against the most powerful adversaries, as
+there needs to be a non-overlapping 15% of honest nodes in each partition. [2]

-To solve this case we can store up to 5 duplicate forks as normal forks, and
+To solve this case we can store up to 5 duplicate forks as normal forks, and
 perform normal fork choice on them:
-* Allow blockstore to store up to 5 versions of a block.
-* Only one of these versions can be populated by turbine. The remaining 4 versions are only for repair.
-* If a version of this slot reaches the 34% vote threshold, attempt to repair that block. This inherently cannot be from a turbine parent,
-so it must relax the constraint from the prevention design.
-* From this point on, we treat the fork as normal in fork choice. This requires that the remaining parts of consensus operate on (Slot, Hash) ids,
-and that switching proofs allow stake on the same slot, but different hashes.
-* Include the same duplicate witness proofs from the prevention design, and only vote on blocks that we have not received a proof for, or that have
-reached the threshold.
-
-In order to accurately track the threshold, it might be prudent to tally vote txs from dead blocks as well, in the case gossip is experiencing problems.
-Alternatively/Additionally consider some form of ancestory information in votes [3] to ensure that the vote threshold is viewed. This might be a necessity
-in double duplicate split situations where the initial duplicate block is not voted on.
+
+- Allow blockstore to store up to 5 versions of a block.
+- Only one of these versions can be populated by turbine. The remaining 4
+  versions are only for repair.
+- If a version of this slot reaches the 34% vote threshold, attempt to repair
+that block. This inherently cannot be from a turbine parent, so it must relax
+the constraint from the prevention design.
+- From this point on, we treat the fork as normal in fork choice. This requires
+that the remaining parts of consensus operate on (Slot, Hash) ids, and that
+switching proofs allow stake on the same slot, but different hashes.
+- Include the same duplicate witness proofs from the prevention design, and only
+vote on blocks that we have not received a proof for, or that have reached the
+threshold.
+
+In order to accurately track the threshold, it might be prudent to tally vote
+txs from dead blocks as well, in case gossip is experiencing problems.
+Alternatively/Additionally consider some form of ancestry information in votes
+[3] to ensure that the vote threshold is observed. This might be a necessity in
+double duplicate split situations where the initial duplicate block is not voted
+on.
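
To make the bookkeeping above concrete, here is a minimal sketch, under assumed
simplifications, of how a validator might cap the stored versions of a slot at 5
and flag an alternate version for repair once observed votes for it cross the
34% threshold. The names (`DuplicateSlotTracker`, `try_store`, `record_vote`)
and the flat stake representation are hypothetical, not the proposal's or the
validator's actual API.

```rust
use std::collections::{HashMap, HashSet};

type Slot = u64;
type Hash = [u8; 32];
type Pubkey = [u8; 32];

/// Per the design above, at most 5 versions of a slot are ever stored.
const MAX_VERSIONS_PER_SLOT: usize = 5;
/// Observed-vote threshold at which an alternate version is repaired.
const REPAIR_THRESHOLD: f64 = 0.34;

#[derive(Default)]
struct DuplicateSlotTracker {
    /// Versions of each slot currently held in blockstore (turbine or repair).
    stored: HashMap<Slot, HashSet<Hash>>,
    /// Stake observed voting for each (slot, hash), deduplicated by voter.
    votes: HashMap<(Slot, Hash), HashMap<Pubkey, u64>>,
    /// Total epoch stake used as the denominator for the threshold.
    total_stake: u64,
}

impl DuplicateSlotTracker {
    fn new(total_stake: u64) -> Self {
        Self { total_stake, ..Default::default() }
    }

    /// Try to admit a version of `slot` into blockstore, refusing a sixth one.
    fn try_store(&mut self, slot: Slot, hash: Hash) -> bool {
        let versions = self.stored.entry(slot).or_default();
        if versions.contains(&hash) {
            return true; // already stored
        }
        if versions.len() >= MAX_VERSIONS_PER_SLOT {
            return false; // never hold more than 5 versions of a slot
        }
        versions.insert(hash);
        true
    }

    /// Record an observed vote (from gossip, or tallied from a dead block's
    /// vote transactions). Returns true when this version has crossed the 34%
    /// threshold but is not stored yet, i.e. it should now be repaired and
    /// handed to fork choice as a normal (Slot, Hash) fork.
    fn record_vote(&mut self, slot: Slot, hash: Hash, voter: Pubkey, stake: u64) -> bool {
        let voters = self.votes.entry((slot, hash)).or_default();
        voters.insert(voter, stake);
        let voted: u64 = voters.values().sum();
        let crossed = voted as f64 >= REPAIR_THRESHOLD * self.total_stake as f64;
        let already_stored = self.stored.get(&slot).map_or(false, |v| v.contains(&hash));
        crossed && !already_stored
    }
}

fn main() {
    let mut tracker = DuplicateSlotTracker::new(100);
    let (turbine_version, alternate) = ([1u8; 32], [2u8; 32]);
    assert!(tracker.try_store(7, turbine_version)); // version delivered by turbine
    // 40 of 100 stake is seen voting for an alternate version we do not hold:
    assert!(tracker.record_vote(7, alternate, [9u8; 32], 40)); // repair it
}
```

A real implementation would read voter stake from the epoch stakes and would
also have to feed the repaired version into fork choice keyed by (Slot, Hash),
as described in the list above.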

 ## Impact
+
 The network will be more resilient against duplicate blocks

 ## Security Considerations
+
 Not applicable

 ## Backwards Compatibility
+
 Not applicable

 ## References
+
 [1] Equal stake weight simulation
 `https://github.com/AshwinSekar/turbine-simulation/blob/master/src/main.rs`
 uses a 10,000 node network with equal stake and shred recovery.
 Mainnet stake weight simulation
 `https://github.com/AshwinSekar/solana/commits/turbine-simulation`
 mimics the exact node count and stake distribution of mainnet and does not
 perform shred recovery.