# Turbine for Duplicate Block Prevention #71
## Conversation
> to the FEC set to which it belongs. This means that given an FEC set of minimum
> 32 shreds, a leader cannot create an entirely new FEC set by just modifying the
> last shred, because the `witness` in that last shred disambiguates which FEC set
> it belongs to.
This is not very friendly to first-time readers; it would be good to link the SIMD or other doc here that describes the Merkle shred change being introduced.
Makes sense, I'll link it.
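
For first-time readers, a minimal sketch of the mechanism under discussion, using a toy std hasher in place of the real SHA-256 Merkle scheme; `fec_set_witness` and the fixed 32-shred layout are illustrative, not the actual Merkle shred format:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for the Merkle root over an FEC set (assumption: the real
// implementation uses a SHA-256 Merkle tree, not this hasher).
fn fec_set_witness(shreds: &[Vec<u8>]) -> u64 {
    let mut h = DefaultHasher::new();
    for shred in shreds {
        shred.hash(&mut h);
    }
    h.finish()
}

fn main() {
    // Minimum-size FEC set: 32 data shreds.
    let mut shreds: Vec<Vec<u8>> = (0..32u8).map(|i| vec![i; 8]).collect();
    let original = fec_set_witness(&shreds);

    // The leader tampers with only the last shred...
    shreds[31] = vec![0xFF; 8];
    let tampered = fec_set_witness(&shreds);

    // ...but the witness embedded in every shred no longer matches, so the
    // modified shred cannot masquerade as part of the original set.
    assert_ne!(original, tampered);
    println!("witness changed: {original:x} -> {tampered:x}");
}
```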
> 1. Lock down shred propagation so that validators only accept shred `X` if it
> arrives from the correct ancestor in the turbine tree for that shred `X`. There
> are a few downstream effects of this:
We can lock this down in our validator implementation, but if team X implements and runs their own sideline shred forwarder, how much of the assumption here is broken?
see #71 (comment)
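
As a rough illustration of the acceptance rule, assuming the tree is derived from a deterministic shuffle keyed by (slot, shred index); the rotation below is a toy stand-in for the real stake-weighted shuffle, and `expected_parent` is a hypothetical helper:

```rust
// Toy deterministic "shuffle": rotate the node list by a seed derived from
// (slot, shred_index). The real turbine tree uses a stake-weighted shuffle.
fn turbine_order(nodes: &[&'static str], slot: u64, shred_index: u32) -> Vec<&'static str> {
    let seed = (slot ^ shred_index as u64) as usize % nodes.len();
    nodes.iter().cycle().skip(seed).take(nodes.len()).copied().collect()
}

// Hypothetical helper: the one ancestor allowed to deliver this shred to `me`.
fn expected_parent(
    nodes: &[&'static str],
    slot: u64,
    shred_index: u32,
    fanout: usize,
    me: &str,
) -> Option<&'static str> {
    let order = turbine_order(nodes, slot, shred_index);
    let pos = order.iter().position(|n| *n == me)?;
    if pos == 0 {
        None // the root receives this shred straight from the leader
    } else {
        Some(order[(pos - 1) / fanout])
    }
}

fn main() {
    let nodes = ["a", "b", "c", "d", "e", "f", "g"];
    let sender = "d";
    // Accept the shred only if it arrived from the single valid ancestor.
    let ok = expected_parent(&nodes, 42, 7, 2, "f") == Some(sender);
    assert!(ok);
    println!("accept shred from {sender}: {ok}");
}
```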
> - In repair, a validator `V` can no longer repair shred `X` from anybody other
> than the singular ancestor `Y` that was responsible for delivering shred `X` to
> `V` in the turbine tree.
You mean a validator can only repair from its single parent in the Turbine tree, not from grandparents?
yes
We should put that into the doc, something like "from anybody other than the parent (not even from grandparents) ..."; "ancestors" includes both parents and grandparents, I think.
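
To make the parent-vs-ancestor distinction concrete, a hypothetical repair-peer selector under this rule returns exactly one node, the direct parent that delivered the shred, never a grandparent (names here are illustrative):

```rust
#[derive(Debug)]
struct Node {
    id: &'static str,
}

// Ancestor chain for shred X, leader-side first: [grandparent, parent].
// Under the locked-down rule, only the last entry (the direct parent that
// actually delivered shred X) is a legal repair peer.
fn repair_peer_for_shred(ancestors: &[Node]) -> Option<&Node> {
    ancestors.last()
}

fn main() {
    let ancestors = [Node { id: "grandparent" }, Node { id: "parent" }];
    let peer = repair_peer_for_shred(&ancestors).unwrap();
    assert_eq!(peer.id, "parent"); // not the grandparent
    println!("repair shred X only from: {}", peer.id);
}
```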
> then if validator `V`'s ancestor `Y` for shred `X` is down, then shred `X` is
> unrecoverable. Without being able to repair a backup erasure shred, this would
> mean validator `X` could never recover this block
Not sure whether this block belongs to any bullet above, or what "then if" refers to. It's also not clear what "this" refers to in "this would".
> that shred `S`' for the same block, it will propagate the witness of both of
> those shreds so that everyone in the turbine tree sees the duplicate proof. This
> makes it harder for leaders to split the network into groups that see a block is
> duplicate and groups that don't.
Is it guaranteed that the Turbine tree is always the same if two validators with the same pubkey are physically far apart (e.g., a hot-standby setup split between the US and Europe)?
yes
To make the doc self-contained, we should probably list the properties of Turbine we depend on in the doc as well.
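
As a rough sketch of the gossip step described above: on seeing two distinct shreds claiming the same (slot, index), a node forwards both witnesses so every descendant can verify the conflict locally. The `DuplicateProof` shape and field names below are assumptions for illustration only:

```rust
// Illustrative shred: in the real format the `witness` would be the Merkle
// root/proof tying the payload to its FEC set.
#[derive(Clone, Debug, PartialEq)]
struct Shred {
    slot: u64,
    index: u32,
    witness: [u8; 8], // toy-sized witness
    payload: Vec<u8>,
}

// A duplicate proof is simply both conflicting witnesses side by side.
// A real proof would also verify the leader's signature over both shreds.
struct DuplicateProof {
    first: Shred,
    second: Shred,
}

impl DuplicateProof {
    // Valid iff the two shreds claim the same position but differ.
    fn verify(&self) -> bool {
        self.first.slot == self.second.slot
            && self.first.index == self.second.index
            && self.first.witness != self.second.witness
    }
}

fn main() {
    let a = Shred { slot: 5, index: 0, witness: [1; 8], payload: vec![1] };
    let b = Shred { slot: 5, index: 0, witness: [2; 8], payload: vec![2] };
    let proof = DuplicateProof { first: a, second: b };
    // Every node in the subtree can now conclude the block is a duplicate.
    assert!(proof.verify());
    println!("duplicate proof verifies: {}", proof.verify());
}
```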
> ### Duplicate block resolution
> Against a powerful adversary, the preventative measures outlined above can be
> circumvented. Namely an adversary that controls a large percentage of stake and
Maybe mention roughly how large a percentage here; I assume >33%?
> In the worse case we can assume that the adversary controls 33% of the network
> stake. By utilizing this stake, they can attack honest nodes by creating network
> partitions. In a turbine setup with offline nodes and malicious stake
> communicating through side channel, simulations show a 1% propagation to honest
What does "1% propagation to honest nodes" mean?
> communicating through side channel, simulations show a 1% propagation to honest
> nodes given at least 15% honest nodes are in the partition. [1]
> Median stake recovered with 33% malicious, 10K trials
Maybe briefly explain what "percentage online", "equal stake", and "mainnet stake" mean. It would be good for the SIMD to be self-contained by roughly describing how the tests were structured.
> ## Impact
>
> The network will be more resilient against duplicate blocks
Maybe we need to estimate the memory/network impact of storing up to 5 duplicate forks?
Definitely. It will be interesting to run inv tests with 5 partitions on turbine but with connected gossip & repair; I believe that will allow us to propagate 5 blocks for each slot.
We can compare that to a cluster running only on repair to get a rough idea.
I've heard mentions of storing up to 5 duplicate forks, but what does this actually mean? Does it mean plumbing blockstore to hold up to 5 versions of the same block and keying everything by (slot, hash)?
I don't see it mentioned anywhere in this SIMD, and it's not clear why it would actually be necessary as part of this proposal.
That was the original idea; however, there has been no solid consensus on whether we need to implement such a change. Originally I had a section in this SIMD with that design (carllin@807b5ee#diff-d1443f19931349d37d7a29462e1c96d99f6bd1a4d7b08757dd6360425ae15076L95), but since it is still uncertain I removed it.
I think the scope of this SIMD can be purely the efforts to prevent the propagation of duplicate blocks, and if necessary a later SIMD can cover the new resolution efforts.
> ## Backwards Compatibility
>
> Not applicable
What if there are other mods that didn't implement this SIMD?
Also, I think it'd be nice to have a "rollout plan". Are all the changes we intend to make compatible with the current status, so that we don't need any feature gate, staged adoption, etc.?
Generally, anything about duplicate block prevention is local. If you circumvent the turbine tree or run some other shred-related mods, there's a chance that you observe duplicate blocks while the rest of the network doesn't.
Since we set the vote threshold at 34%, we'd require 34% of the cluster to be non-compliant with this proposal in order to have a cluster-wide impact. At that point compliant validators will see the duplicate blocks that they were prevented from seeing, and will need to use the strategies outlined in duplicate block resolution.
Splitting this into 2 SIMDs; the scope of this SIMD is only best-effort duplicate block prevention.
> mean validator `X` could never recover this block
>
> 2. If a validator received shred `S` for a block, and then another version of
> that shred `S`' for the same block, it will propagate the witness of both of
After some offline discussion with Behzad, I think this is an unfeasible strategy in turbine. Sending payloads fragmented over more than one packet introduces a lot of overhead, and it seems unwise to introduce this latency in turbine, where performance is critical.
However, for exact (slot, shred_index, type) duplicate proofs, this will continue to work out of the box like today.
> `V` in the turbine tree.
> - Validators need to be able to repair erasure shreds, whereas they can only
> repair data shreds today. This is because now the set of repair peers is locked,
Can remove the extra new line
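
For context on why erasure-shred repair matters here: with an erasure code, a repaired coding shred can substitute for a data shred whose turbine ancestor is unreachable. A toy single-parity (XOR) analogue of that property follows; real FEC sets use Reed-Solomon, not this scheme:

```rust
// Toy single-parity erasure code: one coding shred is the XOR of all data
// shreds, so any single missing data shred can be rebuilt from the rest
// plus the coding shred. Real FEC sets use Reed-Solomon instead.
fn xor(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}

fn main() {
    let data: Vec<Vec<u8>> = vec![vec![1, 2], vec![3, 4], vec![5, 6]];
    let coding = data.iter().fold(vec![0u8; 2], |acc, d| xor(&acc, d));

    // Suppose data[1] was lost and its turbine ancestor is down, but the
    // *coding* shred is still repairable from the locked repair peer.
    let recovered = xor(&xor(&data[0], &data[2]), &coding);
    assert_eq!(recovered, data[1]);
    println!("recovered shred: {recovered:?}");
}
```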
> mean validator `X` could never recover this block
>
> 2. If a validator received shred `S` for a block, and then another version of
> that shred `S`' for the same block, it will propagate the witness of both of
Would be good to define what 'witness' means here
> see a duplicate proof when the other canonical version of the shred is
> broadcasted.
>
> 3. The last FEC set is unique in that it can have less than 32 data shreds.
Can update this to indicate the new strategy of ensuring fully packed FEC sets
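
A sketch of what "fully packed" could mean, assuming a padding rule that rounds the block's data shreds up to a multiple of 32 (the constant and helper name are hypothetical):

```rust
// Hypothetical padding rule: if the block's data doesn't fill the last FEC
// set, pad with empty data shreds so every set has exactly 32 and the last
// set can no longer be a smaller, easier-to-forge special case.
const DATA_SHREDS_PER_FEC_SET: usize = 32;

fn padded_shred_count(data_shreds: usize) -> usize {
    let sets = (data_shreds + DATA_SHREDS_PER_FEC_SET - 1) / DATA_SHREDS_PER_FEC_SET;
    sets * DATA_SHREDS_PER_FEC_SET
}

fn main() {
    // 70 data shreds would leave the last set with only 6; pad up to 96.
    assert_eq!(padded_shred_count(70), 96);
    println!("send {} shreds for 70 shreds of data", padded_shred_count(70));
}
```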
> arrives from the correct ancestor in the turbine tree for that shred `X`. There
> are a few downstream effects of this:
>
> - In repair, a validator `V` can no longer repair shred `X` from anybody other
This is the only part of the proposal that gives me major heartburn.
Ignoring duplicate blocks for a second, we see quite a few cases where leader transmission to the root of the turbine tree drops for some period of time and lots of shreds are dropped in a row. In that case, we would need the various roots to request repair from the leader, then their children to request repair from them, etc., until everyone can repair the block. That seems like major latency in getting the block into blockstore so we can replay.
In other words, if you drop a shred near the top of the turbine tree, good luck getting your block confirmed. Obviously I haven't actually collected data to see if my assumptions are true, but I'll remain cautiously pessimistic for now.
One thing that might help is to enable retransmission of repaired shreds.
Closing this for now; I plan to submit/reopen with a more complete design once we nail it down.