
v2.0: reworks max number of outgoing push messages (backport of #3016) #3038

Merged
merged 1 commit into v2.0 from mergify/bp/v2.0/pr-3016 on Oct 1, 2024

Conversation


@mergify mergify bot commented Sep 30, 2024

Problem

max_bytes for outgoing push messages is pretty outdated and does not allow gossip to function properly with the current testnet cluster size.
In particular, it does not allow the queue of pending push messages to be cleared unless the new_push_messages function is called very frequently, which involves repeatedly locking/unlocking the CRDS table.
Additionally, leaving gossip entries in the queue for the next round adds delay to propagating push messages, which can compound as messages go through several hops.

Summary of Changes

The commit reworks the outbound limit to allow more messages to be pushed out each time new_push_messages is invoked.
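
A minimal sketch of the idea, assuming a count-based cap like the MAX_NUM_PUSHES constant visible in the diff below; the struct and queue type here are simplified stand-ins for illustration, not the actual crate internals:

```rust
use std::collections::VecDeque;

// Cap on how many values one call may push out, as in the diff below.
const MAX_NUM_PUSHES: usize = 1 << 12; // 4096

struct PushQueue {
    // Placeholder for the queue of pending, serialized CrdsValues.
    pending: VecDeque<Vec<u8>>,
}

impl PushQueue {
    /// Drain up to MAX_NUM_PUSHES values per invocation instead of stopping
    /// at a byte budget; anything left stays queued for the next round.
    fn new_push_messages(&mut self) -> Vec<Vec<u8>> {
        let num = self.pending.len().min(MAX_NUM_PUSHES);
        self.pending.drain(..num).collect()
    }
}
```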


This is an automatic backport of pull request #3016 done by Mergify.


(cherry picked from commit 489f483)
@mergify mergify bot requested a review from a team as a code owner September 30, 2024 17:39
@@ -180,10 +174,10 @@ impl CrdsGossipPush {
usize, // number of values
usize, // number of push messages
) {
const MAX_NUM_PUSHES: usize = 1 << 12;


Why was 4k chosen? Any idea how close we get to this limit during steady state? Or what we burst up to?

If we were to sustain at this rate, how much egress would that be? Something like 300Mbps?


Mostly based on testing on an unstaked node in East Asia, mainnet/testnet metrics, plus some margin for spikes. This should leave enough margin during steady state, with the caveat below.

How often this function is invoked partly depends on how often the node receives push messages, so I don't have a mathematical mapping between this limit and the egress rate. But I think we have enough metrics to monitor this on testnet and get some estimation.

For now this seems like a working patch addressing the contact-info propagation issue for unstaked East Asia nodes.


The max possible limit is actually pretty high which may be a concern. Some napkin math puts our max rate at about 4.4 GB/s:

- 910 calls to new_push_requests per second
- 4096 CrdsValues per call
- max CrdsValue size: 1187 bytes per CrdsValue

910 calls/s * 4096 CrdsValues/call = 3,727,360 CrdsValues/s
3,727,360 CrdsValues/s * 1187 bytes/CrdsValue ≈ 4.424 GB/s

^ Although in order to be sending that much traffic, the node would have to be receiving, at the lowest, 491 MB/s:

4.424 GB/s / 9 push fanout peers ≈ 491 MB/s

With a staked node on testnet, it is closer to:

4.424 GB/s / 4 push fanout peers ≈ 1.1 GB/s

With this same staked node on testnet:
Push burst @ ~192 MB/s on validator startup.
Push steady state @ ~10 MB/s.

So our steady state push bandwidth is pretty low. But the peak is pretty significant and will likely be higher for higher-staked nodes.
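
For reference, a small Rust snippet reproducing the napkin math above; the 910 calls/s, 1187-byte max CrdsValue size, and fanout values are the estimates quoted in this thread, not constants taken from the code:

```rust
fn main() {
    let calls_per_sec: f64 = 910.0;     // estimated new_push_requests calls per second
    let values_per_call: f64 = 4096.0;  // MAX_NUM_PUSHES
    let max_value_bytes: f64 = 1187.0;  // estimated max CrdsValue size

    let values_per_sec = calls_per_sec * values_per_call;            // 3,727,360
    let max_egress_bytes_per_sec = values_per_sec * max_value_bytes; // ~4.42e9

    println!("max egress: {:.3} GB/s", max_egress_bytes_per_sec / 1e9);
    println!("ingress needed @ fanout 9: {:.1} MB/s", max_egress_bytes_per_sec / 9.0 / 1e6);
    println!("ingress needed @ fanout 4: {:.1} GB/s", max_egress_bytes_per_sec / 4.0 / 1e9);
}
```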

4096 is chosen to ensure an unstaked node, not receiving any push messages, can keep up with the demand of incoming pull requests while also being able to send out its own ContactInfo. Initially, an unstaked node only receives data via pull requests, so the data from the pull responses fills up its table quickly. The problem is that the node cannot push out all of the new CrdsValues quickly enough before the node refreshes its ContactInfo in the table and the node's ContactInfo gets pushed to the end of the table.

Out of the 910 calls to new_push_requests per second, only 10 of them are run within the run_gossip set of threads. The other ~900/s are called via the handle_batch_push_messages set of threads. But if a node is not receiving any push messages, handle_batch_push_messages exits early, so no calls to new_push_requests are made.

As a result, the node only has 10 threads dedicated to sending push messages, but those 10 threads in their previous state (before this PR), cannot send enough push messages before the node's ContactInfo gets refreshed.

All that is to say, we need a high enough limit in the new_push_requests function to send enough data that the node can send its ContactInfo via push, using just the run_gossip threads, before it gets refreshed.
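
A quick back-of-the-envelope check of that argument, using only the numbers quoted in this comment (the ~10 run_gossip calls/s and the 4096-value cap):

```rust
fn main() {
    // Only the ~10 run_gossip calls per second are guaranteed when a node
    // receives no push messages, so the per-call cap bounds how fast its
    // pending push queue can drain in the worst case.
    let run_gossip_calls_per_sec: u64 = 10;
    let max_num_pushes: u64 = 1 << 12; // 4096
    println!(
        "worst-case drain capacity: {} CrdsValues/s",
        run_gossip_calls_per_sec * max_num_pushes // 40,960
    );
}
```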


I think that max calculation does not take into account that the frequency at which the function is called limits how many push messages are generated in each call.

For example, for the function to be called ~1000 times per second, each call should only take 1 ms, in which case it cannot generate many push messages in that short period of time.

In other words, more frequent calls => fewer push messages per call.


Ahh, you are right, it does not. Good point.
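
A small illustration of the point in the exchange above, with purely hypothetical numbers: each call can only push values inserted since the previous call, so sustained egress is bounded by the CRDS ingest rate rather than by calls/s times MAX_NUM_PUSHES:

```rust
fn main() {
    // Hypothetical CRDS insert rate; the point is the shape of the curve,
    // not the absolute numbers.
    let ingest_values_per_sec: f64 = 50_000.0;
    let max_num_pushes: f64 = 4096.0;
    for calls_per_sec in [10.0_f64, 100.0, 1000.0] {
        // Each call sees roughly the values inserted since the previous call,
        // capped at MAX_NUM_PUSHES.
        let per_call = (ingest_values_per_sec / calls_per_sec).min(max_num_pushes);
        println!(
            "{:>4} calls/s -> ~{:>4.0} values/call -> ~{:>6.0} values/s pushed",
            calls_per_sec,
            per_call,
            per_call * calls_per_sec
        );
    }
}
```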

@behzadnouri behzadnouri requested a review from bw-solana October 1, 2024 00:01

@bw-solana bw-solana left a comment


LGTM


@gregcusack gregcusack left a comment


lgtm!

@behzadnouri behzadnouri added the automerge label (Merge this Pull Request automatically once CI passes) Oct 1, 2024
@mergify mergify bot merged commit 4f423a5 into v2.0 Oct 1, 2024
39 checks passed
@mergify mergify bot deleted the mergify/bp/v2.0/pr-3016 branch October 1, 2024 01:01