
v2.0: reworks max number of outgoing push messages (backport of #3016) #3038

Merged
merged 1 commit into v2.0 from mergify/bp/v2.0/pr-3016 on Oct 1, 2024

Conversation


@mergify mergify bot commented Sep 30, 2024

Problem

max_bytes for outgoing push messages is pretty outdated and does not allow gossip to function properly with the current testnet cluster size.
In particular, it does not allow the queue of pending push messages to be cleared unless the new_push_messages function is called very frequently, which involves repeatedly locking/unlocking the CRDS table.
Additionally, leaving gossip entries in the queue for the next round adds delay to propagating push messages, which can compound as messages go through several hops.

Summary of Changes

The commit reworks the outbound limit to allow more messages to be pushed out each time new_push_messages is invoked.
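
A minimal sketch of the idea, assuming a count-based cap like the MAX_NUM_PUSHES constant visible in the diff below; the struct and queue type here are simplified stand-ins for illustration, not the actual crate internals:

```rust
use std::collections::VecDeque;

// Cap on how many values one call may push out, as in the diff below.
const MAX_NUM_PUSHES: usize = 1 << 12; // 4096

struct PushQueue {
    // Placeholder for the queue of pending, serialized CrdsValues.
    pending: VecDeque<Vec<u8>>,
}

impl PushQueue {
    /// Drain up to MAX_NUM_PUSHES values per invocation instead of stopping
    /// at a byte budget; anything left stays queued for the next round.
    fn new_push_messages(&mut self) -> Vec<Vec<u8>> {
        let num = self.pending.len().min(MAX_NUM_PUSHES);
        self.pending.drain(..num).collect()
    }
}
```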


This is an automatic backport of pull request #3016 done by Mergify.


(cherry picked from commit 489f483)
@mergify mergify bot requested a review from a team as a code owner September 30, 2024 17:39
@@ -180,10 +174,10 @@ impl CrdsGossipPush {
usize, // number of values
usize, // number of push messages
) {
const MAX_NUM_PUSHES: usize = 1 << 12;


Why was 4k chosen? Any idea how close we get to this limit during steady state? Or what we burst up to?

If we were to sustain at this rate, how much egress would that be? Something like 300Mbps?


Mostly based on testing on an unstaked node in East Asia, mainnet/testnet metrics, plus some margin for spikes. This should leave enough margin during steady state, with the caveat below.

How often this function is invoked partly depends on how often the node receives push messages, so I don't have a mathematical mapping between this limit and the egress rate. But I think we have enough metrics to monitor this on testnet and get some estimation.

For now this seems like a working patch addressing the contact-info propagation issue for unstaked East Asia nodes.


The max possible limit is actually pretty high which may be a concern. Some napkin math puts our max rate at about 4.4 GB/s:

- 910 calls to new_push_requests per second
- 4096 CrdsValues per call
- max CrdsValue size: 1187 bytes per CrdsValue

910 calls/s * 4096 CrdsValues/call = 3,727,360 CrdsValues/s
3,727,360 CrdsValues/s * 1187 bytes/CrdsValue ≈ 4.424 GB/s

^ Although in order to be sending that much traffic, the node would have to be receiving, at the lowest, 491 MB/s:

4.424 GB/s / 9 push fanout peers ≈ 491 MB/s

With a staked node on testnet, it is closer to:

4.424 GB/s / 4 push fanout peers ≈ 1.1 GB/s

With this same staked node on testnet:
Push burst @ ~192 MB/s on validator startup.
Push steady state @ ~10 MB/s.

So our steady state push bandwidth is pretty low. But the peak is pretty significant and will likely be higher for higher-staked nodes.
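
For reference, a small Rust snippet reproducing the napkin math above; the 910 calls/s, 1187-byte max CrdsValue size, and fanout values are the estimates quoted in this thread, not constants taken from the code:

```rust
fn main() {
    let calls_per_sec: f64 = 910.0;     // estimated new_push_requests calls per second
    let values_per_call: f64 = 4096.0;  // MAX_NUM_PUSHES
    let max_value_bytes: f64 = 1187.0;  // estimated max CrdsValue size

    let values_per_sec = calls_per_sec * values_per_call;            // 3,727,360
    let max_egress_bytes_per_sec = values_per_sec * max_value_bytes; // ~4.42e9

    println!("max egress: {:.3} GB/s", max_egress_bytes_per_sec / 1e9);
    println!("ingress needed @ fanout 9: {:.1} MB/s", max_egress_bytes_per_sec / 9.0 / 1e6);
    println!("ingress needed @ fanout 4: {:.1} GB/s", max_egress_bytes_per_sec / 4.0 / 1e9);
}
```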

4096 is chosen to ensure an unstaked node, not receiving any push messages, can keep up with the demand of incoming pull requests while also being able to send out its own ContactInfo. Initially, an unstaked node only receives data via pull requests, so the data from the pull responses fills up its table quickly. The problem is that the node cannot push out all of the new CrdsValues quickly enough before the node refreshes its ContactInfo in the table and the node's ContactInfo gets pushed to the end of the table.

Out of the 910 calls to new_push_requests per second, only 10 of them are run within the run_gossip set of threads. The other ~900/s are called via the handle_batch_push_messages set of threads. But if a node is not receiving any push messages, handle_batch_push_messages exits early, so no calls to new_push_requests are made.

As a result, the node only has 10 threads dedicated to sending push messages, but those 10 threads in their previous state (before this PR), cannot send enough push messages before the node's ContactInfo gets refreshed.

All that is to say, we need a high enough limit in the new_push_requests function to send enough data that the node can send its ContactInfo via push, using just the run_gossip threads, before it gets refreshed.
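
A quick back-of-the-envelope check of that argument, using only the numbers quoted in this comment (the ~10 run_gossip calls/s and the 4096-value cap):

```rust
fn main() {
    // Only the ~10 run_gossip calls per second are guaranteed when a node
    // receives no push messages, so the per-call cap bounds how fast its
    // pending push queue can drain in the worst case.
    let run_gossip_calls_per_sec: u64 = 10;
    let max_num_pushes: u64 = 1 << 12; // 4096
    println!(
        "worst-case drain capacity: {} CrdsValues/s",
        run_gossip_calls_per_sec * max_num_pushes // 40,960
    );
}
```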


I think that max calculation does not take into account that the frequency at which the function is called limits how many push messages are generated in each call.

For example, for the function to be called ~1000 times per second, each call should only take 1 ms, in which case it cannot generate many push messages in that short period of time.

In other words, more frequent calls => fewer push messages per call.


Ahh, you are right, it does not. Good point.
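
A small illustration of the point in the exchange above, with purely hypothetical numbers: each call can only push values inserted since the previous call, so sustained egress is bounded by the CRDS ingest rate rather than by calls/s times MAX_NUM_PUSHES:

```rust
fn main() {
    // Hypothetical CRDS insert rate; the point is the shape of the curve,
    // not the absolute numbers.
    let ingest_values_per_sec: f64 = 50_000.0;
    let max_num_pushes: f64 = 4096.0;
    for calls_per_sec in [10.0_f64, 100.0, 1000.0] {
        // Each call sees roughly the values inserted since the previous call,
        // capped at MAX_NUM_PUSHES.
        let per_call = (ingest_values_per_sec / calls_per_sec).min(max_num_pushes);
        println!(
            "{:>4} calls/s -> ~{:>4.0} values/call -> ~{:>6.0} values/s pushed",
            calls_per_sec,
            per_call,
            per_call * calls_per_sec
        );
    }
}
```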

@behzadnouri behzadnouri requested a review from bw-solana October 1, 2024 00:01

@bw-solana bw-solana left a comment


LGTM


@gregcusack gregcusack left a comment


lgtm!

@behzadnouri behzadnouri added the automerge label (Merge this Pull Request automatically once CI passes) Oct 1, 2024
@mergify mergify bot merged commit 4f423a5 into v2.0 Oct 1, 2024
39 checks passed
@mergify mergify bot deleted the mergify/bp/v2.0/pr-3016 branch October 1, 2024 01:01