v2.0: reworks max number of outgoing push messages (backport of #3016) #3038
max_bytes for outgoing push messages is pretty outdated and does not allow gossip to function properly with the current testnet cluster size. In particular, it does not allow the queue of pending push messages to be cleared unless the new_push_messages function is called very frequently, which involves repeatedly locking/unlocking the CRDS table. Additionally, leaving gossip entries in the queue for the next round adds delay to propagating push messages, which can compound as messages go through several hops. (cherry picked from commit 489f483)
@@ -180,10 +174,10 @@ impl CrdsGossipPush {
usize, // number of values
usize, // number of push messages
) {
const MAX_NUM_PUSHES: usize = 1 << 12;
Why was 4k chosen? Any idea how close we get to this limit during steady state? Or what we burst up to?
If we were to sustain at this rate, how much egress would that be? Something like 300Mbps?
Mostly based on testing on an unstaked node in East Asia and on mainnet/testnet metrics, plus some margin for spikes. This should leave enough margin during steady state, with the caveat below.
How often this function is invoked partly depends on how often the node receives push messages, so I don't have a mathematical mapping between this limit and the egress rate. But I think we have enough metrics to monitor this on testnet and get some estimation.
For now this seems like a working patch addressing the contact-info propagation issue for unstaked East Asia nodes.
The max possible limit is actually pretty high which may be a concern. Some napkin math puts our max rate at about 4.4 GB/s:
910 calls to new_push_requests/s
4096 CrdsValues/call
Max CrdsValue size: 1187 bytes/CrdsValue
910 call/s * 4096 CrdsValue/call = 3,727,360 CrdsValue/s
3,727,360 CrdsValue/s * 1187 bytes/CrdsValue = 4.424 GB/s
Although, in order to send that much traffic, the node would have to be receiving at least 491 MB/s:
4.424 GB/s / 9 push fanout peers = 491 MB/s
With a staked node on testnet, it is closer to:
4.424 GB/s / 4 push fanout peers = 1.1 GB/s
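The napkin math above can be reproduced with a few lines of Rust. This is only a sketch of the arithmetic: the three constants are the figures quoted in this comment, and the function names are made up for illustration, not taken from the validator code.

```rust
// Figures quoted in the comment above; nothing here is read from a running node.
const CALLS_PER_SEC: u64 = 910; // observed calls to new_push_requests per second
const MAX_NUM_PUSHES: u64 = 1 << 12; // 4096 CrdsValues per call
const MAX_CRDS_VALUE_BYTES: u64 = 1187; // max CrdsValue size in bytes

/// Hypothetical helper: worst-case egress in bytes/s if every call hit the cap
/// and every CrdsValue had the maximum size.
fn max_egress_bytes_per_sec() -> u64 {
    CALLS_PER_SEC * MAX_NUM_PUSHES * MAX_CRDS_VALUE_BYTES
}

/// Hypothetical helper: ingress in bytes/s needed to sustain that egress,
/// given that each received value is pushed out to `fanout` peers.
fn min_ingress_bytes_per_sec(fanout: u64) -> u64 {
    max_egress_bytes_per_sec() / fanout
}
```

Calling `max_egress_bytes_per_sec()` gives 4,424,376,320 bytes/s (about 4.424 GB/s), and `min_ingress_bytes_per_sec(9)` and `min_ingress_bytes_per_sec(4)` give roughly 491 MB/s and 1.1 GB/s respectively.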
With this same staked node on testnet:
Push Burst @ ~192 MB/s on validator startup.
Push Steady State @ ~10MB/s
So our steady state push bandwidth is pretty low. But, the peak is pretty significant and will likely be higher for higher staked nodes.
4096 is chosen to ensure an unstaked node, not receiving any push messages, can keep up with the demand of incoming pull requests while also being able to send out its own ContactInfo. Initially an unstaked node only receives data via pull requests, so the data from the pull responses fills up its table quickly. The problem is that the node cannot push out all of the new CrdsValues quickly enough before it refreshes its ContactInfo in the table and the node's ContactInfo gets pushed to the end of the table.
Out of the 910 calls to new_push_requests per second, only 10 of them are run within the run_gossip set of threads. The other ~900/s are called via the handle_batch_push_messages set of threads. But if a node is not receiving any push messages, handle_batch_push_messages exits early, so no calls to new_push_requests are made.
As a result, the node only has 10 threads dedicated to sending push messages, but those 10 threads in their previous state (before this PR), cannot send enough push messages before the node's ContactInfo gets refreshed.
All that is to say that we need a high enough limit in the new_push_requests function to send enough data so that the node can send its ContactInfo via push using just the run_gossip threads before it gets refreshed.
I think that max calculation does not take into account that the frequency at which the function is called limits how many push messages are generated in each call.
For example, for the function to be called ~1000 times per second, each call can take only about 1 ms, in which case it cannot generate many push messages in that short period of time.
In other words, more frequent calls => fewer push messages per call.
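To make that trade-off concrete, here is a rough model in Rust. It assumes a single caller whose calls run back to back, so a call rate of `calls_per_sec` leaves at most `1 / calls_per_sec` seconds per call. The per-message cost `nanos_per_message` is a hypothetical input, not a measured value, and the function name is invented for this sketch.

```rust
const MAX_NUM_PUSHES: u64 = 1 << 12;

/// Upper bound on push messages generated per call under the simplified
/// single-caller model described above. Returns None if either input is zero.
fn max_messages_per_call(calls_per_sec: u64, nanos_per_message: u64) -> Option<u64> {
    // Time budget available to one call, in nanoseconds.
    let budget_nanos = 1_000_000_000u64.checked_div(calls_per_sec)?;
    // How many messages fit in that budget at the assumed per-message cost.
    let fit = budget_nanos.checked_div(nanos_per_message)?;
    // The per-call cap still applies on top of the time budget.
    Some(fit.min(MAX_NUM_PUSHES))
}
```

With an assumed cost of 1 µs per message, 1000 calls/s allows at most 1000 messages per call, well under the cap, whereas 10 calls/s is limited by `MAX_NUM_PUSHES` itself.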
ahh you are right it does not. good point
LGTM
lgtm!
Problem
max_bytes for outgoing push messages is pretty outdated and does not allow gossip to function properly with the current testnet cluster size. In particular, it does not allow the queue of pending push messages to be cleared unless the new_push_messages function is called very frequently, which involves repeatedly locking/unlocking the CRDS table. Additionally, leaving gossip entries in the queue for the next round adds delay to propagating push messages, which can compound as messages go through several hops.
Summary of Changes
The commit reworks outbound limit to allow more messages to be pushed out each time new_push_messages is invoked.
This is an automatic backport of pull request #3016 done by Mergify.
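As a rough illustration of a count-based limit (as opposed to a byte budget), the sketch below drains a pending queue and stops once the number of generated push messages reaches a cap. It is not the actual implementation: the `VecDeque<String>` queue, the `u32` peer ids, and the function name are stand-ins chosen for this example, and only the return shape (messages, number of values, number of push messages) mirrors the diff above.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a capped push round. Each pending value is fanned
/// out to every peer; draining stops once `max_num_pushes` messages exist.
fn new_push_messages_sketch(
    pending: &mut VecDeque<String>,
    peers: &[u32],
    max_num_pushes: usize,
) -> (
    HashMap<u32, Vec<String>>,
    usize, // number of values
    usize, // number of push messages
) {
    let mut out: HashMap<u32, Vec<String>> = HashMap::new();
    let mut num_values = 0;
    let mut num_pushes = 0;
    while num_pushes < max_num_pushes {
        // Stop early when nothing is left to push.
        let Some(value) = pending.pop_front() else { break };
        num_values += 1;
        for peer in peers {
            out.entry(*peer).or_default().push(value.clone());
            num_pushes += 1;
        }
    }
    // Anything still in `pending` waits for the next round.
    (out, num_values, num_pushes)
}
```

A higher cap lets more of the queue drain in a single call, which is the effect the summary describes; values left in the queue are delayed until the next invocation.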