Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider connection count of staked nodes when calculating allowed PPS #943

Conversation

lijunwangs
Copy link

Problem

The connection count was not considered in the staked node which can allow the stake node to reach 8 times of the designed rate allowed for it.

Summary of Changes

Divide the allowed PPS for staked node among its concurrent connections.
The StreamCounter is enhanced with the connection_count. The connection count is maintained when the connection is added and removed accordingly.

Fixes #

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 95.23810% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 81.8%. Comparing base (6aacbf3) to head (d5ebd8f).

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #943     +/-   ##
=========================================
- Coverage    81.9%    81.8%   -0.1%     
=========================================
  Files         853      853             
  Lines      231812   231831     +19     
=========================================
- Hits       189867   189846     -21     
- Misses      41945    41985     +40     

Copy link

@ryleung-solana ryleung-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a concrete reported issue for this? It seems like we're at this point theorizing on potential spam/attacks, and going a little overboard restricting everything, with potentially unintended consequences. I would prefer a specific reproducible attack that this is shown to solve, as well as perf benchmarks to show a lack of regression in the "happy" or "attack free" case.

@lijunwangs
Copy link
Author

Do we have a concrete reported issue for this? It seems like we're at this point theorizing on potential spam/attacks, and going a little overboard restricting everything, with potentially unintended consequences. I would prefer a specific reproducible attack that this is shown to solve, as well as perf benchmarks to show a lack of regression in the "happy" or "attack free" case.

This is to enforce our design -- the designed PPS for the whole port is set to a number, without doing this, it is easily going over the desired limit when there are multiple connections from other staked node. We observed this phenomena in mainnet via metrics. This is not theory -- it is based on design intent and the metrics.

@lijunwangs
Copy link
Author

Look at the packets sent down on mb on tpu forward port -- our designed limit was 500k PPS, and see how far it can be exceeded and it was exceeded.

Screenshot 2024-04-22 at 8 03 33 AM

@lijunwangs
Copy link
Author

That said, I checked with @pgarg66 -- ConnectionStreamCounter itself is already considering stream count across all connections for a given staked node and the limit is also set across the connections. So it should work in master. It is only an issue in v1.17 based branch where we count streams per connection.

@lijunwangs lijunwangs closed this Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants