Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: create helper function RelayRecoveredSig inside peerman #6423

Conversation

PastaPastaPasta
Copy link
Member

Issue being fixed or feature implemented

High m_nodes_mutex lock contention during high load

What was done?

this commit should have a few benefits:

  1. previous logic using ForEachNode results in locking m_nodes_mutex, a highly contended RecursiveMutex, AND m_peer_mutex(in GetPeerRef)
  2. prior also resulted in calling .find over the m_peer_map for each node. Basically old logic was (probably) O(n(nlogn) the new logic results in acquiring m_peer_mutex once and looping over the list of peers, (probably) O(n)
  3. Moves networking logic out of llmq/ and into actual net_processing.cpp

How Has This Been Tested?

Hasn't really yet; it builds, but I need to run tests / maybe deploy to testnet mn

Breaking Changes

Checklist:

Go over all the following points, and put an x in all the boxes that apply.

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@PastaPastaPasta PastaPastaPasta added this to the 22.1 milestone Nov 21, 2024
@PastaPastaPasta PastaPastaPasta force-pushed the refactor-net-min-m_nodes_mutex-contention-llmq branch from 9ef40dc to 09a483d Compare November 21, 2024 19:52
@PastaPastaPasta PastaPastaPasta marked this pull request as ready for review November 21, 2024 20:35
@DashCoreAutoGuix
Copy link

Guix Automation has began to build this PR tagged as v22.1.0-devpr6423.09a483dc. A new comment will be made when the image is pushed.

@DashCoreAutoGuix
Copy link

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v22.1.0-devpr6423.09a483dc. The image should be on dockerhub soon.

src/net_processing.cpp Outdated Show resolved Hide resolved
@PastaPastaPasta
Copy link
Member Author

Okay, tested on a testnet node; and appears to work as expected.

Now, disclaimer, this is on 3 testnet nodes. One running rc.1, one running this PR, and one running 6422. It is a bit hard to compare between rc.1 and the PRs as the PRs are built using debug, so their "contended time" is going to be higher, even if actual contention should be lower.

The total percentage of contentions caused by m_nodes_mutex went from ~41% -> 26%.

It did result in a significantly increased contention over m_peer_mutex going from only around 4% -> 10%

But the "contended time" for m_nodes_mutex went from 896839 -> 316444 = 580,395
while the "contended time" for m_peer_mutex went from 75043 -> 192275 = -117,232; so it seems the net "contended time" is down around ~80%

this commit should have a few benefits:
1. previous logic using ForEachNode results in locking m_nodes_mutex, a highly contended RecursiveMutex, AND m_peer_mutex(in GetPeerRef)
2. prior also resulted in calling .find over the m_peer_map for each node. Basically old logic was (probably) O(n(nlogn) the new logic results in acquiring m_peer_mutex once and looping over the list of peers, (probably) O(n)
3. Moves networking logic out of llmq/ and into actual net_processing.cpp
@PastaPastaPasta PastaPastaPasta force-pushed the refactor-net-min-m_nodes_mutex-contention-llmq branch from 09a483d to 4668db6 Compare November 22, 2024 18:29
@DashCoreAutoGuix
Copy link

Guix Automation has began to build this PR tagged as v22.1.0-devpr6423.4668db60. A new comment will be made when the image is pushed.

@@ -633,12 +633,7 @@ void CSigningManager::ProcessRecoveredSig(const std::shared_ptr<const CRecovered
WITH_LOCK(cs_pending, pendingReconstructedRecoveredSigs.erase(recoveredSig->GetHash()));

if (m_mn_activeman != nullptr) {
CInv inv(MSG_QUORUM_RECOVERED_SIG, recoveredSig->GetHash());
connman.ForEachNode([&](const CNode* pnode) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh this change is soo good! Look to this one: 6149288

Maybe you can drop all usages of CConnman from llmq code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye! Awesome, didn't even realize I'd removed all usages

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knst see: #6425 one more bites the dust!

Copy link
Collaborator

@knst knst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 86e92c3

@DashCoreAutoGuix
Copy link

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v22.1.0-devpr6423.4668db60. The image should be on dockerhub soon.

@DashCoreAutoGuix
Copy link

Guix Automation has began to build this PR tagged as v22.1.0-devpr6423.86e92c37. A new comment will be made when the image is pushed.

@DashCoreAutoGuix
Copy link

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v22.1.0-devpr6423.86e92c37. The image should be on dockerhub soon.

Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 86e92c3

@PastaPastaPasta PastaPastaPasta merged commit f9d044d into dashpay:develop Nov 22, 2024
21 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants