Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update network survey script to support V2 survey #4342

Merged
merged 3 commits into from
Jun 4, 2024

Conversation

bboston7
Copy link
Contributor

@bboston7 bboston7 commented Jun 1, 2024

Description

Resolves #4302

This change updates the network overlay survey script to support the new V2 survey API.

To test this, I also updated the simulator to simulate the new API, then I checked that the resulting graph from simulating an old pubnet survey is structurally isomorphic to the input graph. I also checked that the new fields are present in the output graph.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

Resolves stellar#4302

This change updates the network overlay survey script to support the new
V2 survey API.

To test this, I also updated the simulator to simulate the new API, then
I checked that the resulting graph from simulating an old pubnet
survey is structurally isomorphic to the input graph. I also checked
that the new fields are present in the output graph.
Copy link
Contributor

@marta-lokhova marta-lokhova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

scripts/README.md Outdated Show resolved Hide resolved
scripts/OverlaySurvey.py Show resolved Hide resolved
scripts/OverlaySurvey.py Show resolved Hide resolved
endpoint to send requests to.
"""
request_url = url_base + "/surveytopologytimesliced"
logger.info("Requesting survey data from %s peers", len(peer_list))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more request rate throttling? (previous version of this code limited number of requests per ledger to 10)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the request throttling because stellar-core has its own builtin request throttling (see SurveyManager::topOffRequests). Specifically, stellar-core sends 10 requests from its queue every 15 seconds.

Even with the old request rate throttling this queue would build up because the script would add 10 nodes every 5 seconds, but stellar-core would only send a batch of 10 every 15 seconds.

That being said, I'm happy to add back in request rate throttling as a extra safety measure if you think it's necessary. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I'm just trying to recall why we added this throttling in the first place. I agree though that it's weird to have throttling in both places, and ideally core should properly deal with all the requests. Looks like this PR added throttling because survey on the core side used to expire too quickly, I think this was related to the old -d flag. This shouldn't be the case with the new-style survey, since our reporting phase is now long by default, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. Digging into it, I think without the throttling in the script there's a weird race condition that would prematurely end the survey on the stellar-core side immediately after processing a large batch of requests. Trickling in requests causes the expiration to be regularly extended and avoids the premature expiration.

The V2 survey removed the duration parameter and instead uses the survey phase to determine when the survey is finished.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. I like that in this version we move the decision to end survey to the script itself after some inactivity rather than try to end in core.

scripts/OverlaySurvey.py Outdated Show resolved Hide resolved
scripts/OverlaySurvey.py Outdated Show resolved Hide resolved
Copy link
Contributor

@marta-lokhova marta-lokhova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@marta-lokhova
Copy link
Contributor

r+ bcf2e59

@latobarita latobarita merged commit 4a7b148 into stellar:master Jun 4, 2024
15 checks passed
@bboston7 bboston7 deleted the survey-script branch September 16, 2024 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update overlay survey script to support new time sliced survey
3 participants