[Signer] Race condition in changing sortition view vs validation endpoint causing reorg #5496

Closed
jferrant opened this issue Nov 22, 2024 · 1 comment

jferrant commented Nov 22, 2024

While investigating the Nakamoto testnet, @obycode saw the following scenario:

- 07:03:16.298 Signer #0 receives the proposal for 2465/6663
- 07:03:57.517 Signer submits block for validation
- 07:03:58.613 Node begins validating block
- 07:03:59.162 Signer sees burn block 2466
- 07:03:59.166 Signer receives proposal for 2466/6663
- 07:04:07.801 Node finishes validating block
- 07:05:39.845 Signer reports that it cannot submit 2466/6663 for validation since it is waiting for another block validation
- 07:05:39.850 Signer receives block validation for 2465/6663
- 07:05:39.922 Signer sends block response (accepted) for 2465/6663
- 07:05:40.750 Signer sees burn block 2467
- 07:05:40.754 Signer receives block proposal for 2467/6663
- 07:06:42.914 Signer reports that last miner timed out, marking as invalid
- 07:07:02.478 Signer reports "Most recent miner's tenure does not build off the prior sortition, checking if this is valid behavior"
- 07:07:21.717 Signer submits block 2467/6663 for validation
- 07:07:22.204 Signer receives block validation for 2466/6663
- 07:07:22.206 Signer sends block response (accepted) for 2466/6663
- 07:07:23.042 Signer receives block validation for 2467/6663
- 07:07:23.045 Signer sends block response (accepted) for 2467/6663

I looked into it and this is what I think is happening:

Both 2466/6663 and 2467/6663 attempt to reorg 2465/6663, which is correct. From the signer's viewpoint, both proposals are valid: no 2466/6663 is locally accepted at the time 2467/6663 is proposed, so both are checked against our globally accepted and locally accepted blocks, and both pass that initial check. Then the validation result for the first submission comes back and sits in the process_event queue. We then get the second proposal; again it passes the checks and is submitted to the validation endpoint (successfully, since the other had already finished processing). Then the validation response for 2466/6663 is handled and it becomes locally approved. Then the validation response for 2467/6663 is handled and it is AUTO approved without checking whether the view has changed in the meantime. We need an additional check after getting the validation back to make sure nothing has happened in between.
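To illustrate the kind of check described above, here is a minimal sketch in Rust. It is not the stacks-core implementation: `Proposal`, `Signer`, `passes_view_check`, `on_validation_ok`, and the `replaces_tenure` field are all hypothetical names, and the model assumes the first number in `2466/6663` is the burn block (tenure) and the second is the Stacks block height. The idea is only that the same view check run at proposal time is run again when the validation response is handled.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified proposal (not the stacks-core type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Proposal {
    /// Burn block (tenure) in which the block was proposed, e.g. 2466.
    burn_height: u64,
    /// Stacks block height, e.g. 6663.
    stacks_height: u64,
    /// Tenure whose block at this height the proposal intends to reorg.
    replaces_tenure: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum Response {
    Accepted,
    Rejected,
}

/// Hypothetical signer state: for each Stacks height, the tenure of the
/// most recent locally accepted block.
#[derive(Default)]
struct Signer {
    locally_accepted: HashMap<u64, u64>,
}

impl Signer {
    /// View check used both at proposal time and after validation.
    /// A proposal passes if nothing is accepted at its height yet, or if
    /// the block it claims to replace is still the latest accepted one.
    fn passes_view_check(&self, p: &Proposal) -> bool {
        match self.locally_accepted.get(&p.stacks_height) {
            None => true,
            Some(&tenure) => tenure == p.replaces_tenure,
        }
    }

    /// Handle a successful validation response. The check is repeated here
    /// because the signer's view may have changed while the block was
    /// being validated or while the response sat in the event queue.
    fn on_validation_ok(&mut self, p: Proposal) -> Response {
        if !self.passes_view_check(&p) {
            return Response::Rejected;
        }
        self.locally_accepted.insert(p.stacks_height, p.burn_height);
        Response::Accepted
    }
}
```

Replaying the scenario in this model: with 2465/6663 locally accepted, both 2466/6663 and 2467/6663 pass the check at proposal time. Once the validation response for 2466/6663 is handled and that block becomes locally accepted, the repeated check rejects 2467/6663 instead of auto-approving it.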

@jferrant jferrant self-assigned this Nov 22, 2024
@github-project-automation github-project-automation bot moved this to Status: 🆕 New in Stacks Core Eng Nov 22, 2024
@aldur aldur added this to the 3.1.0.0.3 milestone Dec 16, 2024
@aldur aldur moved this from Status: 🆕 New to Status: In Review in Stacks Core Eng Dec 16, 2024
@aldur aldur moved this from Status: In Review to Status: ✅ Done in Stacks Core Eng Jan 7, 2025
@jferrant jferrant closed this as completed Jan 8, 2025
@blockstack-devops

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Jan 16, 2025