[Bug] Deadlock with rayon usage #3063
Comments
This one feels like it's going to be tricky, but I'll try to investigate it soon.
We did initial passes, but were unable to reproduce this. Putting this on a lower priority, but will keep an eye out and revisit this.
Experienced another validator deadlock on a low-resource test network that was spammed with transactions and deployments. The evidence that it was a deadlock: the validator's process would not terminate after being sent a SIGTERM.
Is this still an issue after #3321?
Personally, I haven't seen this in months, since several improvement PRs were merged. We can close this if the comment above from vicsn is not of any concern.
I'm in favour of closing. This issue will remain in the GitHub history for context if a deadlock appears again.
🐛 Bug Report
There is a rayon-related deadlock in snarkOS, but I'm not quite sure which situation it actually is:
spawn_blocking
applies here). Maybe see this or this.
I think it's probably the first one: in a deadlock core dump, I saw a write lock being acquired while the node was stuck at a read lock. Here is the full backtrace of all threads. (It is a large text file, as rayon tends to generate a deep stack. The file is actually .7z but had to be named .zip to upload here.) Notice that thread 69 holds the write lock to vm.process while trying to advance a block, while many threads that are trying to validate incoming unconfirmed transactions need a read lock.
Steps to Reproduce
Not sure. Run the node with a large number of connections?
Expected Behavior
The node should not deadlock.
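The lock state described in the report (one thread holding the write lock while other threads need a read lock) can be illustrated with a minimal, std-only sketch. It does not use rayon and does not reproduce the deadlock itself; it only shows that readers cannot proceed while the writer holds the lock, which becomes a hang if the writer is in turn waiting on work those blocked threads would have to run. The Process struct, its height field, and the function names are hypothetical stand-ins, not the real snarkOS or snarkVM types.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the state guarded by the lock
/// (not the real snarkVM type).
struct Process {
    height: u32,
}

/// Returns true if a reader on another thread can take the read lock
/// right now. `try_read` never blocks, so this sketch cannot hang.
fn reader_can_proceed(lock: &Arc<RwLock<Process>>) -> bool {
    let lock = Arc::clone(lock);
    thread::spawn(move || lock.try_read().is_ok())
        .join()
        .expect("reader thread panicked")
}

/// Returns (reader succeeded while the write lock was held,
///          reader succeeded after release,
///          final height).
fn demo() -> (bool, bool, u32) {
    let process = Arc::new(RwLock::new(Process { height: 0 }));

    // "Advance a block": take the write lock and mutate the state.
    let mut guard = process.write().unwrap();
    guard.height += 1;

    // A "validation" reader arriving now cannot get the read lock.
    // With a blocking `read()` it would wait here until the writer
    // finishes; if the writer were waiting on that thread, neither
    // side could make progress.
    let during = reader_can_proceed(&process);

    // Release the write lock; readers can proceed again.
    drop(guard);
    let after = reader_can_proceed(&process);

    let height = process.read().unwrap().height;
    (during, after, height)
}
```

Calling demo() from a main function and printing the tuple shows the reader failing while the write guard is alive and succeeding once it is dropped. One common way to avoid this class of hang, stated here as a general assumption rather than as the fix adopted in snarkOS, is to avoid waiting on a shared thread pool while holding a write lock that the pool's other tasks need.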
Your Environment