Summary
When the binding idles for roughly two minutes, the server sends an `IDLE_SESSION_TIMEOUT_ERROR`, indicating that the binding should disconnect until another query is ready to run. The binding's logic resets the connection state and only attempts to reconnect on the next query call. The problem lies in how this logic is implemented with Netty.
On a reset, the client replaces two promises: one that signals when the client is connected, and one that signals the ready state (when the binding receives a `READY_FOR_COMMAND` message). The issue lies with the connection ready state, since our code looks like the following pseudo-code:
We rely on the connection ready event to perform the reconnect logic. The reason for this is the following states:
During the reconnect logic, if the binding detected a disconnect, it would call the client's `connect` method, which is responsible for resetting the connection state, including both promises. It reset the connection promise by triggering a custom Netty pipeline event that resets the promise:
The problem was that this pipeline event was not triggered synchronously with the connection reset code, so the client would keep the old, already completed promise and go through the reconnect cycle again. Because that cycle is controlled by a lock, this caused a deadlock.
This PR fixes that by making the promise reset synchronous: the caller now runs the reset itself instead of handing it to the Netty pipeline thread.
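
To illustrate the difference between the two reset strategies, here is a minimal, self-contained sketch. It does not use Netty or the binding's real classes: `ConnectionState`, `ResetDemo`, and all method names are hypothetical, a `CompletableFuture` stands in for the connection promise, and a single-threaded executor stands in for the Netty pipeline thread. The executor is deliberately kept busy so that the queued reset cannot run before the caller reads the promise, which makes the stale read reproducible.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the client's connection state (not the binding's real class).
class ConnectionState {
    // Completed once the connection is established.
    private volatile CompletableFuture<Void> connected = new CompletableFuture<>();

    CompletableFuture<Void> connected() { return connected; }

    void markConnected() { connected.complete(null); }

    // Old approach (assumed shape): queue the reset on another thread,
    // standing in for a custom event handled on the Netty pipeline thread.
    void resetViaEventLoop(ExecutorService eventLoop) {
        eventLoop.execute(() -> connected = new CompletableFuture<>());
    }

    // Approach described in this PR: the caller replaces the promise itself.
    void resetOnCaller() { connected = new CompletableFuture<>(); }
}

class ResetDemo {
    // Returns true if the caller still sees the old, completed promise after an async reset.
    static boolean staleAfterAsyncReset() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CountDownLatch busy = new CountDownLatch(1);
        try {
            // Keep the "event loop" occupied so the queued reset cannot run yet.
            eventLoop.execute(() -> {
                try {
                    busy.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            ConnectionState state = new ConnectionState();
            state.markConnected();
            state.resetViaEventLoop(eventLoop);
            // The reset is still queued, so this reads the old promise.
            return state.connected().isDone();
        } finally {
            busy.countDown();      // let the queued tasks run
            eventLoop.shutdown();  // then let the executor thread exit
        }
    }

    // Returns true if the caller still sees the old, completed promise after a sync reset.
    static boolean staleAfterSyncReset() {
        ConnectionState state = new ConnectionState();
        state.markConnected();
        state.resetOnCaller();
        return state.connected().isDone();
    }

    public static void main(String[] args) {
        System.out.println("async reset, stale promise seen: " + staleAfterAsyncReset());
        System.out.println("sync reset, stale promise seen: " + staleAfterSyncReset());
    }
}
```

In this sketch the asynchronous reset leaves the caller holding a completed promise, while the synchronous reset hands it a fresh, incomplete one. In the real binding, acting on the stale promise is what sent the client back into the lock-guarded reconnect cycle.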