Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed potential deadlock in FbConnectionPoolManager #1076

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Howaner
Copy link

@Howaner Howaner commented Oct 14, 2022

Our ASP.NET Core application has been crashing about once a week lately with a lot of open threads and a deadlock when using the FbConnection.OpenAsync() function.

I now found the following deadlock:

  1. Pool.PrunePool(): Parallel.ForEach(release, x => x.Release()); is the main deadlock cause.
  2. Release() can call the Pool.ReleaseConnection() method and this method locks the same lock
  3. So now, PrunePool() is already locking the _syncRoot lock and ReleaseConnection() wants to lock the _syncRoot lock
  4. But PrunePool() is waiting for Release() to complete => Deadlock

The FbConnectionInternal.Disconnect() method is already called in different places without locking the _syncRoot lock, so I assume it can be called outside the lock.

@Howaner
Copy link
Author

Howaner commented Nov 10, 2022

@cincuranet Is there a chance to integrate this into upstream? Is there anything else I need to do?

@cincuranet
Copy link
Member

I need to review the PR first. Currently busy.

@fdcastel
Copy link
Contributor

@Howaner do you have a reproductible test case which shows the problem?

Maybe adding a test case to this PR?

@Howaner
Copy link
Author

Howaner commented Jul 12, 2023

@fdcastel It is a race condition and therefore quite difficult to trigger. I tried to reproduce it in a test project today but I failed.
The race condition occurs when using the Firebird library in an application where many things happen simultaneously.
Our application crashed every few weeks due to this bug and since we changed this few lines, the deadlock has no longer occurred.

I'm really surprised that the bug doesn't already occur much more often with other people. Probably only a few people make many simultaneous requests with Firebird.

@fdcastel
Copy link
Contributor

I understand. Surely they are a hell to reproduce.

Thank you for you feedback. Much appreciated.

@willibrandon
Copy link
Contributor

I too have started hitting this issue in our test projects after updating to version 9.1.1.0. I'll try and see if I can write a minimal reproducible test case.

@fdcastel
Copy link
Contributor

@willibrandon, @Howaner Have you had any success reproducing the issue?

@Howaner, if I understand correctly, you've been running a fork of this project in production with this PR applied. Have you never encountered any further errors since applying it?

@willibrandon, would you be able to try the same approach?

While I understand the changes @Howaner made in this PR, it's challenging to approve them based solely on verbal explanations (no offense intended, please). As engineers, I trust you both will understand.

@willibrandon
Copy link
Contributor

We saw this issue pop up in our test runs after updating the data provider version for the first time in a decade. I was not successful reproducing this issue however, and have not seen the it reoccur since.

@PaladinMattt - Do you recall the last time you saw this one?

@PaladinMattt
Copy link

It has been some time since I've noticed it. If my memory serves me correctly, it mainly happened during debugging, when calling FbConnection.ClearAllPools();, but I don't recalling seeing this issue for a few months now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants