
Don't ignore close in the pool when using force #406

Merged
merged 2 commits into from
Dec 26, 2024

Conversation

davidmartos96
Contributor

@davidmartos96 davidmartos96 commented Dec 23, 2024

@isoos There was a pending TODO from my force close implementation in the pool. I believe it's necessary. Otherwise the close call is not actually sent, right?

The tests in https://github.com/davidmartos96/demo_postgres_shelf are failing without these changes. To catch this in the tests, I think we would need to run an empty transaction call, leave it open for a long time with a Future.delayed, and then check that the Postgres server has cleaned up the connection from pg_stat_activity.

Related to that, is the current force close implementation supposed to stop running queries like pg_cancel_backend or pg_terminate_backend do? Or does it just close the socket? It looks like with the current implementation, pg_sleep from the tests is not stopped; rather, no one is listening to the result anymore. Is that correct?
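The check described above could be sketched roughly like this, assuming the v3 API of the postgres package; the endpoint values, the helper name, and the 'my_app' application name are placeholder assumptions, not anything from this PR:

```dart
import 'package:postgres/postgres.dart';

/// Sketch only: after a force close, query pg_stat_activity from a separate
/// admin connection and check whether the server still lists the connection.
/// The application_name value 'my_app' is a hypothetical placeholder.
Future<bool> serverStillSeesConnection(Connection admin) async {
  final result = await admin.execute(
    r'SELECT count(*) FROM pg_stat_activity WHERE application_name = $1',
    parameters: ['my_app'],
  );
  return (result.first.first as int) > 0;
}
```

A test could call this after the Future.delayed window and assert that it returns false once the server has cleaned up the terminated connection.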

@isoos
Owner

isoos commented Dec 23, 2024

Have you checked these tests? They handle the opening-time latency with a completer that completes after the connection / session / transaction is open:
https://github.com/isoos/postgresql-dart/blob/master/test/pool_test.dart#L240-L248

The force call is handled in the _dispose() method, and I think the pool connection's close(force: true) at best should mark the connection to be closed. But to be honest, I'm not entirely convinced that this is better than ignoring it. There could be use cases where ignoring seems better, similarly to when session.close() is called.

@davidmartos96
Contributor Author

@isoos Yes, I've seen the test.

I believe I know what the problem is. My original goal was to support aborting the connection from the pool if an HTTP request times out, without actually force closing the whole pool.
For that I'm using the custom pool implementation shared in davidmartos96/demo_postgres_shelf and in #394 .

Would it make sense to expose dispose for this kind of use case?

This is the important snippet, which overrides the default withConnection method:

  /// This method is the only one that is overridden, to add the abort signal
  /// handling and let the postgres library know about the error outside the
  /// connection callback so that it releases the connection.
  @override
  Future<R> withConnection<R>(
    Future<R> Function(pg.Connection connection) fn, {
    pg.ConnectionSettings? settings,
    locality,
  }) {
    final AbortSignal? abortSignal = getAbortSignalInZone();

    final Future<R> Function(pg.Connection connection) effectiveFn;
    if (abortSignal == null) {
      effectiveFn = fn;
    } else {
      effectiveFn = (connection) async {
        try {
          // Wait for fn, or for an abort, whichever comes first.
          return await abortSignal.waitFuture(fn(connection));
        } catch (e) {
          if (abortSignal.wasAborted) {
            // Interrupt the pg connection if the HTTP request has finished,
            // to avoid dangling connections.

            // This might need to be a call to dispose instead of close.
            unawaited(connection.close(force: true));
          }
          rethrow;
        }
      };
    }

    return super
        .withConnection(effectiveFn, settings: settings, locality: locality);
  }

@isoos
Owner

isoos commented Dec 24, 2024

My original goal was to support aborting the connection from the pool if an HTTP request times out, without actually force closing the whole pool.

Not sure: how are the two (HTTP timing out, connection closing) related? Are you using a single connection per HTTP request? It should instead only use the pool for the operations, and not assign a single connection for the entire HTTP request lifecycle.

The regular connection (regardless of whether it comes from the pool) now also accepts the timeout, and it will cancel the actual statement through a server callback. So if you want to protect your HTTP response latencies, that may be the best way to do it.

Other than that, using one or more of maxConnectionAge, maxSessionUse, and maxQueryCount would give you good control over refreshing the connection instances in a timely manner.

Still, on top of these, I think it could be useful to track calls to the pooled connection's close method, flag the connection as not reusable, and then actually not reuse it further. But with the current API design it would seem too abrupt to actually close it; instead, one should return it to the pool or use timeouts.

@davidmartos96
Contributor Author

davidmartos96 commented Dec 26, 2024

@isoos They are somewhat related. The connection would only be closed if the HTTP request times out. In a normal situation, where the HTTP request interacts with the pool, retrieves a result and responds, the connection wouldn't be closed, and the pool would keep working normally.

The context for this is that a few months ago, during a db upgrade maintenance, we noticed that the HTTP API was timing out requests because the db wasn't working at that moment (expected), but db connections were queuing up, waiting to execute, and reaching the limit of concurrent Postgres connections. We want to avoid this happening again, so our idea is to let Postgres know that the connection in the pool is not valid, because no one is listening to it, in case the HTTP request times out for whatever reason.

Setting a timeout on each query could work, but it would require configuring every single access to the database. With an API middleware we can configure a more general solution for timeouts and for closing/cancelling the db connection when it happens.

@isoos
Owner

isoos commented Dec 26, 2024

You can set the queryTimeout on the session-, connection- and on the pool settings too:
https://pub.dev/documentation/postgres/latest/postgres/PoolSettings-class.html
https://pub.dev/documentation/postgres/latest/postgres/ConnectionSettings-class.html

The actual query execution will use the innermost value that was set, whether for the query, the session, the connection, or the pool. I think setting it to a minute at the pool level, and overriding it on demand where it needs to be longer, could be a fair limit. So you can set it once and it will be applied to everything down the hierarchy.
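As a sketch of what that hierarchy could look like in code, assuming the v3 postgres package API and that run accepts a per-session settings override (the endpoint values and SQL here are placeholders):

```dart
import 'package:postgres/postgres.dart';

// Sketch: a one-minute queryTimeout set once at the pool level, overridden
// only where a longer budget is needed. Endpoint values are placeholders.
final pool = Pool.withEndpoints(
  [
    Endpoint(
      host: 'localhost',
      database: 'mydb',
      username: 'user',
      password: 'pass',
    ),
  ],
  settings: PoolSettings(
    queryTimeout: Duration(minutes: 1), // default for everything below
    maxConnectionCount: 10,
  ),
);

Future<void> longRunningJob() async {
  // Override the pool-level timeout for this session only.
  await pool.run(
    (session) => session.execute('SELECT pg_sleep(90)'),
    settings: SessionSettings(queryTimeout: Duration(minutes: 5)),
  );
}
```

Queries that run without an override inherit the one-minute pool-level limit, which matches the "set it once, applied down the hierarchy" idea above.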

I still don't see a per-request connection object as a good design pattern, and wouldn't encourage anything that supports it. One of the benefits of the pooling is that it can share the resource between concurrent requests, and connections won't stand idle most of the time.

I'm okay with marking the connection for closing once it is returned to the pool, because the query may set something in the session context that could be hard to reset automatically.

@davidmartos96
Contributor Author

@isoos

it can share the resource between concurrent requests

Is that how it works? I thought that concurrent requests would spawn separate db connections. Doesn't the withConnection method in the pool have a built-in lock, preventing other calls while one is running?

@isoos
Owner

isoos commented Dec 26, 2024

it can share the resource between concurrent requests

Is that how it works? I thought that concurrent requests would spawn separate db connections. Doesn't the withConnection method in the pool have a built-in lock, preventing other calls while one is running?

You can do it in different ways, e.g. the model that was common in the PHP world: you open a connection for each request, and keep it alive until the request is processed. I am opposed to that. Many parts of the request processing don't need a database: accessing caches, verifying tokens, decision logic, formatting and building the output. So maybe only 40-70% of the request time is spent in a database connection.

Instead, you allocate the database connection only when you actually need it, and only for the time you need it. If you have logic that requires query -> light processing -> another query, you don't allocate the connection for all 3 steps, unless it needs to be in a transaction (but then use runTx). And a pool can help with that: each time you want to do a database operation, you do a run or runTx with the pool, execute what you want, and then yield the connection back to the pool so another request can use it.
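The pattern described above could be sketched like this, again assuming the v3 postgres package API; the table and column names are purely illustrative:

```dart
import 'package:postgres/postgres.dart';

// Sketch: acquire a pooled connection only for each database step, never for
// the whole HTTP request. Table and column names are illustrative.
Future<void> handleRequest(Pool pool) async {
  // Step 1: a single query; the connection goes back to the pool right after.
  final rows = await pool.run(
    (session) => session.execute(
      r'SELECT name FROM users WHERE id = $1',
      parameters: [1],
    ),
  );

  // ... light processing without holding any connection ...
  final name = rows.first.first as String;

  // Step 2: statements that must be atomic share one transaction via runTx.
  await pool.runTx((session) async {
    await session.execute(
      r'UPDATE users SET visits = visits + 1 WHERE name = $1',
      parameters: [name],
    );
    await session.execute(
      r'INSERT INTO audit_log (entry) VALUES ($1)',
      parameters: ['visited: $name'],
    );
  });
}
```

Between the two steps no connection is held, so a concurrent request can use it; only the runTx block keeps one connection pinned for its duration.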

If you have an application running on 8 isolates and in each of them a database pool of 10, it means you will limit the total connections to the database to 80, which can be easily served on a mid-sized server, and maybe can serve like 120-200 concurrent HTTP requests without noticeable congestion.

You may also include pgbouncer in the mix, but that is another level of pool indirection, which may or may not be useful.

@davidmartos96
Contributor Author

Thanks for the details @isoos !
But I believe there was a misunderstanding 🙂

The demo shared in the repo above doesn't implement it the PHP way, opening a connection on every HTTP request; I agree that would be bad.
What it does is subclass the current pool implementation and override the withConnection function to "early break" the callback in case someone in the current Dart zone (HTTP timeout/500 server error) calls the AbortSignal.

The server code would be as you would expect, run/runTx only where you needed and as short as possible. What changes is that if the signal is triggered and the current request is in the middle of a database run/runTx callback, it breaks the callback and closes the pool connection.

It shouldn't affect the normal behavior of the app, unless I'm overlooking something.

@isoos
Owner

isoos commented Dec 26, 2024

Thanks for the clarification! I get now that you have a total time budget for the request and want to actively abort it as you go ahead with the processing. I think the subclassing makes sense for this case, but you may want to just abort the current statement, not the entire connection... or is the connection object suspected of being a resource hog at this point?

Whether close does it right away or in the dispose part wouldn't matter much, as they are semantically the same with just a few calls in between. The marking of the connection would happen at the end of this call (inside the subclass):
https://github.com/isoos/postgresql-dart/blob/master/lib/src/pool/pool_impl.dart#L157

And then a few lines down, this condition can include isMarkedForClosing:
https://github.com/isoos/postgresql-dart/blob/master/lib/src/pool/pool_impl.dart#L170

As there is always a chance that the connection gets broken while in use, exposing the force-close wouldn't do much for a more defensive scenario. So all in all, I'll be happy with the current PR too. Do you see any good test case, so we make sure we don't accidentally break this feature?

@davidmartos96
Contributor Author

davidmartos96 commented Dec 26, 2024

@isoos Great to hear we are on the same page now 😄

I'm not sure I'm following your code suggestions so feel free to edit the PR/create a separate commit.

I think you may want to just abort the current statement

That was our original idea, but it seemed harder to implement than breaking with an Exception in the withConnection callback. For our use case, closing and reconnecting shouldn't be too problematic, as there is very low latency between the API and the db. Is there any way in the library to abort the currently running statement (if there is one)?

Regarding a test case, the ones in the shared repo are forced situations where the db is not used properly, like not awaiting before sending an HTTP response. The HTTP router triggers the "AbortSignal" to close the connection if there is one. If there is a way to abort the running statement, I guess it could be tested by checking the pg_stat_activity table for whether the current connection is listed there, looking it up by application_name.

@isoos isoos merged commit 090a459 into isoos:master Dec 26, 2024
1 check passed
@isoos
Owner

isoos commented Dec 26, 2024

I've added a quick test for the withConnection callback separately and published the new version.
