Retryable transactions + async exception handling #1482

jship · 2023-03-14T19:36:27Z

The main change in this PR is adding support for retryable transactions. For example, when using the various runSqlPool* functions at the repeatable-read or serializable isolation level, application authors need a means to retry transactions that encounter serialization failures. Without official support for this in persistent, runSqlPool* users could still manually catch a serialization failure exception around the whole runSqlPool* call and retry the transaction, but this is a nonstarter because the connection would need to be returned to the pool and then reacquired on every retry. To support retryable transactions, a runSqlPoolWithExtensibleHooksRetry function was added that takes an exception predicate to decide whether to retry the transaction. The existing runSqlPoolWithExtensibleHooks is now implemented in terms of the new function.
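To make the predicate-driven retry idea concrete, here is a minimal sketch that uses only `base`. It is not the implementation in this PR: `retryWhen`, `FakeSerializationFailure`, and `isFakeSerializationFailure` are hypothetical names made up for illustration, and the sketch leaves out the pool, the connection, and all hook handling.

```haskell
import Control.Exception
  ( Exception, SomeAsyncException (..), SomeException
  , fromException, throwIO, try )
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a database serialization failure.
data FakeSerializationFailure = FakeSerializationFailure
  deriving Show

instance Exception FakeSerializationFailure

-- Hypothetical predicate: retry only on the stand-in failure above.
isFakeSerializationFailure :: SomeException -> Bool
isFakeSerializationFailure e =
  case fromException e of
    Just FakeSerializationFailure -> True
    Nothing -> False

-- True for exception types that are marked as asynchronous.
isAsyncException :: SomeException -> Bool
isAsyncException e =
  case fromException e of
    Just (SomeAsyncException _) -> True
    Nothing -> False

-- Re-run the action while it fails with a synchronous exception that
-- matches the predicate; anything else is rethrown unchanged.
retryWhen :: (SomeException -> Bool) -> IO a -> IO a
retryWhen shouldRetry action = loop
  where
    loop = do
      result <- try action
      case result of
        Right a -> pure a
        Left e
          | not (isAsyncException e) && shouldRetry e -> loop
          | otherwise -> throwIO e

main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  -- Fails on the first two attempts, then succeeds.
  let action = do
        modifyIORef' attempts (+ 1)
        n <- readIORef attempts
        if n < 3 then throwIO FakeSerializationFailure else pure "committed"
  outcome <- retryWhen isFakeSerializationFailure action
  total <- readIORef attempts
  putStrLn (outcome ++ " after " ++ show total ++ " attempts")
  -- prints "committed after 3 attempts"
```

The point of doing the loop inside the pool function rather than around it is that, in the real code, the same connection can be reused across attempts; this sketch only shows the control flow of the predicate.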

Another change in this PR is around runSqlPoolWithExtensibleHooks's async exception handling. The previous version of this function would not run the runOnException hook when the user-specified database action was aborted via async exception, as async exceptions were ignored entirely due to use of unliftio's catchAny. This is not as problematic as it sounds on the surface: if an async exception came in, the enclosing withResource on the pool would catch it and terminate the connection. For PostgreSQL, when the connection is terminated, the database discards whatever changes were made in the transaction even though there was no explicit rollback. With the change in this PR, if users have custom logic defined in their runOnException hook on top of just rolling back the transaction, they should now be able to rely on persistent to execute this hook when the user-specified database action encounters any type of exception.
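The difference between the two handling styles can be sketched with `base` alone. This is an approximation under stated assumptions, not persistent's or unliftio's code: `hookOnSyncOnly` imitates a handler that skips asynchronous exception types (the behaviour described above for `catchAny`), `hookOnAny` runs the hook for every exception, and both names are hypothetical. For simplicity the demo throws `ThreadKilled` with `throwIO`; a real async exception would arrive from another thread via `throwTo`.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception
  ( AsyncException (ThreadKilled), SomeAsyncException (..), SomeException
  , catch, fromException, throwIO, try )
import Data.IORef (newIORef, readIORef, writeIORef)

-- True for exception types that are marked as asynchronous.
isAsyncException :: SomeException -> Bool
isAsyncException e =
  case fromException e of
    Just (SomeAsyncException _) -> True
    Nothing -> False

-- Approximation of the old behaviour: the hook is skipped when the
-- exception is of an asynchronous type.
hookOnSyncOnly :: IO a -> IO () -> IO a
hookOnSyncOnly action hook =
  action `catch` \(e :: SomeException) ->
    if isAsyncException e
      then throwIO e
      else hook >> throwIO e

-- Approximation of the new behaviour: the hook runs for any exception.
hookOnAny :: IO a -> IO () -> IO a
hookOnAny action hook =
  action `catch` \(e :: SomeException) -> hook >> throwIO e

-- Abort an action with an async-typed exception and report whether the
-- hook ran.
hookRan :: (IO () -> IO () -> IO ()) -> IO Bool
hookRan wrapper = do
  ran <- newIORef False
  _ <- try (wrapper (throwIO ThreadKilled) (writeIORef ran True))
         :: IO (Either SomeException ())
  readIORef ran

main :: IO ()
main = do
  old <- hookRan hookOnSyncOnly
  new <- hookRan hookOnAny
  putStrLn ("sync-only handler ran hook: " ++ show old) -- False
  putStrLn ("catch-all handler ran hook: " ++ show new) -- True
```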

There were also multiple spots where the masking state was being restored (basically everything was in a restore except for the installation of the catchAny handler). The masking has been changed in this PR such that alterBackend is still in a restore, as is the user-specified action, but runBefore and runAfter are no longer in a restore. The previous version's restore usage came in from #1207, so I verified that the conn-killed binary still produces Right with the new masking. Additionally, it's worth noting that runOnException is now implicitly in an uninterruptibleMask (via unliftio's withException).
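The masking layout described above can be illustrated with a small `base`-only sketch. Everything here is hypothetical and simplified: the `Hooks` record is not persistent's `SqlPoolHooks`, `alterBackend` is omitted, and the exception handler only approximates what unliftio's `withException` does. The sketch records the masking state each piece observes.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception
  ( MaskingState (..), SomeException, catch, getMaskingState, mask
  , throwIO, try, uninterruptibleMask_ )
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical, simplified hooks record.
data Hooks = Hooks
  { hookBefore :: IO ()
  , hookAfter :: IO ()
  , hookOnException :: IO ()
  }

-- Before/after hooks run masked, the user action runs with the outer
-- masking state restored, and the exception hook runs uninterruptibly.
runWithHooks :: Hooks -> IO a -> IO a
runWithHooks hooks action =
  mask $ \restore -> do
    hookBefore hooks
    a <- restore action `catch` \(e :: SomeException) -> do
      uninterruptibleMask_ (hookOnException hooks)
      throwIO e
    hookAfter hooks
    pure a

-- Append the current masking state, tagged with a label, to the log.
record :: IORef [(String, MaskingState)] -> String -> IO ()
record ref label = do
  st <- getMaskingState
  modifyIORef' ref (++ [(label, st)])

main :: IO ()
main = do
  ref <- newIORef []
  let hooks =
        Hooks (record ref "before") (record ref "after") (record ref "onException")
  -- Successful run: before, action, after.
  runWithHooks hooks (record ref "action")
  -- Failing run: before, then the exception hook.
  _ <- try (runWithHooks hooks (throwIO (userError "boom")))
         :: IO (Either SomeException ())
  readIORef ref >>= mapM_ print
```

When started from an unmasked thread, the log shows `MaskedInterruptible` for the before and after hooks, `Unmasked` for the action, and `MaskedUninterruptible` for the exception hook.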

The PR might be easiest to review commit-by-commit, as intentionally failing tests were added prior to changes being made to the libraries.

Before submitting your PR, check that you've:

Documented new APIs with Haddock markup
Added @since declarations to the Haddock
Ran stylish-haskell on any changed files.
Adhered to the code style (see the .editorconfig file for details)

After submitting your PR:

Update the Changelog.md file with a link to your PR
Bumped the version number if there isn't an (unreleased) on the Changelog
Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts)

jship · 2023-03-14T19:55:41Z

persistent/Database/Persist/Sql/Run.hs

+-- @since 2.14.6.0
+runSqlPoolWithExtensibleHooksRetry
+    :: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend)
+    => (UE.SomeException -> Bool)


It's worth pointing out that the API has been intentionally kept simple here: so long as synchronous exceptions match the predicate, the transaction will always be retried. In the case of serialization failures at repeatable-read and serializable isolation specifically, the recommendation from the PostgreSQL docs and elsewhere is to retry these failing transactions unconditionally.

That guidance is what drove this simpler API, but users may have other failures (e.g. uniqueness violations) that they would like to retry only up to a fixed number of times, or according to some more sophisticated policy. The simple exception predicate approach in this PR would not work for those more complicated cases. Since those cases are rare, it seemed prudent to keep the retry API as simple as possible for now. However, depending on the exception predicate the user specifies, they may end up in a situation where persistent retries indefinitely. For these cases, they could wrap their runSqlPoolWithExtensibleHooksRetry call in a timeout.
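As a rough illustration of the timeout suggestion, the sketch below bounds an unconditional retry loop with `System.Timeout.timeout` from `base`. The names `retryWhen` and `retryWithDeadline` are hypothetical, and the loop is a stand-in for the real pool function rather than a call to it. The sketch relies on the loop never retrying asynchronous exceptions, since `timeout` works by throwing one into the running thread.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception
  ( SomeAsyncException (..), SomeException, fromException, throwIO, try )
import System.Timeout (timeout)

-- True for exception types that are marked as asynchronous.
isAsyncException :: SomeException -> Bool
isAsyncException e =
  case fromException e of
    Just (SomeAsyncException _) -> True
    Nothing -> False

-- Stand-in retry loop: retries matching synchronous exceptions forever.
retryWhen :: (SomeException -> Bool) -> IO a -> IO a
retryWhen shouldRetry action = loop
  where
    loop = do
      result <- try action
      case result of
        Right a -> pure a
        Left e
          | not (isAsyncException e) && shouldRetry e -> loop
          | otherwise -> throwIO e

-- Give up on the whole retry loop after the given number of microseconds.
-- Returns Nothing if the deadline passed before any attempt succeeded.
retryWithDeadline :: Int -> (SomeException -> Bool) -> IO a -> IO (Maybe a)
retryWithDeadline micros shouldRetry action =
  timeout micros (retryWhen shouldRetry action)

main :: IO ()
main = do
  -- An action that always fails, paired with a predicate that always retries.
  let alwaysFails =
        threadDelay 1000 >> throwIO (userError "unique violation") :: IO ()
  outcome <- retryWithDeadline 200000 (const True) alwaysFails
  print outcome -- prints "Nothing" once the 0.2 s deadline passes
```

A deadline bounds total time rather than the number of attempts, so it does not replace a real retry policy; it only keeps a too-permissive predicate from looping forever.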

jship · 2023-06-09T16:54:31Z

We've had success running these changes in production for a few months now. Is there anything I can do to help move the PR along?


parsonsmatt self-requested a review June 9, 2023 22:21

jship added 12 commits July 6, 2023 08:55

persistent: Export modifyRunOnException and setRunOnException

6a7f9e8

persistent-postgresql: Support specifying SqlPoolHooks and isolation …

6f2a25f

…level in test infra

persistent-postgresql: Add intentionally failing async exceptions tes…

fe119f9

…t case This test case will fail until async exception handling is added to runSqlPoolWithExtensibleHooks

persistent: Add async exception handling to runSqlPoolWithExtensibleH…

2d1c937

…ooks Note that the test added in the previous commit now passes.

persistent-postgresql: Extend async exceptions test to check exceptio…

35d8301

…n handler's masking state

persistent-postgresql: Add intentionally failing retryable transactio…

c2a74f0

…ns test case This test case will fail until retryable transaction support is added in a new runSqlPoolWithExtensibleHooks variant.

persistent: Add runSqlPoolWithExtensibleHooksRetry / persistent-postg…

deb9abb

…resql: Add isSerializationFailure and isDeadlockDetected Note that the test added in the previous commit now passes.

persistent-test: Move hook count helpers from persistent-postgresql A…

fa46ad8

…syncExceptionsTest to here

persistent-postgresql: Extend RetryableTransactionTest to verify hook…

395d80e

… counts

Run stylish-haskell

92a11d3

persistent/persistent-postgresql: Add PR links to changelogs

5d5ed28

persistent/persistent-postgresql: Bump versions to match changelogs

0c165fd

jship force-pushed the retryable-transactions-and-async-exception-stuff branch from 91b68a4 to 0c165fd Compare July 6, 2023 13:17
