Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: add implicit SELECT FOR UPDATE to initial scan of DELETE #137069

Merged
merged 1 commit into from
Dec 13, 2024

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Dec 10, 2024

This commit makes it so that we apply implicit SELECT FOR UPDATE locking behavior to the initial scan of the DELETE operation in some cases. Namely, we do so when the input to the DELETE is either a scan or an index join on top of a scan (with any number of renders on top) - these conditions mean that all filters were pushed down into the scan, so we won't lock any unnecessary rows. I think it's only possible to have at most one render expression on top of the scan, but I chose to be defensive and allowed nested renders too. In such form the conditions are exactly the same as we use for adding SFU to UPDATEs, so the same function is reused. Existing enable_implicit_select_for_update session variable is consulted.

Fixes: #50181.

Release note (sql change): DELETE statements now acquire locks using the FOR UPDATE locking mode during their initial row scan in some comes, which improves performance for contended workloads. This behavior is configurable using the enable_implicit_select_for_update session variable.

@yuzefovich yuzefovich requested review from michae2 and a team December 10, 2024 01:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2)


pkg/sql/opt/exec/execbuilder/mutation.go line 1238 at r1 (raw file):

// DELETE statement. If the builder should lock the initial row scan, it returns
// the TableID of the scan, otherwise it returns 0.
func (b *Builder) shouldApplyImplicitLockingToDeleteInput(del *memo.DeleteExpr) opt.TableID {

Given that the code is an exact copy of the UPDATE case, should we share the same helper between two mutations? I'm currently leaning towards keeping them separate since the amount of duplication is very low and it seems cleaner this way.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That was easier than I realized it would be!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @yuzefovich)


pkg/sql/opt/exec/execbuilder/mutation.go line 1238 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Given that the code is an exact copy of the UPDATE case, should we share the same helper between two mutations? I'm currently leaning towards keeping them separate since the amount of duplication is very low and it seems cleaner this way.

Coincidentally, @mgartner just changed this function for updates during the collab session today, so actually maybe it would be good to share the helper function?


pkg/sql/opt/exec/execbuilder/mutation.go line 1134 at r2 (raw file):

	mutExpr memo.RelExpr,
) (opt.TableID, error) {
	if !b.evalCtx.SessionData().ImplicitSelectForUpdate {

Does moving the call here also affect INSERT INTO SELECT FROM statements? I don't think we would want to affect those.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 14 of 14 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)


pkg/sql/opt/exec/execbuilder/mutation.go line 1238 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Coincidentally, @mgartner just changed this function for updates during the collab session today, so actually maybe it would be good to share the helper function?

Either way works for me. If you want to combine, it shouldn't be too much work because the function can take a memo.RelExpr and you can use .Child(0).(RelExpr) to access the input.

I'll hold off on merging my PR until this one merges.


pkg/sql/opt/exec/execbuilder/testdata/delete line 347 at r2 (raw file):

              spans: /1-/1000 /2001-/3000
              locking strength: for update

nit: Some tests cases specifically for testing the different cases of implicit FOR UPDATE locking might be nice. Up to you.

This commit makes it so that we apply implicit SELECT FOR UPDATE locking
behavior to the initial scan of the DELETE operation in some cases.
Namely, we do so when the input to the DELETE is either a scan or an
index join on top of a scan (with any number of renders on top) - these
conditions mean that all filters were pushed down into the scan, so we
won't lock any unnecessary rows. I think it's only possible to have at
most one render expression on top of the scan, but I chose to be
defensive and allowed nested renders too. In such form the conditions
are exactly the same as we use for adding SFU to UPDATEs, so the same
function is reused. Existing `enable_implicit_select_for_update` session
variable is consulted.

Release note (sql change): DELETE statements now acquire locks using the FOR
UPDATE locking mode during their initial row scan in some comes, which
improves performance for contended workloads. This behavior is configurable
using the `enable_implicit_select_for_update` session variable.
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)


pkg/sql/opt/exec/execbuilder/mutation.go line 1238 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Either way works for me. If you want to combine, it shouldn't be too much work because the function can take a memo.RelExpr and you can use .Child(0).(RelExpr) to access the input.

I'll hold off on merging my PR until this one merges.

Ack, combined them into one.


pkg/sql/opt/exec/execbuilder/mutation.go line 1134 at r2 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Does moving the call here also affect INSERT INTO SELECT FROM statements? I don't think we would want to affect those.

We get exactly the same behavior for INSERTs as before - it has the same return values as this clause, so this change should be a no-op for INSERTs.


pkg/sql/opt/exec/execbuilder/testdata/delete line 347 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: Some tests cases specifically for testing the different cases of implicit FOR UPDATE locking might be nice. Up to you.

I think existing tests have decent variability covering any simple cases that I would add already.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Awesome.

Reviewed 13 of 14 files at r1, 5 of 5 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)

@yuzefovich
Copy link
Member Author

TFTRs!

bors r+

@craig craig bot merged commit 32886df into cockroachdb:master Dec 13, 2024
22 checks passed
@yuzefovich yuzefovich deleted the delete-sfu branch December 13, 2024 00:36
kyle-a-wong added a commit to kyle-a-wong/cockroach that referenced this pull request Jan 2, 2025
This test started failing after cockroachdb#137069 was
merged into mater. That PR introduced a change
that made it so that a lock is acquired when
performing the initial scan for a DELETE,
over all scanned rows. As a result, a race
condition was exposed in TestLogGC (specifically
when the deadlock flag is set) where calling
`gcSystemLog` resulted in a
`TransactionRetryError: retry txn` error.

This race condition is happening because the
test data being written is using timestamps
in the future as opposed to the past. When
attemping to GC the table, its conflict with
normal writes that are happening at the
same time. To fix, the test data is being
written in the past, which avoids these
conflicts.

Fixes: cockroachdb#137490
Release note: None
<pkg>: <short description - lowercase, no final period>

<what was there before: Previously, ...>
<why it needed to change: This was inadequate because ...>
<what you did about it: To address this, this patch ...>
kyle-a-wong added a commit to kyle-a-wong/cockroach that referenced this pull request Jan 2, 2025
This test started failing after cockroachdb#137069 was
merged into mater. That PR introduced a change
that made it so that a lock is acquired when
performing the initial scan for a DELETE,
over all scanned rows. As a result, a race
condition was exposed in TestLogGC (specifically
when the deadlock flag is set) where calling
`gcSystemLog` resulted in a
`TransactionRetryError: retry txn` error.

This race condition is happening because the
test data being written is using timestamps
in the future as opposed to the past. When
attemping to GC the table, its conflict with
normal writes that are happening at the
same time. To fix, the test data is being
written in the past, which avoids these
conflicts.

Fixes: cockroachdb#137490
Release note: None
craig bot pushed a commit that referenced this pull request Jan 2, 2025
138174: server: fix TestLogGC test r=kyle-a-wong a=kyle-a-wong

This test started failing after #137069 was merged into mater. That PR introduced a change that made it so that a lock is acquired when performing the initial scan for a DELETE, over all scanned rows. As a result, a race condition was exposed in TestLogGC (specifically when the deadlock flag is set) where calling `gcSystemLog` resulted in a `TransactionRetryError: retry txn` error.
 
This race condition is happening because thetest data being written is using timestamps in the future as opposed to the past. When attemping to GC the table, its conflict with normal writes that are happening at the same time. To fix, the test data is being written in the past, which avoids these conflicts.

Fixes: #137490
Release note: None

Co-authored-by: Kyle Wong <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/opt: add implicit SELECT FOR UPDATE support for DELETE statements
4 participants