-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crl-release-20.2: db: mergingIter.SeekPrefixGE stops early if prefix cannot match #1100
crl-release-20.2: db: mergingIter.SeekPrefixGE stops early if prefix cannot match #1100
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which part of the backport didn't apply cleanly? The change to merging_iter.go
looks pretty straightforward.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @andreimatei, @itsbilal, @nvanbenschoten, and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which part of the backport didn't apply cleanly? The change to merging_iter.go looks pretty straightforward.
Mostly test code and datadriven test expectations. Specifically, the seek-prefix-ge c
result changed.
I think the change in expectations makes sense, because we don't have some of the iterator changes Sumeer made and both outputs are valid within the internalIterator
's contract. @sumeerbhola's commentary within the datadriven test explains why on master the test is allowed to return d
for seek-prefix-ge c
, but it's equally valid to return not found.
# prefix, and when it steps to d, it finds d is not deleted. Note that
# mergingIter is an InternalIterator and does not need to guarantee prefix
# match -- that is job of the higher-level Iterator. So "seek-prefix-ge c" is
# allowed to return d.
If you're comfortable with this since the actual non-test code changes are straightforward and minimal, we could merge it today.
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @andreimatei, @itsbilal, @nvanbenschoten, and @sumeerbhola)
testdata/merging_iter, line 529 at r1 (raw file):
Quoted 4 lines of code…
# prefix, and when it steps to d, it finds d is not deleted. Note that # mergingIter is an InternalIterator and does not need to guarantee prefix # match -- that is job of the higher-level Iterator. So "seek-prefix-ge c" is # allowed to return d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 6 files at r1.
Reviewable status: 2 of 6 files reviewed, 1 unresolved discussion (waiting on @andreimatei, @nvanbenschoten, and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for backporting this!
Reviewed 6 of 6 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @andreimatei, @jbowens, and @nvanbenschoten)
testdata/merging_iter, line 529 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
# prefix, and when it steps to d, it finds d is not deleted. Note that # mergingIter is an InternalIterator and does not need to guarantee prefix # match -- that is job of the higher-level Iterator. So "seek-prefix-ge c" is # allowed to return d.
Can you update the comment here since seek-prefix-ge c
is not finding d.
This is motivated by the CockroachDB issues discussed in cockroachdb#1070 where range tombstones in L6 cause the iterator to go through all the deleted data. The situation is even worse in that each successive SeekPrefixGE in a batch request (which is in sorted order) will typically have to start all over again since the file at which the seek finally ended its work is probably later than the file which contains the relevant key. Note that CockroachDB does not set bounds when using SeekPrefixGE because the assumption is that the underlying Pebble implementation can imply bounds, and we don't want to change this behavior in CockroachDB. Since CockroachDB primarily uses range tombstones when deleting CockroachDB ranges, the SeekPrefixGE calls that were slow were probably on a preceding CockroachDB range, hence stopping early, as done in this PR, should fix the issue. If range tombstones were used within the CockroachDB range that is being read using SeekPrefixGE, the seek could land on a deleted key and will need to iterate through all its MVCC versions. Even in that case, the work is bounded by the number of deleted versions of a single MVCC key. The code stops early in prefix iteration mode, both for SeekPrefixGE and Next. BenchmarkIteratorSeqSeekPrefixGENotFound demonstrates the existing problem when with-tombstone=true. name old time/op new time/op delta IteratorSeqSeekPrefixGENotFound/skip=1/with-tombstone=false-16 446ns ± 0% 433ns ± 0% -2.91% IteratorSeqSeekPrefixGENotFound/skip=1/with-tombstone=true-16 10.3ms ± 0% 0.0ms ± 0% -99.99% IteratorSeqSeekPrefixGENotFound/skip=2/with-tombstone=false-16 416ns ± 0% 429ns ± 0% +3.12% IteratorSeqSeekPrefixGENotFound/skip=2/with-tombstone=true-16 10.6ms ± 0% 0.0ms ± 0% -99.99% IteratorSeqSeekPrefixGENotFound/skip=4/with-tombstone=false-16 414ns ± 0% 437ns ± 0% +5.56% IteratorSeqSeekPrefixGENotFound/skip=4/with-tombstone=true-16 10.5ms ± 0% 0.0ms ± 0% -99.99% MergingIterSeqSeekPrefixGE/skip=1/use-next=false-16 1.65µs ± 0% 1.75µs ± 0% +5.75% MergingIterSeqSeekPrefixGE/skip=1/use-next=true-16 463ns ± 0% 459ns ± 0% -0.86% MergingIterSeqSeekPrefixGE/skip=2/use-next=false-16 1.61µs ± 0% 1.67µs ± 0% +3.73% MergingIterSeqSeekPrefixGE/skip=2/use-next=true-16 476ns ± 0% 475ns ± 0% -0.21% MergingIterSeqSeekPrefixGE/skip=4/use-next=false-16 1.62µs ± 0% 1.77µs ± 0% +9.26% MergingIterSeqSeekPrefixGE/skip=4/use-next=true-16 513ns ± 0% 525ns ± 0% +2.34% MergingIterSeqSeekPrefixGE/skip=8/use-next=false-16 1.71µs ± 0% 1.84µs ± 0% +7.65% MergingIterSeqSeekPrefixGE/skip=8/use-next=true-16 1.10µs ± 0% 1.16µs ± 0% +5.27% MergingIterSeqSeekPrefixGE/skip=16/use-next=false-16 1.80µs ± 0% 1.86µs ± 0% +2.99% MergingIterSeqSeekPrefixGE/skip=16/use-next=true-16 1.34µs ± 0% 1.20µs ± 0% -10.23% name old alloc/op new alloc/op delta IteratorSeqSeekPrefixGENotFound/skip=1/with-tombstone=false-16 0.00B 0.00B 0.00% IteratorSeqSeekPrefixGENotFound/skip=1/with-tombstone=true-16 300B ± 0% 0B -100.00% IteratorSeqSeekPrefixGENotFound/skip=2/with-tombstone=false-16 0.00B 0.00B 0.00% IteratorSeqSeekPrefixGENotFound/skip=2/with-tombstone=true-16 300B ± 0% 0B -100.00% IteratorSeqSeekPrefixGENotFound/skip=4/with-tombstone=false-16 0.00B 0.00B 0.00% IteratorSeqSeekPrefixGENotFound/skip=4/with-tombstone=true-16 292B ± 0% 0B -100.00% MergingIterSeqSeekPrefixGE/skip=1/use-next=false-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=1/use-next=true-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=2/use-next=false-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=2/use-next=true-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=4/use-next=false-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=4/use-next=true-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=8/use-next=false-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=8/use-next=true-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=16/use-next=false-16 0.00B 0.00B 0.00% MergingIterSeqSeekPrefixGE/skip=16/use-next=true-16 0.00B 0.00B 0.00% Informs cockroachdb#1070
4cea4cd
to
d92577b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: 5 of 6 files reviewed, all discussions resolved (waiting on @andreimatei, @nvanbenschoten, and @sumeerbhola)
testdata/merging_iter, line 529 at r1 (raw file):
Previously, sumeerbhola wrote…
Can you update the comment here since
seek-prefix-ge c
is not finding d.
Done
This is a 20.2 backport of #1085. While investigating cockroachlabs/support#898,
one node hit #1070:
The cherrypick didn't apply cleanly because of other recent iterator optimizations,
so I think we should probably wait for @sumeerbhola to give this a review before
merging.