release-21.1: kvserver: use ClearRawRange
to truncate very large Raft logs
#75980
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 1/1 commits from #75793.
/cc @cockroachdb/release
Release justification: possible node crash loop.
Raft log truncation was done using point deletes in a single Pebble
batch. If the number of entries to truncate was very large, this could
result in overflowing the Pebble batch, causing the node to panic. This
has been seen to happen e.g. when the snapshot rate was set very low,
effectively stalling snapshot transfers which in turn held up log
truncations for extended periods of time.
This patch uses a Pebble range tombstone if the number of entries to
truncate is very large (>100k). In most common cases, point deletes are
still used, to avoid writing too many range tombstones to Pebble.
Release note (bug fix): Fixed a bug which could cause nodes to crash
when truncating abnormally large Raft logs.