
VReplication: Optimize replication on target tablets #17166

Open · wants to merge 18 commits into main

Conversation


@mattlord mattlord commented Nov 7, 2024

Description

This PR makes two performance optimizations affecting execution time and resource usage:

  1. For the copy phase: optimize the generation of row values that are used to build bulk INSERT statements
  2. For the running/replicating phase: enable the VPlayerBatching feature by default. This was added in v18 via #14502 (VReplication VPlayer: support statement and transaction batching). It has been enabled in a number of relevant end-to-end tests ever since, and it has been used in PlanetScale for a number of important workflows. Enabling it early in the v22 lifecycle also gives us a good 3-6 months for it to bake further on main.
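The batching idea can be illustrated with a minimal, self-contained sketch (this is not the actual vplayer code; the batcher type and its methods are hypothetical): instead of paying one network round trip per DML statement, statements queued within a transaction are sent to MySQL as a single multi-statement batch at commit.

```go
package main

import (
	"fmt"
	"strings"
)

// batcher accumulates the DML statements of the current transaction and
// sends them to MySQL as one multi-statement batch at commit time,
// instead of one network round trip per statement.
// Illustrative sketch only, not the actual vplayer implementation.
type batcher struct {
	pending    []string
	roundTrips int
}

func (b *batcher) queue(stmt string) {
	b.pending = append(b.pending, stmt)
}

// commit joins the queued statements into a single batch, counts one
// round trip, and resets the queue.
func (b *batcher) commit() string {
	b.roundTrips++
	batch := strings.Join(b.pending, "; ")
	b.pending = b.pending[:0]
	return batch
}

func main() {
	b := &batcher{}
	for i := 1; i <= 3; i++ {
		b.queue(fmt.Sprintf("insert into customer(customer_id) values (%d)", i))
	}
	// Three statements, one round trip.
	fmt.Println(b.commit())
	fmt.Println("round trips:", b.roundTrips)
}
```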

Summary:

  • Copy phase
    • The time it took to finish the copy phase went from 548 seconds to 489 seconds: approximately an 11% improvement
    • The tablet memory usage/allocations went from 19,842.93MiB to 2,667.88MiB: approximately a 7.43x improvement
  • Running phase
    • The time it took to catch up when subsequently inserting another 10,000,000 rows after the copy phase ended went from 270 seconds to 40 seconds: approximately a 6.75x improvement
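The copy-phase allocation win comes from avoiding per-row intermediate values when constructing the bulk INSERTs. A rough sketch of the pattern (illustrative only; bulkInsert is a hypothetical helper, not the actual replicator_plan.go code) streams row values into one reused strings.Builder rather than materializing and joining per-row strings:

```go
package main

import (
	"fmt"
	"strings"
)

// bulkInsert builds a single multi-row INSERT by writing row values
// directly into one strings.Builder, avoiding a []string allocation
// and join per row. Hypothetical helper for illustration.
func bulkInsert(table string, cols []string, rows [][]string) string {
	var sb strings.Builder
	sb.WriteString("insert into ")
	sb.WriteString(table)
	sb.WriteString(" (")
	sb.WriteString(strings.Join(cols, ", "))
	sb.WriteString(") values ")
	for i, row := range rows {
		if i > 0 {
			sb.WriteString(", ")
		}
		sb.WriteByte('(')
		for j, v := range row {
			if j > 0 {
				sb.WriteString(", ")
			}
			sb.WriteString(v)
		}
		sb.WriteByte(')')
	}
	return sb.String()
}

func main() {
	q := bulkInsert("customer", []string{"customer_id", "email"},
		[][]string{{"1", "'a'"}, {"2", "'b'"}})
	fmt.Println(q)
}
```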

The detailed results of the testing can be seen here: https://gist.github.com/mattlord/d6161a1c27f203a497b7bd6a69e7c467

The test setup was created this way:

cd examples/local
alias vtctldclient='command vtctldclient --server=localhost:15999'

# Enable the pprof endpoints
diff --git a/examples/common/scripts/vttablet-up.sh b/examples/common/scripts/vttablet-up.sh
index daa40aee89..282cd0553e 100755
--- a/examples/common/scripts/vttablet-up.sh
+++ b/examples/common/scripts/vttablet-up.sh
@@ -54,6 +54,7 @@ vttablet \
  --service_map 'grpc-queryservice,grpc-tabletmanager,grpc-updatestream' \
  --pid_file $VTDATAROOT/$tablet_dir/vttablet.pid \
  --heartbeat_on_demand_duration=5s \
+ --pprof-http \
  > $VTDATAROOT/$tablet_dir/vttablet.out 2>&1 &



# Setup function to time the copy phase completion
function wait_for_workflow_running() {
    local keyspace=customer
    local workflow=commerce2customer
    local wait_secs=900
    local result=""

    echo "Waiting for the ${workflow} workflow in the ${keyspace} keyspace to finish the copy phase..."

    for _ in $(seq 1 ${wait_secs}); do
        result=$(vtctldclient Workflow --keyspace="${keyspace}" show --workflow="${workflow}" 2>/dev/null | grep "Copy phase completed")
        if [[ ${result} != "" ]]; then
            break
        fi
        sleep 1
    done;

    if [[ ${result} == "" ]]; then
        echo "Timed out after ${wait_secs} seconds waiting for the ${workflow} workflow in the ${keyspace} keyspace to reach the running state"
    else
        echo "The ${workflow} workflow in the ${keyspace} keyspace is now running. $(sed -rn 's/.*"(Copy phase.*)".*/\1/p' <<< "${result}")."
    fi
}


# Setup the commerce keyspace
./101_initial_cluster.sh


# Load data in the customer table
table_file="${VTDATAROOT}/vt_0000000100/data/vt_commerce/customer.ibd"
commerce_primary_uid=$(vtctldclient GetTablets --keyspace commerce --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)

# Generate 5MiB of initial data
size=$((5*1024*1024))
while [[ $(stat -f "%z" "${table_file}") -lt ${size} ]]; do # BSD/macOS stat; on Linux use: stat -c %s
    command mysql -u root --socket "${VTDATAROOT}/vt_0000000${commerce_primary_uid}/mysql.sock" vt_commerce -e "insert into customer (customer_id, email) values (${RANDOM}*${RANDOM}, '${RANDOM}[email protected]')" 2> /dev/null
done

say "Initial data load completed"

# Grow that to at least 10GiB
size=$((10*1024*1024*1024))
i=1
while [[ $(stat -f "%z" "${table_file}") -lt ${size} ]]; do
    command mysql -u root --socket "${VTDATAROOT}/vt_0000000${commerce_primary_uid}/mysql.sock" vt_commerce -e "insert into customer (email) select concat(${i}, email) from customer limit 5000000"
    let i=i+1
done

say "Full data load completed"


# Setup the customer keyspace
./201_customer_tablets.sh


# Move the customer table from the commerce keyspace to the customer keyspace
customer_primary_uid=$(vtctldclient GetTablets --keyspace customer --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)

# Profile target primary in a different shell
go tool pprof -seconds 130 "http://localhost:15${customer_primary_uid}/debug/pprof/profile"
(pprof) top 20 -ignore_regex runtime -cum

# Immediately start the workflow
vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tablet-types primary --tables "customer"
date
wait_for_workflow_running
date

say "Workflow is running"

go tool pprof "http://localhost:15${customer_primary_uid}/debug/pprof/allocs"
(pprof) top 30


# Now see how long it takes to replicate 10,000,000 new rows
command mysql -u root --socket "${VTDATAROOT}/vt_0000000${commerce_primary_uid}/mysql.sock" vt_commerce -e "insert into customer (email) select concat(${i}, email) from customer limit 10000000"
sleep 1
date
while [[ $(vtctldclient Workflow --keyspace="customer" show --workflow="commerce2customer" 2>/dev/null | jq -r '.[] | .[0].max_v_replication_transaction_lag') -gt 1 ]]; do
    sleep 1
done
date


# Cancel the workflow
vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer cancel

Related Issue(s)

Checklist


vitess-bot bot commented Nov 7, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsBackportReason If backport labels have been applied to a PR, a justification is required NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Nov 7, 2024
@github-actions github-actions bot added this to the v22.0.0 milestone Nov 7, 2024

codecov bot commented Nov 9, 2024

Codecov Report

Attention: Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.

Project coverage is 67.41%. Comparing base (4980d6f) to head (5f15e3f).

Files with missing lines Patch % Lines
...blet/tabletmanager/vreplication/replicator_plan.go 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17166      +/-   ##
==========================================
+ Coverage   67.39%   67.41%   +0.02%     
==========================================
  Files        1570     1570              
  Lines      252892   252887       -5     
==========================================
+ Hits       170437   170486      +49     
+ Misses      82455    82401      -54     


Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord removed NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request NeedsBackportReason If backport labels have been applied to a PR, a justification is required labels Nov 10, 2024
@@ -54,6 +54,7 @@ vttablet \
--service_map 'grpc-queryservice,grpc-tabletmanager,grpc-updatestream' \
--pid_file $VTDATAROOT/$tablet_dir/vttablet.pid \
--heartbeat_on_demand_duration=5s \
+ --pprof-http \
mattlord (Contributor Author):

This change will be reverted before merging.


@mattlord mattlord Nov 10, 2024


After thinking about it more, I'd like to keep this enabled for the examples. Seems like a good case for it. I can remove/revert the change though if others disagree. If we do keep it here, I could also add it for vtgate and vtctld.

@@ -171,7 +171,7 @@ func (vc *vdbClient) Execute(query string) (*sqltypes.Result, error) {
func (vc *vdbClient) ExecuteWithRetry(ctx context.Context, query string) (*sqltypes.Result, error) {
qr, err := vc.Execute(query)
for err != nil {
- if sqlErr, ok := err.(*sqlerror.SQLError); ok && sqlErr.Number() == sqlerror.ERLockDeadlock || sqlErr.Number() == sqlerror.ERLockWaitTimeout {
+ if sqlErr, ok := err.(*sqlerror.SQLError); ok && (sqlErr.Number() == sqlerror.ERLockDeadlock || sqlErr.Number() == sqlerror.ERLockWaitTimeout) {

@mattlord mattlord Nov 10, 2024


This is unrelated, but I noticed that it was not correct as I was doing testing.

mattlord (Contributor Author):

I'll open a separate PR for this since it's worth a quick backport to v19 to prevent a panic if the error we got back from the query was NOT an SQL error.

@mattlord mattlord marked this pull request as ready for review November 10, 2024 06:03
Signed-off-by: Matt Lord <[email protected]>
@@ -353,6 +353,10 @@ message FieldEvent {
repeated query.Field fields = 2;
string keyspace = 3;
string shard = 4;

// Field numbers in the gap between shard (4) and enum_set_string_values

@mattlord mattlord Nov 11, 2024


The comments in this file are unrelated to this PR but I wanted to get them into main after the accidental field number gap came up in a discussion.


mattlord commented Nov 11, 2024

Putting back in Draft to investigate test failures that seemingly came out of nowhere....

@mattlord mattlord marked this pull request as draft November 11, 2024 08:21

mattlord commented Nov 11, 2024

> Putting back in Draft to investigate test failures that seemingly came out of nowhere....

Addressed via f0f61db and 21564fd

This impacted e2e tests that were not previously using VPlayerBatching.

The load generator constantly generates INSERTs, which are then
efficiently batched in vplayer so we get ~7x more throughput than
before and thus we need more time for filtered replication to catch
up after we've stopped it for the vdiff. ESPECIALLY since we're
using the --update-table-stats flag and the ANALYZE TABLE and its
locking causes a pause in updates to the table the load generator is
inserting into -- in particular for the test clusters that only
have PRIMARY tablets as everything is interacting directly.

Signed-off-by: Matt Lord <[email protected]>
@@ -35,7 +35,7 @@ import (
)

const (
- vdiffTimeout = 120 * time.Second // We can leverage auto retry on error with this longer-than-usual timeout
+ vdiffTimeout = 180 * time.Second // We can leverage auto retry on error with this longer-than-usual timeout

@mattlord mattlord Nov 11, 2024


You can see the reason for the vdiff test changes in the commit message: f0f61db

It failed for some materializations

Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord marked this pull request as ready for review November 11, 2024 23:06
Signed-off-by: Matt Lord <[email protected]>