[test-only]Rocksdb wal debug #13

Open
wants to merge 1,149 commits into
base: master

Conversation

tonyxuqqi

What is changed and how it works?

Issue Number: Close #xxx

What's Changed:

Related changes

  • PR to update pingcap/docs / pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM
  • Breaking backward compatibility

Release note

Please add a release note.

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note, fill it with None.
If this PR will be picked to a release branch, a release note is probably required.

overvenus and others added 30 commits September 5, 2023 08:45
close tikv#15412

Similar to resolved-ts endpoint, cdc endpoint maintains resolvers for
subscribed regions. These resolvers also need memory quota, otherwise
they may cause OOM.
This commit lets cdc endpoint deregister regions if they exceed
memory quota.

Signed-off-by: Neil Shen <[email protected]>
ref tikv#15082

Add more logs and metrics for resolved-ts.

Signed-off-by: ekexium <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#15513

coprocessor: add SQL statement tracing in tikv slow log

Signed-off-by: Chao Wang <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#15412

Make the MemoryQuota alloc API return a `Result`, which makes it more ergonomic.
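
As a rough illustration of an alloc call that reports failure through its return value, here is a minimal sketch. The type `MemoryQuota` below, its fields, the `free` method, and the error type `MemoryQuotaExceeded` are all hypothetical names chosen for this example; they are not taken from TiKV's actual code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical error returned when an allocation would exceed the quota.
#[derive(Debug, PartialEq, Eq)]
pub struct MemoryQuotaExceeded;

/// Hypothetical quota tracker (an assumption, not TiKV's real `MemoryQuota`).
pub struct MemoryQuota {
    capacity: usize,
    in_use: AtomicUsize,
}

impl MemoryQuota {
    pub fn new(capacity: usize) -> Self {
        MemoryQuota {
            capacity,
            in_use: AtomicUsize::new(0),
        }
    }

    /// Bytes currently reserved.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Relaxed)
    }

    /// Reserve `bytes`, or return an error and leave the counter unchanged.
    pub fn alloc(&self, bytes: usize) -> Result<(), MemoryQuotaExceeded> {
        let mut current = self.in_use.load(Ordering::Relaxed);
        loop {
            let new = current.checked_add(bytes).ok_or(MemoryQuotaExceeded)?;
            if new > self.capacity {
                return Err(MemoryQuotaExceeded);
            }
            // Retry if another thread changed the counter in the meantime.
            match self.in_use.compare_exchange_weak(
                current,
                new,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Ok(()),
                Err(observed) => current = observed,
            }
        }
    }

    /// Give `bytes` back to the quota, saturating at zero.
    pub fn free(&self, bytes: usize) {
        let _ = self
            .in_use
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| {
                Some(v.saturating_sub(bytes))
            });
    }
}
```

With a `Result` return type, a caller can propagate a quota failure with `?` or react to it directly, for example by deregistering a region.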

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#15409

supplement read track metrics

Signed-off-by: SpadeA-Tang <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…eted (tikv#15543)

close tikv#15534

Fix the ordering confusion between `on_memtable_sealed` and `on_flush_completed`.

Signed-off-by: SpadeA-Tang <[email protected]>
close tikv#15483

The rewrite step of `sst_importer::apply` is now deferred until the file is iterated.

Signed-off-by: hillium <[email protected]>

Co-authored-by: 3pointer <[email protected]>
ref tikv#15409

Supply extra test cases, including integration tests and unit tests for raftstore-v2 on `gc`.

Signed-off-by: lucasliang <[email protected]>
close tikv#15579

update cargo.lock

Signed-off-by: SpadeA-Tang <[email protected]>
close tikv#15565

Signed-off-by: lance6716 <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#15588

Add an option to update the TiKV config without persisting it, via the status API: `POST /config?persist=false|true`.

Signed-off-by: tonyxuqqi <[email protected]>
close tikv#15580

Enable force leader to roll back merges when they are not able to proceed. Previously, only regions with quorum could do this.

Signed-off-by: Yang Zhang <[email protected]>

Co-authored-by: tonyxuqqi <[email protected]>
ref tikv#14864

* Break resolved ts scan entry into multiple tasks.
* Limit concurrent resolved ts scan tasks.
* Remove resolved ts dead code.

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#14654

get_all_regions_in_store should exclude tombstone

Signed-off-by: SpadeA-Tang <[email protected]>

Co-authored-by: tonyxuqqi <[email protected]>
ref tikv#15401

report async snapshot metrics to prometheus

Signed-off-by: SpadeA-Tang <[email protected]>
…eously (tikv#15625)

ref tikv#15242

Fix the issue that rollback merge and commit merge can happen simultaneously.

Signed-off-by: SpadeA-Tang <[email protected]>
close tikv#15582

Add metrics for detailed disk usage.

Signed-off-by: bufferflies <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#15553

The Resolver uses a hash set to keep track of locks associated with
the same timestamp. When the length of the hash set reaches zero,
it indicates that the transaction has been fully committed. To save
memory, we can replace the hash set with an integer.
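
The idea can be sketched as a per-timestamp counter. This is a minimal illustration, not the actual Resolver: the names `LockCounter`, `track_lock`, `untrack_lock`, and `min_ts` are hypothetical, and the sketch assumes each lock is tracked exactly once (a hash set would silently ignore a duplicate key, a plain counter would not).

```rust
use std::collections::BTreeMap;

/// Hypothetical tracker: start timestamp -> number of outstanding locks.
/// Storing a count instead of a set of keys keeps memory per timestamp constant.
#[derive(Default)]
pub struct LockCounter {
    locks_by_ts: BTreeMap<u64, usize>,
}

impl LockCounter {
    /// Record one more lock for the transaction starting at `start_ts`.
    pub fn track_lock(&mut self, start_ts: u64) {
        *self.locks_by_ts.entry(start_ts).or_insert(0) += 1;
    }

    /// Record that one lock of `start_ts` was released.
    /// When the count reaches zero the timestamp is dropped entirely.
    pub fn untrack_lock(&mut self, start_ts: u64) {
        let emptied = match self.locks_by_ts.get_mut(&start_ts) {
            Some(count) => {
                *count -= 1;
                *count == 0
            }
            None => false,
        };
        if emptied {
            self.locks_by_ts.remove(&start_ts);
        }
    }

    /// Smallest timestamp that still has outstanding locks, if any.
    pub fn min_ts(&self) -> Option<u64> {
        self.locks_by_ts.keys().next().copied()
    }
}
```

The trade-off is that the counter can no longer say which keys are locked, only how many locks remain for a timestamp.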

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…ikv#15628)

ref tikv#15621

The security issue is google/flatbuffers#6627.
Upgrade flatbuffers from 2.1.2 to 23.5.26 to address it.

Signed-off-by: tonyxuqqi <[email protected]>
Signed-off-by: Qi Xu <[email protected]>

Co-authored-by: Qi Xu <[email protected]>
close tikv#15462

Signed-off-by: glorv <[email protected]>

Co-authored-by: tonyxuqqi <[email protected]>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#14320

Support changing the lock write buffer limit online.

Signed-off-by: SpadeA-Tang <[email protected]>
Connor1996 and others added 30 commits January 18, 2024 05:25
…ead of near seek (tikv#16131)

ref tikv#16245

Use write cf stats to decide load action for default cf instead of near seek

Signed-off-by: Connor1996 <[email protected]>
ref tikv#16323

Basic WriteBatch implementation for In-Memory Engine.

Signed-off-by: Alex Feinberg <[email protected]>

Co-authored-by: tonyxuqqi <[email protected]>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#16141

Refactor from region-based to range-based.

Signed-off-by: SpadeA-Tang <[email protected]>
close tikv#16410

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#16370

Change titan min blob size default value to 32KB

Signed-off-by: Connor1996 <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…node (tikv#16174)

close tikv#15799

Check the last heartbeat time before performing a remove-node operation. A peer is treated as slow once its last heartbeat is older than 8 * the heartbeat interval. If the remove-node operation would leave at least half of the peers slow, the operation fails.
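
The rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `should_reject_remove_node` and its parameters are hypothetical, the input is assumed to be the time since the last heartbeat of each peer that would remain after the removal, and "older than the threshold" is assumed to be a strict comparison.

```rust
use std::time::Duration;

/// Multiplier from the commit message: slow means older than 8 heartbeat intervals.
const SLOW_PEER_HEARTBEAT_MULTIPLIER: u32 = 8;

/// Returns true if the remove-node operation should be rejected.
///
/// `since_last_heartbeat` holds, for every peer that would remain after the
/// removal (an assumption of this sketch), the time since its last heartbeat.
/// An empty slice is rejected as well, since no healthy peer would remain.
pub fn should_reject_remove_node(
    since_last_heartbeat: &[Duration],
    heartbeat_interval: Duration,
) -> bool {
    let threshold = heartbeat_interval * SLOW_PEER_HEARTBEAT_MULTIPLIER;
    let slow = since_last_heartbeat
        .iter()
        .filter(|elapsed| **elapsed > threshold)
        .count();
    // Reject when at least half of the remaining peers are slow.
    slow * 2 >= since_last_heartbeat.len()
}
```

For example, with a 1 second heartbeat interval, two remaining peers last heard from 2 seconds and 9 seconds ago give one slow peer out of two, so the removal would be rejected.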

Signed-off-by: Qi Xu <[email protected]>

Co-authored-by: Qi Xu <[email protected]>
close tikv#15414

This PR refactors the subscription manager.
In general, it:
- Replaces the instance itself with a handle. This makes it a real reactor (with a real event loop).
- Handles the result of subscribing to a region via the message system instead of asynchronously. This will be the basis for making the subscription tracker thread-safe and (someday, hopefully) merging the basic libraries with TiCDC.

Based on the changes above, this PR also allows a region to be temporarily deregistered when we are about to reach the memory quota.

Signed-off-by: Yu Juncen <[email protected]>

Co-authored-by: Neil Shen <[email protected]>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…#16408)

close tikv#16368

This pull request addresses a corner case where `WakeUp` messages were being
ignored during I/O hang scenarios.

Signed-off-by: lucasliang <[email protected]>
…nels (tikv#16432)

ref tikv#15990

Reorder the Grafana dashboard so that related panels sit next to each other.

Signed-off-by: Connor1996 <[email protected]>
close tikv#16438

txn: Reserve lock data prefix `T` for future use

Signed-off-by: Ping Yu <[email protected]>
ref tikv#16234

* txn: refactor task into a module
* storage: refactor commands macro

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…ikv#16388)

close tikv#16382

Now, a newly established prepare disk snapshot backup stream will abort the former one.

Signed-off-by: Yu Juncen <[email protected]>
…16238)

ref tikv#16141

Implement garbage collection for the in-memory engine (backend part).

Signed-off-by: Spade  A <[email protected]>

Co-authored-by: tongjian <[email protected]>
close tikv#16445

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#16463

Use the standard `From` and `Into` traits.

Signed-off-by: hi-rustin <[email protected]>
close tikv#16465

Improve the remove-peer check: only check when the role being updated is voter.

Signed-off-by: tonyxuqqi <[email protected]>
ref tikv#16323

Update WriteBatch to assume a single skiplist and use RangeManager::contains.
Implement and test `get_value_cf_opt` for `HybridEngineSnapshot`.
Integrate single WriteBatch with HybridEngine.

Signed-off-by: Alex Feinberg <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#16449

1. Report the execution duration in the gRPC pool for every request.
2. Report the wait duration from other pools to the gRPC pool.

Signed-off-by: bufferflies <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#16465

Improve readability

Signed-off-by: Yang Zhang <[email protected]>
tikv#16239)

ref tikv#15874

This PR inspects the gap between each peer's `applied_log_index` and `commit_log_index` when restarting.

If the gap exceeds `leader_transfer_max_log_lag`, the peer is marked as
`pending for recovery`. Once the gap drops below `leader_transfer_max_log_lag`,
the amount of pending logs is considered acceptable.

The store is ready for leader rebalancing only once the share of ready peers exceeds
`min_recovery_ready_region_percent`. Until then, the store is marked `is_busy` to
avoid transferring leaders to it.
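
The check described above can be sketched roughly as below. The struct `PeerLogState` and the functions `is_pending_recovery` and `store_is_busy` are hypothetical names for this illustration. The sketch also assumes that a store with no peers is not busy, and that reaching the configured percentage counts as ready; the real boundary handling may differ.

```rust
/// Hypothetical snapshot of one peer's log positions after a restart.
pub struct PeerLogState {
    pub applied_log_index: u64,
    pub commit_log_index: u64,
}

/// A peer is pending recovery while its unapplied log gap exceeds the limit.
pub fn is_pending_recovery(peer: &PeerLogState, leader_transfer_max_log_lag: u64) -> bool {
    let gap = peer.commit_log_index.saturating_sub(peer.applied_log_index);
    gap > leader_transfer_max_log_lag
}

/// The store stays busy until enough peers have caught up.
pub fn store_is_busy(
    peers: &[PeerLogState],
    leader_transfer_max_log_lag: u64,
    min_recovery_ready_region_percent: u64,
) -> bool {
    if peers.is_empty() {
        // Assumption of this sketch: nothing to recover, so not busy.
        return false;
    }
    let ready = peers
        .iter()
        .filter(|p| !is_pending_recovery(p, leader_transfer_max_log_lag))
        .count() as u64;
    let ready_percent = ready * 100 / peers.len() as u64;
    ready_percent < min_recovery_ready_region_percent
}
```

A caller would re-evaluate `store_is_busy` as peers apply their pending logs and clear the busy state once it returns false.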

Signed-off-by: lucasliang <[email protected]>
…Service from PdWorker to it (tikv#16456)

ref tikv#16297

Add module health_controller and move SlowScore, SlowTrend, HealthService from PdWorker to it

Signed-off-by: MyonKeminta <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ref tikv#16463

Moved the analyze connect struct out of the big analyze.rs file and used the enum to represent the analyze version.
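
As a rough illustration of representing a version with an enum and converting it through the standard traits, here is a minimal sketch. The enum `AnalyzeVersion`, its variants, and the raw values 1 and 2 are assumptions made for this example and are not taken from the actual change.

```rust
use std::convert::TryFrom;

/// Hypothetical enum for the analyze version (variants are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AnalyzeVersion {
    V1,
    V2,
}

/// Fallible conversion from the raw integer carried in a request.
impl TryFrom<i32> for AnalyzeVersion {
    type Error = String;

    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(AnalyzeVersion::V1),
            2 => Ok(AnalyzeVersion::V2),
            other => Err(format!("unknown analyze version: {}", other)),
        }
    }
}

/// Infallible conversion back to the raw integer; `Into<i32>` comes for free.
impl From<AnalyzeVersion> for i32 {
    fn from(version: AnalyzeVersion) -> i32 {
        match version {
            AnalyzeVersion::V1 => 1,
            AnalyzeVersion::V2 => 2,
        }
    }
}
```

Compared with passing a bare integer around, the enum lets the compiler check that every version is handled in each `match`.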

Signed-off-by: hi-rustin <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
…v#16486)

close tikv#16465

When calculating the impact of a conf change, take all operations into consideration.

Signed-off-by: tonyxuqqi <[email protected]>
Signed-off-by: tonyxuqqi <[email protected]>
Signed-off-by: tonyxuqqi <[email protected]>
Signed-off-by: tonyxuqqi <[email protected]>
Signed-off-by: Qi Xu <[email protected]>
Signed-off-by: Qi Xu <[email protected]>