Tablets 4.x #241
@Bouncheck can you please describe its current status? Also, is a review already required, and do any blockers exist at the moment?
Currently I've made an almost complete switch to the solution that uses custom payloads instead of querying system.tablets.
6425a4f to 6669f85
6669f85 to c6730c6
c24d841 to 8d5575b
@Bouncheck can you please update us on the status and on what this PR is waiting for?
Currently it's waiting for review. I've done some manual testing, but haven't done any stress testing under heavy load.
Thanks @Bouncheck, can you please check the failing CI check? Regarding stress testing, I don't know if we have a tool/app that utilizes this driver version; IIRC there was a problem building c-s with this driver version. However, if that's no longer the case, or we can utilize a newer c-s from upstream that we can build with this driver, we can have a longevity test using SCT.
Will do. At first I thought the 2024.1.2 failures were a one-off, since it looked like there were a lot of similar test errors related to starting the cluster (and only for that Scylla version), but it seems they reproduce after all.
@Bouncheck what's the status of this PR? @Lorak-mmk can you please help with reviewing this?
I left some comments. Some things that fit better here:
- Why are there no tests for tablets? This is a complex feature, so it should have proper tests (both unit tests of the new data structure and integration tests that check that tablet routing works properly in various scenarios).
- I see you decided to store replica `Uuid`s instead of `Node` objects. This has a performance cost, because you need to allocate a new `HashSet` of all replica `Node` objects for each query. I assume it was done to simplify the code, because you don't need to handle the case where we can't map some id to `Node`, like I do in the maintenance procedures in the Rust Driver Tablets PR (Introduce support for Tablets scylla-rust-driver#937). But the maintenance is still required for cases like a node being replaced (lack of handling of this case is another problem with this PR), so handling the missing-UUID case doesn't really add much complexity.
- The PR is one big commit. It should be split into small atomic commits to make it easier to analyze for reviewers and future readers.
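To make the trade-off in the second point concrete, here is a minimal sketch of the two storage strategies. It is not the code from this PR: `ReplicaResolutionSketch`, `SketchNode`, `resolvePerQuery`, and `resolveOnce` are hypothetical names, and `SketchNode` only stands in for the driver's `Node`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

final class ReplicaResolutionSketch {

  /** Hypothetical stand-in for the driver's Node; only the host ID matters here. */
  static final class SketchNode {
    final UUID hostId;

    SketchNode(UUID hostId) {
      this.hostId = hostId;
    }
  }

  /**
   * Variant A: the tablet keeps host IDs, so every query allocates a fresh set.
   * IDs with no known node (for example, a replaced node) are skipped.
   */
  static Set<SketchNode> resolvePerQuery(
      Set<UUID> replicaIds, Map<UUID, SketchNode> nodesById) {
    Set<SketchNode> replicas = new HashSet<>();
    for (UUID id : replicaIds) {
      SketchNode node = nodesById.get(id);
      if (node != null) {
        replicas.add(node);
      }
    }
    return replicas;
  }

  /**
   * Variant B: resolve once, when the tablet is stored or refreshed by a
   * maintenance pass, and hand the same read-only set to every query.
   */
  static Set<SketchNode> resolveOnce(
      Set<UUID> replicaIds, Map<UUID, SketchNode> nodesById) {
    return Collections.unmodifiableSet(resolvePerQuery(replicaIds, nodesById));
  }
}
```

In variant B the missing-ID handling still exists, but it runs only when tablet metadata changes rather than on the query path.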
ping
Yes, I've seen them and I'm reworking those parts. Should be ready soon.
I'll likely push the code today - I want to run one more concurrency experiment.
8ccda18 to 17b1f40
@Bouncheck is it ready for re-review?
17b1f40 to da118b2
@dani-tweig - this important PR is not tracked anywhere - there's no 'tablets' label for it, etc.
da118b2 to cabda4d
cabda4d to e0fbffd
Pushed another version:
I see CI is failing, do you know why? Also, I remember you mentioned that the tablet tests were not being executed because of some edge case with version comparison. Is that fixed?
Last time I checked this test these same two methods ( The problem with the tablets integration test was that the CI used to pull a release candidate version, for which I noticed it was skipping this test. It's pulling 6.0.0 now, which is the minimal required version, and if you search the logs for
(Copied over from commit message)
Introduces basic tablets support for version 4.x of the driver.
Metadata about tablets will be kept in a TabletMap that gets continuously updated
through the tablets-routing-v1 extension. Each time a BoundStatement targets
the wrong node and shard combination, a server supporting tablets should
respond with tablet metadata inside the custom payload of its response.
This metadata will be transparently processed and used for future queries.
Tablets metadata will be enabled by default. Until now the driver used
TokenMaps to choose replicas and appropriate shards. Having a token was enough
information to do that. Now the driver will first attempt a tablet-based lookup,
and only after failing to find a corresponding tablet will it fall back to the
TokenMap lookup. Since finding the correct tablet requires the keyspace and table
names in addition to the token, many of the methods were extended to also accept
those as parameters.
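The lookup order described above can be sketched roughly as follows. This is an illustration only, not the driver's implementation: `TabletRoutingSketch`, its `Tablet` class, the string replica names, and the `tokenMapLookup` callback are all hypothetical, and the sketch assumes each tablet covers the token range `(firstToken, lastToken]`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

final class TabletRoutingSketch {

  /** Hypothetical tablet covering tokens in (firstToken, lastToken]. */
  static final class Tablet {
    final long firstToken;
    final long lastToken;
    final Set<String> replicas;

    Tablet(long firstToken, long lastToken, Set<String> replicas) {
      this.firstToken = firstToken;
      this.lastToken = lastToken;
      this.replicas = replicas;
    }
  }

  // "keyspace.table" -> tablets ordered by their last token.
  private final Map<String, NavigableMap<Long, Tablet>> tablets = new HashMap<>();

  void addTablet(String keyspace, String table, Tablet tablet) {
    tablets
        .computeIfAbsent(keyspace + "." + table, k -> new TreeMap<>())
        .put(tablet.lastToken, tablet);
  }

  /** Tablet-based lookup; empty when no known tablet covers the token. */
  Optional<Set<String>> tabletReplicas(String keyspace, String table, long token) {
    NavigableMap<Long, Tablet> perTable = tablets.get(keyspace + "." + table);
    if (perTable == null) {
      return Optional.empty();
    }
    // First tablet whose last token is >= the requested token.
    Map.Entry<Long, Tablet> entry = perTable.ceilingEntry(token);
    if (entry == null || token <= entry.getValue().firstToken) {
      return Optional.empty();
    }
    return Optional.of(entry.getValue().replicas);
  }

  /** Try tablets first; fall back to the token-only lookup on a miss. */
  Set<String> replicasFor(
      String keyspace, String table, long token,
      Function<Long, Set<String>> tokenMapLookup) {
    return tabletReplicas(keyspace, table, token)
        .orElseGet(() -> tokenMapLookup.apply(token));
  }
}
```

The sketch also shows why the keyspace and table names have to travel with the token: the token alone is enough for the fallback callback, but not for selecting the per-table tablet map.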
RequestHandlerTestHarness was adjusted to also mock MetadataManager.
Before, it used to mock only the `session.getMetadata()` call, but the same can be
obtained by `context.getMetadataManager().getMetadata()`. Using the second method
was causing test failures.
Fixes #268