[GLUTEN-6768][CH] Try to reorder hash join tables based on AQE statistics #6770

Merged: 2 commits, Aug 14, 2024

@lgbo-ustc lgbo-ustc commented Aug 9, 2024

What changes were proposed in this pull request?

Fixes: #6768

If the size gap between the two tables is too large, try to reorder them so that the smaller table is used to build the hash table.

With AQE, we can get the row count of each side from ShuffleQueryStageExec.
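
A minimal sketch of the decision described above, not the code in this patch. It assumes the row counts have already been read from the completed shuffle stages. `SideStats`, the local `BuildSide` types, `chooseBuildSide`, and the ratio threshold of 10 are all hypothetical names and values chosen for illustration; the PR description does not state what gap it treats as "too large".

```scala
// Hypothetical stand-in for the runtime statistics of one join input.
// rowCount is None when the stage has not reported statistics.
final case class SideStats(rowCount: Option[BigInt])

// Local illustration types; these are not Spark's own BuildSide classes.
sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

object JoinReorderSketch {
  // Assumed threshold: reorder only when one side has more than
  // `ratio` times the rows of the other.
  val DefaultRatio: Int = 10

  /** Picks the side to build the hash table on.
    * Keeps `current` when statistics are missing or the gap is small.
    */
  def chooseBuildSide(
      left: SideStats,
      right: SideStats,
      current: BuildSide,
      ratio: Int = DefaultRatio): BuildSide =
    (left.rowCount, right.rowCount) match {
      // Left is much smaller: build the hash table on the left input.
      case (Some(l), Some(r)) if l * BigInt(ratio) < r => BuildLeft
      // Right is much smaller: build the hash table on the right input.
      case (Some(l), Some(r)) if r * BigInt(ratio) < l => BuildRight
      // Unknown statistics or comparable sizes: leave the plan unchanged.
      case _ => current
    }
}
```

Falling back to the current build side when statistics are missing keeps the sketch conservative: the plan only changes when both row counts are known and clearly different. Whether a swap is valid for a given join type is a separate question that this sketch does not address.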

How was this patch tested?

unit tests

github-actions bot commented Aug 9, 2024

#6768

github-actions bot commented Aug 9, 2024

Run Gluten Clickhouse CI

@lgbo-ustc lgbo-ustc changed the title [GLUTEN-6768][CH] Try to reorder hash join tables [GLUTEN-6768][CH] Try to reorder hash join tables based on AQE statistics Aug 9, 2024

@@ -531,5 +531,33 @@ class GlutenClickHouseTPCHSuite extends GlutenClickHouseTPCHAbstractSuite {
spark.sql("drop table t1")
spark.sql("drop table t2")
}

test("GLUTEN-6768 rerorder hash join") {

Review comment from a contributor: Maybe check whether the join order is right?

@zzcclp zzcclp left a comment

LGTM

@zzcclp zzcclp merged commit fc7f9cd into apache:main Aug 14, 2024
6 checks passed
sharkdtu pushed a commit to sharkdtu/gluten that referenced this pull request Nov 11, 2024
…tics (apache#6770)

[CH] Try to reorder hash join tables based on AQE statistics
Successfully merging this pull request may close these issues.

[CH] A bad case for joining with mixed join conditions