[GLUTEN-5852] [CH] fix mismatch result columns size exception #5853
@ulysses-you @liujiayi771 please help to review, thanks.
@shuai-xu I currently don't have the env for CH, so I used Velox to test the case you provided. The result is consistent with vanilla Spark. Does this issue only occur in CH? Could you tell me what CH returns before the fix? I also tried debugging and did not enter into the code you modified. In addition, the original logic here is to reuse some duplicated pulled-out
This case only happens in CH. Before the fix, it throws the exception shown in the issue. The problem is that the reuse logic treats two different attributes with the same name as the same attribute, so CH later cannot tell whether they are really the same.
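To illustrate the point about same-named attributes, here is a minimal sketch in plain Scala. It does not use Gluten or Spark classes: `Attr`, its `exprId` field, and the name `_pre_0` are hypothetical stand-ins chosen for illustration. It only shows why a lookup keyed by name alone cannot distinguish two different attributes.

```scala
// Hypothetical stand-in for an attribute: a display name plus a unique id.
// This is NOT Spark's AttributeReference, only a simplified model.
final case class Attr(name: String, exprId: Long)

object NameCollisionSketch {
  // Two different columns that happen to share a name.
  val first: Attr = Attr("_pre_0", 1L)
  val second: Attr = Attr("_pre_0", 2L)

  // Keying by name alone conflates them: the second entry overwrites the first.
  val byName: Map[String, Attr] =
    Seq(first, second).map(a => a.name -> a).toMap

  // Keying by (name, exprId) keeps both entries apart.
  val byIdentity: Map[(String, Long), Attr] =
    Seq(first, second).map(a => (a.name, a.exprId) -> a).toMap
}
```

With name-only keys, one of the two attributes is silently dropped, which is the kind of ambiguity a backend cannot resolve afterwards.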
The literal should not appear in group keys. Can we add It seems we cannot remove it, since there is an alias wrapping the literal.
LGTM
07c1af3
val preProject = ProjectExec(
  eliminateProjectList(agg.child.outputSet, expressionMap.values.toSeq),
  agg.child)
// ISSUE-5852: literals with same names are lost in expressionMap, need addback to Project |
Was the modification of the logic in this code unnecessary in your earliest version?
Yes, the literals are not in the project outputs, which will cause a fallback.
gluten-core/src/main/scala/org/apache/gluten/utils/PullOutProjectHelper.scala
LGTM. cc @ulysses-you Do you have any other suggestions?
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
What changes were proposed in this pull request?
For SQL whose agg columns contain two identical const values, the constants are transformed to the same name. #5619 calls distinct on the agg's keys and outputs, so the const columns may be removed.
(Fixes: #5852)
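The effect described above can be sketched in plain Scala, under stated assumptions: `OutCol` is a hypothetical model of an output column, and the column names stand for an assumed query shape with one grouping key and two identical constants. This is not the code from #5619, only a demonstration of how `distinct` shrinks the column list.

```scala
// Hypothetical model of an output column; not Gluten's or Spark's real classes.
final case class OutCol(name: String)

object DistinctDropsConstSketch {
  // Assumed shape: one grouping key plus two identical constants,
  // which end up with the same name after transformation.
  val expected: Seq[OutCol] = Seq(OutCol("k"), OutCol("1"), OutCol("1"))

  // Deduplicating treats the two constants as a single column.
  val afterDistinct: Seq[OutCol] = expected.distinct

  // A consumer that still expects three columns now sees a size mismatch.
  val sizeMismatch: Boolean = afterDistinct.size != expected.size
}
```

Here `afterDistinct` holds two columns while three are expected, which corresponds to the mismatched result columns size named in the PR title.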
How was this patch tested?
This patch was tested by unit tests.