[VL] Fix shuffle with round robin partitioning fail #5928
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename commit message and pull request title in the following format?
cc @zhztheplayer @marin-ma thank you
Thanks. It seems this failure occurs because we create a hash computation over all input columns. If the input contains any NullType columns, those column types are converted to the UNKNOWN type in Velox, but Velox doesn't support hash computation on UNKNOWN types. I would suggest we keep this check and also drop the NullType input columns from the hash computation, as null values don't affect the hash computation.
@marin-ma Is the null type the only unsupported data type for the hash expression? If so, I can drop null columns before building the hash expression.
@marin-ma Never mind. I will drop null type columns in this PR.
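The idea discussed above, leaving NullType columns out of the hash computation, can be illustrated with a minimal sketch. This is not Gluten or Velox code: `Column`, `NULL_TYPE`, `hashable_columns`, and `row_hash` are hypothetical names, and Python's built-in `hash` stands in for the native hash expression. It only shows that a column that is always null can be dropped without changing how rows are distinguished.

```python
from dataclasses import dataclass

# Hypothetical stand-in for Spark's NullType; not a real Spark or Velox identifier.
NULL_TYPE = "null"


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str


def hashable_columns(schema):
    """Keep only the columns whose type is not the null type."""
    return [c for c in schema if c.data_type != NULL_TYPE]


def row_hash(row, schema):
    """Hash a row using only the non-null-typed columns of the schema."""
    cols = hashable_columns(schema)
    return hash(tuple(row[c.name] for c in cols))


schema = [
    Column("id", "int"),
    Column("placeholder", NULL_TYPE),  # every value in this column is null
    Column("name", "string"),
]
row = {"id": 1, "placeholder": None, "name": "a"}
```

In this sketch, hashing `row` against `schema` gives the same value as hashing it against a schema that never had the `placeholder` column, which is why dropping such columns is safe for ordering rows by hash.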
LGTM. Thanks for the fix!
What changes were proposed in this pull request?
We should validate the projection that is added before the sort rather than using ProjectExecTransformer directly. The hash expression may not be offloadable to native execution, for example when the child output contains a NullType column or another unsupported type.
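As a rough illustration of the validate-then-fall-back pattern described here, the sketch below decides whether the hash projection can run natively. It does not use Gluten's API: `UNSUPPORTED_NATIVE_TYPES`, `validate_hash_project`, and `plan_project_before_sort` are hypothetical names, and the set of unsupported types is an assumption made for the example.

```python
# Assumption for illustration only: types the native hash expression cannot handle.
UNSUPPORTED_NATIVE_TYPES = {"null"}


def validate_hash_project(input_types):
    """Return (ok, reason) for offloading a hash projection over input_types."""
    bad = [t for t in input_types if t in UNSUPPORTED_NATIVE_TYPES]
    if bad:
        return False, f"unsupported input types for hash: {bad}"
    return True, ""


def plan_project_before_sort(input_types):
    """Choose the native projection only when validation passes."""
    ok, _reason = validate_hash_project(input_types)
    return "native" if ok else "fallback"
```

Here `plan_project_before_sort(["int", "null"])` selects the fallback path instead of failing at execution time, which is the behavior this change aims for.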
The following query would fail:
How was this patch tested?
Added a test.