Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Support row type and fix subfield in filter push-down #6618

Merged
merged 1 commit into from
Jul 31, 2024

Conversation

rui-mo
Copy link
Contributor

@rui-mo rui-mo commented Jul 29, 2024

What changes were proposed in this pull request?

Supports the push-down of IsNull and IsNotNull filter for row type.
Filter on subfield cannot be pushed-down, and we put them in the remaining filter.

How was this patch tested?

With unit test.

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@rui-mo rui-mo changed the title Support row type and fix subfield in filter push-down [VL] Support row type and fix subfield in filter push-down Jul 29, 2024
@rui-mo rui-mo force-pushed the wip_row branch 4 times, most recently from 484abf4 to 4484120 Compare July 30, 2024 01:47
@rui-mo rui-mo force-pushed the wip_row branch 2 times, most recently from c0cc5ac to d5b685a Compare July 30, 2024 04:06
Copy link
Contributor Author

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FelixYBW @PHILO-HE Could you help take a review? Thanks.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice! cc @FelixYBW

@zhli1142015
Copy link
Contributor

zhli1142015 commented Jul 30, 2024

One question about the lgoic here.
I thought there is already logic to split filters to subfield filters and remaining filters in Velox: https://github.com/facebookincubator/velox/blob/main/velox/connectors/hive/HiveConnectorUtil.cpp#L790
I wonder what's the reason we maintain similar logic here or what's the difference between them. Thanks.

@rui-mo
Copy link
Contributor Author

rui-mo commented Jul 31, 2024

One question about the lgoic here. I thought there is already logic to split filters to subfield filters and remaining filters in Velox: https://github.com/facebookincubator/velox/blob/main/velox/connectors/hive/HiveConnectorUtil.cpp#L790 I wonder what's the reason we maintain similar logic here or what's the difference between them. Thanks.

@zhli1142015 Thanks for bringing this up. We were using our implementation from the start since Velox added the function extractFiltersFromRemainingFilter after we implemented it in Gluten. One distinction I see is that Gluten eliminates the filters to be pushed down from the call expression, whereas Velox does not. Before taking use of the Velox implementation, we might need to verify Gluten supported cases and fix the met issue if there is. We can then seamlessly transition to the Velox implementation. @FelixYBW Do you have any suggestion? Thanks.

@FelixYBW
Copy link
Contributor

@zhli1142015 Thanks for bringing this up. We were using our implementation from the start since Velox added the function extractFiltersFromRemainingFilter after we implemented it in Gluten. One distinction I see is that Gluten eliminates the filters to be pushed down from the call expression, whereas Velox does not. Before taking use of the Velox implementation, we might need to verify Gluten supported cases and fix the met issue if there is. We can then seamlessly transition to the Velox implementation. @FelixYBW Do you have any suggestion? Thanks.

Compared to Velox's, Is there any logic we missed?

@FelixYBW
Copy link
Contributor

Thank you, Rui for your fix.

@rui-mo
Copy link
Contributor Author

rui-mo commented Jul 31, 2024

Compared to Velox's, Is there any logic we missed?

@FelixYBW I notice Velox supports the pushdown of between and neq, while Gluten does not. But a previous quick trial shows Velox's implementation could create subfield filters that are not supported, which leads to runtime exception. We might need to do a full validation before using it.

@rui-mo rui-mo merged commit df6fe90 into apache:main Jul 31, 2024
41 checks passed
@FelixYBW
Copy link
Contributor

@FelixYBW I notice Velox supports the pushdown of between and neq, while Gluten does not. But a previous quick trial shows Velox's implementation could create subfield filters that are not supported, which leads to runtime exception. We might need to do a full validation before using it.

Thank you, Rui. Can you create an enhancement issue to track?

@rui-mo
Copy link
Contributor Author

rui-mo commented Aug 1, 2024

@FelixYBW Opened an issue to track #6666. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants