Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Try to convert a static list into a set in Rust #184

Merged
merged 3 commits into from
Mar 12, 2024

Conversation

advancedxy
Copy link
Contributor

Which issue does this PR close?

Closes #183

Rationale for this change

Leverage InSet optimization and improve performance.

What changes are included in this PR?

  1. use datafusion's in_list function to create an InListExpr
  2. optimize use group to make all the datafusion::physical_expr's items into a same group

How are these changes tested?

Existing tests.

@codecov-commenter
Copy link

codecov-commenter commented Mar 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 33.30%. Comparing base (488c523) to head (1e496c6).
Report is 3 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #184      +/-   ##
============================================
+ Coverage     33.29%   33.30%   +0.01%     
- Complexity      766      767       +1     
============================================
  Files           107      107              
  Lines         35385    35372      -13     
  Branches       7658     7657       -1     
============================================
+ Hits          11781    11782       +1     
+ Misses        21157    21144      -13     
+ Partials       2447     2446       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@advancedxy
Copy link
Contributor Author

I believe the failure is due to a BUG in Datafusion, which I have just filed one: apache/datafusion#9530

I will add a temporary fix here.

@advancedxy advancedxy closed this Mar 10, 2024
@advancedxy advancedxy reopened this Mar 10, 2024
@advancedxy
Copy link
Contributor Author

cc @viirya @sunchao

Ok(Arc::new(InListExpr::new(value, list, expr.negated, None)))
// in_list doesn't handle value being dictionary type correctly, so we need to fall
// back to InListExpr if in_list fails.
// TODO: remove the fallback when https://github.com/apache/arrow-datafusion/issues/9530 is fixed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can check if input_schema contains any dictionary type and use InListExpr if so?

Unfortunately dictionary type is treated as a logical type in DF, which caused many issues like this before. We created apache/datafusion#7421 but haven't got resources to work on it yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can check if input_schema contains any dictionary type and use InListExpr if so?

Sound good to me. I will address it in a following commit in probably ~8 hours as I'm going to bed now.

Unfortunately dictionary type is treated as a logical type in DF, which caused many issues like this before. We created apache/datafusion#7421 but haven't got resources to work on it yet.

yea.. I can imagine that a lot of issues/special handling of dictionary array would be introduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@sunchao sunchao changed the title fix: Try to convert a static list into a set in rust fix: Try to convert a static list into a set in Rust Mar 12, 2024
@sunchao sunchao merged commit 5dc865d into apache:main Mar 12, 2024
20 checks passed
@sunchao
Copy link
Member

sunchao commented Mar 12, 2024

Merged, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Perform InSet optimization in the native side
4 participants