Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove physical expr of NamedStructField, convert to get_field function call #9563

Merged
merged 18 commits into from
Mar 13, 2024

Conversation

yyy1000
Copy link
Contributor

@yyy1000 yyy1000 commented Mar 11, 2024

Which issue does this PR close?

Closes #9532 .

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added physical-expr Physical Expressions optimizer Optimizer rules core Core DataFusion crate labels Mar 11, 2024
@yyy1000 yyy1000 marked this pull request as draft March 12, 2024 01:19
@yyy1000 yyy1000 marked this pull request as ready for review March 12, 2024 16:08
Self::new()
}
}

impl ScalarUDFImpl for StructFunc {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last PR I make StructFunc pub so I add the Default trait, but since #9546 (comment) it will not be needed.

args,
),
)));
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the method from #9546 (comment), but it seems that here it can't get the context_provider?

Copy link
Contributor

@jayzhan211 jayzhan211 Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only sql crate can get it directly, or any other crate that has context

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could implement this rewrite more easily using the API proposed in #9583 (so you could use datafusion_functions directly)

}

let arr;
let array = match &args[0] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe try values_to_arrays?

impl GetFieldFunc {
pub fn new() -> Self {
Self {
signature: Signature::any(2, Volatility::Immutable),
Copy link
Contributor

@jayzhan211 jayzhan211 Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems the signature should be struct and utf8. But, we can set it to Any and utf8 for now.

Copy link
Contributor Author

@yyy1000 yyy1000 Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can also be Map and utf8.
A question for me is how to specify the Any as the param in Signature?
I tried Signature::exact(vec![Any, DataType::Utf8], Volatility::Immutable) but it said

expected value, found trait Any not a value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep it as any(2) for now. I forgot that it is better to introduce signature only if we need type coercion.

@@ -45,6 +45,7 @@ async-trait = { workspace = true }
chrono = { workspace = true }
datafusion-common = { workspace = true, default-features = true }
datafusion-expr = { workspace = true }
datafusion-functions = { workspace = true }
Copy link
Contributor

@alamb alamb Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is very important to not add this dependency -- we are triyng to make the core not know about the functions

I think #9583 will let us avoid it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I can wait on it and use that function.

@yyy1000 yyy1000 changed the title Remove physical expr of NamedStructField, convert to struct function Remove physical expr of NamedStructField, convert to get_field function Mar 13, 2024
@alamb
Copy link
Contributor

alamb commented Mar 13, 2024

Hi @yyy1000 -- I have merged #9583 -- is there any chance you can rebase this PR and try to use that feature here?

@yyy1000
Copy link
Contributor Author

yyy1000 commented Mar 13, 2024

Hi @yyy1000 -- I have merged #9583 -- is there any chance you can rebase this PR and try to use that feature here?

Yeah, I will try to use this feature now.

@github-actions github-actions bot removed the optimizer Optimizer rules label Mar 13, 2024
@yyy1000
Copy link
Contributor Author

yyy1000 commented Mar 13, 2024

Hi @alamb
I applied the feature and it looks good.
Could you review it again when you are available?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @yyy1000 for this contribution as well as for making it so easy to review these PRs (with annotations explaining potentially non obvious changes)

This is a wonderful change / simplification

@@ -222,12 +225,10 @@ fn create_physical_name(e: &Expr, is_first_expr: bool) -> Result<String> {
stride: _,
} => {
unreachable!(
"ListIndex should have been rewritten in OperatorToFunction"
"ListRange should have been rewritten in OperatorToFunction"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb alamb changed the title Remove physical expr of NamedStructField, convert to get_field function Remove physical expr of NamedStructField, convert to get_field function call Mar 13, 2024
@alamb alamb merged commit 3b61004 into apache:main Mar 13, 2024
25 checks passed
@yyy1000 yyy1000 deleted the replace-NamedStructField branch March 13, 2024 18:15
@yyy1000
Copy link
Contributor Author

yyy1000 commented Mar 13, 2024

Thank you @alamb @jayzhan211 for your reviews and guidance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace GetFieldAccessExpr::NamedStructField with ScalarFunction::Struct
3 participants