Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix concat simplifier for Utf8View types #13346

Merged
merged 5 commits into from
Nov 15, 2024

Conversation

timsaucer
Copy link
Contributor

Which issue does this PR close?

This addresses one item of #13330

Rationale for this change

Concat simplifier was not correctly combining the return types, so Utf8View was instead getting returned as Utf8, causing a schema failure.

What changes are included in this PR?

When combining schema in simplifier, follows the same rules to set the return type.

Are these changes tested?

Tested in unit tests and with tests in datafusion-python that identified the problem.
Added additional checks in current unit test.

Are there any user-facing changes?

None

@timsaucer timsaucer force-pushed the bugfix/string_view_concat branch from 6c9f13e to 8bad384 Compare November 11, 2024 13:18
@alamb alamb changed the title Bugfix/string view concat Fix concat simplifier for Utf8View types Nov 13, 2024
@@ -282,15 +293,34 @@ pub fn simplify_concat(args: Vec<Expr>) -> Result<ExprSimplifyResult> {
let mut new_args = Vec::with_capacity(args.len());
let mut contiguous_scalar = "".to_string();

let mut merged_type = DataType::Utf8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than re-computing the merged type, could you just call

fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {

And use that type?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @timsaucer -- I have a minor suggestion about how to clean up the code but I don't think it is necessary for merging this PR

@timsaucer
Copy link
Contributor Author

I’ll apply the suggestion, but out of pocket for a few days. Likely Friday

@alamb alamb marked this pull request as draft November 13, 2024 17:01
@alamb
Copy link
Contributor

alamb commented Nov 13, 2024

Converting to a draft so I don't accidentally merge this PR by accident :)

@timsaucer timsaucer marked this pull request as ready for review November 15, 2024 19:14
@timsaucer
Copy link
Contributor Author

@alamb this is ready for re-review

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @timsaucer

@alamb alamb merged commit 1e96a0a into apache:main Nov 15, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants