Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement native support StringView for overlay #11968

Merged
merged 4 commits into from
Aug 15, 2024
Merged

Conversation

PsiACE
Copy link
Member

@PsiACE PsiACE commented Aug 13, 2024

Which issue does this PR close?

Closes #11909 .

Rationale for this change

Ref: #11942

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions labels Aug 13, 2024
PsiACE added 3 commits August 14, 2024 00:51
Signed-off-by: Chojan Shang <[email protected]>
Signed-off-by: Chojan Shang <[email protected]>
@@ -726,7 +726,7 @@ EXPLAIN SELECT
FROM test;
----
logical_plan
01)Projection: overlay(CAST(test.column1_utf8view AS Utf8), Utf8("foo"), Int64(2)) AS c1
01)Projection: overlay(test.column1_utf8view, Utf8View("foo"), Int64(2)) AS c1
02)--TableScan: test projection=[column1_utf8view]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, can we please also add actually running these queries to the tests

like

query 
SELECT OVERLAY(column1_utf8view PLACING 'foo' FROM 2 ) as c1

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! 907e27e

Signed-off-by: Chojan Shang <[email protected]>
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @PsiACE

DataType::LargeUtf8 => make_scalar_function(overlay::<i64>, vec![])(args),
other => exec_err!("Unsupported data type {other:?} for function overlay"),
}
}
}

macro_rules! process_overlay {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have some sort of trait in arrow-rs that allowed us to write this as a generic function

It actually does have
https://github.com/apache/arrow-rs/blob/2461a16c19ee5032531b1c05dd7e7192bc842e0f/arrow-string/src/like.rs#L158-L161

But that is not public

@XiangpengHao do you know of anything that is pub?

We could also implement such a trait for DataFusion's convenience, and then propose upstreaming it 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just behind -- it seems that @Omega359 did exactly this in StringArrayType #11941

trait StringArrayType<'a>: ArrayAccessor<Item = &'a str> + Sized {
fn iter(&self) -> ArrayIter<Self>;
}
impl<'a, T: OffsetSizeTrait> StringArrayType<'a> for &'a GenericStringArray<T> {
fn iter(&self) -> ArrayIter<Self> {
GenericStringArray::<T>::iter(self)
}
}
impl<'a> StringArrayType<'a> for &'a StringViewArray {
fn iter(&self) -> ArrayIter<Self> {
StringViewArray::iter(self)
}
}

Maybe we can start to pull that trait into its own module and start reusing it across the string functions 🤔

Also, there is the ArrayAccessor pattern used elegantly by @devanbenz in #11967 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current style is inspired by rpad in #11942 . I'll be rewriting it with ArrayAccessor, which I've used in other PRs, and it's a much more elegant way.

@alamb alamb merged commit ea2e7ab into apache:main Aug 15, 2024
24 checks passed
@alamb
Copy link
Contributor

alamb commented Aug 15, 2024

Thanks @PsiACE

@PsiACE PsiACE deleted the overlay branch August 15, 2024 11:23
samuelcolvin pushed a commit to pydantic/datafusion that referenced this pull request Aug 15, 2024
* Implement native support StringView for overlay

Signed-off-by: Chojan Shang <[email protected]>

* Re-write impl of overlay

Signed-off-by: Chojan Shang <[email protected]>

* Minor update

Signed-off-by: Chojan Shang <[email protected]>

* Add more tests

Signed-off-by: Chojan Shang <[email protected]>

---------

Signed-off-by: Chojan Shang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
functions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update OVERLAY scalar function to support Utf8View
2 participants