-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update INITCAP scalar function to support Utf8View #11888
Conversation
.collect::<GenericStringArray<T>>(); | ||
|
||
Ok(Arc::new(result) as ArrayRef) | ||
} | ||
|
||
fn initcap_utf8view<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am considering it might be a bit heavy to make this a macro
, we wouldn't have a lot of initcap_* like this.
let result = string_view_array | ||
.iter() | ||
.map(initcap_string) | ||
.collect::<GenericStringArray<T>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type is a StringArray
instead of a StringViewArray
, should I alter this behavior? In previous it was defined here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current utf8_to_str_type
only return Utf8
or LargeUtf8
. I think ideally we should support returning Utf8View
. But since we are recreating the strings anyway, I'm not sure if StringView will help here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @xinlifoobar and @XiangpengHao
|
||
fn initcap_string(string: Option<&str>) -> Option<String> { | ||
string.map(|string: &str| { | ||
let mut char_vector = Vec::<char>::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect you could make this faster by creating the vector once and then resetting on each loop -- like
let mut char_vector = Vec::<char>::new();
string.map(|string: &str| {
char_vector.clear();
...
}
Thanks again @xinlifoobar and @XiangpengHao |
Which issue does this PR close?
Closes #11853 and part of #11790
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?