-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support Utf8View
for get_wider_type
+ binary_to_string_coercion
functions
#13370
base: main
Are you sure you want to change the base?
Conversation
Same with this, I would expect test that failed if the change is reverted |
@@ -1550,6 +1552,62 @@ mod tests { | |||
); | |||
} | |||
|
|||
#[test] | |||
fn test_get_wider_type_with_utf8view() { | |||
assert_eq!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not this kind of test, but a valid sql query that goes through the coercion function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayzhan211 Correct me if I'm wrong, but this seems to be a problem with arrow casting from numerics to utf8view, the same can be said for this, #13366. However correct me if I'm wrong, I can't seem to find a sql query that triggers it, although i believe there should be one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems only like_coercion
utilize this coercion. I think current coercion is unnecessary complex for like
operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean binary_to_string_coercion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond the one bug I noted this looks fine to me. I do think however there should be a ticket filed to look at the get_wider_type function and see if it can be removed altogether as there seems to be just one use case and this may (I haven't checked but I feel it's likely) duplicate a function in arrow-cast.
I am trying to clean up the review backlog -- what is the status of this PR? It seems like it is waiting on an end to end (e.g. sql or dataframe test) that actually has a problem / doesn't run without this PR but does run with it I played around with it and I think I came up with one: > create or replace table foo as select arrow_cast('1', 'Utf8View') as c;
0 row(s) fetched.
Elapsed 0.002 seconds.
> select c from foo;
+---+
| c |
+---+
| 1 |
+---+
1 row(s) fetched.
Elapsed 0.001 seconds.
> select c = 1 from foo;
Error during planning: Cannot infer common argument type for comparison operation Utf8View = Int64 But it works for strings: > create or replace table foo as select arrow_cast('1', 'Utf8') as c;
0 row(s) fetched.
Elapsed 0.002 seconds.
> select c = 1 from foo;
+------------------+
| foo.c = Int64(1) |
+------------------+
| true |
+------------------+
1 row(s) fetched.
Elapsed 0.002 seconds. Looks like we need to test with Also since the argument type order is important, also in both directions Does that make sense @jonathanc-n ? Thank you for your help |
@alamb I also believe that this one is also waiting on numeric to utf8view coercion pr in arrow-rs, correct me if I'm wrong though. |
We just updated to a version of arrow-rs that I think has the necessary changes: |
Which issue does this PR close?
Closes #13361
Closes #13360.
Rationale for this change
What changes are included in this PR?
Added string view to functions
Are these changes tested?
Are there any user-facing changes?