feat: DataFrame supports unnesting multiple columns #10118
/// Test unnesting a non-nullable list.
#[tokio::test]
async fn unnest_non_nullable_list() -> Result<()> {
Added the test suggested by @jayzhan211 in #10044 (comment)
👍
"| 3 | | | a |",
"+------+------------+------------+--------+",
];
assert_batches_sorted_eq!(expected, &results);
I think the result is deterministic, so we can check it without sorting.
Then the expected result becomes more straightforward:
[
"+------+------------+------------+--------+",
"| list | large_list | fixed_list | string |",
"+------+------------+------------+--------+",
"| 1 | | | a |",
"| 2 | 1.1 | | a |",
"| 3 | | | a |",
"| | 2.2 | 1 | b |",
"| | 3.3 | 2 | b |",
"| | 4.4 | | b |",
"| | | 3 | c |",
"| | | 4 | c |",
"| | | | d |",
"+------+------------+------------+--------+",
]
> I think the result is deterministic, so we can check without sorting
Updated. Thank you @jayzhan211. It looks much better now.
"| 3 | | | a |",
"+------+------------+------------+--------+",
];
assert_batches_sorted_eq!(expected, &results);
Same here
Done.
Nice! Thanks @jonahgao
Which issue does this PR close?
N/A
Rationale for this change
#10044 enabled SQL to unnest multiple columns; this PR adds the same functionality to DataFrame.
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?
Yes
Mark the unnest_column API as deprecated, and add a new API called unnest_columns.
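For readers unfamiliar with what unnesting several list columns at once produces, here is a minimal std-only sketch of the row expansion. It does not use DataFusion and is not the PR's implementation. The `Row` and `Unnested` types and the `unnest_two` function are hypothetical names for illustration. The sketch assumes that, within each input row, shorter lists are padded with nulls up to the longest list, and that a row whose lists are all null still yields one output row of nulls, which is how the expected table in the review thread reads.

```rust
/// One input row: two nullable list columns plus a scalar column.
/// Hypothetical types for illustration only; not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub list: Option<Vec<Option<i32>>>,
    pub large_list: Option<Vec<Option<f64>>>,
    pub string: String,
}

/// One output row after both list columns have been unnested.
#[derive(Debug, Clone, PartialEq)]
pub struct Unnested {
    pub list: Option<i32>,
    pub large_list: Option<f64>,
    pub string: String,
}

/// Expand every input row into as many output rows as its longest list.
/// Input order is preserved, so the output order is deterministic.
pub fn unnest_two(rows: &[Row]) -> Vec<Unnested> {
    let mut out = Vec::new();
    for row in rows {
        // Treat a null list as an empty slice for indexing purposes.
        let a = row.list.as_deref().unwrap_or(&[]);
        let b = row.large_list.as_deref().unwrap_or(&[]);

        // Assumption: a row whose lists are all null keeps one row of nulls.
        let all_null = row.list.is_none() && row.large_list.is_none();
        let n = a.len().max(b.len()).max(if all_null { 1 } else { 0 });

        for i in 0..n {
            out.push(Unnested {
                // Positions past the end of a shorter list become null.
                list: a.get(i).copied().flatten(),
                large_list: b.get(i).copied().flatten(),
                // Scalar columns are repeated for every expanded row.
                string: row.string.clone(),
            });
        }
    }
    out
}
```

Because the expansion walks the input rows in order, the output order is fixed by the input, which is the property the review relied on when dropping the sorted comparison.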