-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support RightAnti
for SortMergeJoin
#13680
Conversation
@comphead - are you able to review this? |
select 41 a, 42 b union all | ||
select 41 a, 45 b | ||
) | ||
select t2.* from t1 right anti join t2 on t1.a = t2.a and t1.b = t2.b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting thing: right ANTI/SEMI naturally not supported by SQL text, however DF supports it, DuckDB/PG will fail on query like that because of parse error.
DF runs it and picks the join type properly
| plan_type | plan |
+---------------+---------------------------------------------------------------------------------------------+
| logical_plan | Sort: t2.a ASC NULLS LAST, t2.b ASC NULLS LAST |
| | RightAnti Join: t1.a = t2.a, t1.b = t2.b |
From one side it is really convenient to test out queries with RIGHT ANTI/SEMI using SQL, from another it is not very expected behavior comparing to other engines.
WDYT should we do something with that? @alamb @Dandandan @ozankabak @jayzhan-synnada @findepi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DuckDB/PG failed because of not supported, in this case, I think it is fine that we have different result because we support the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to support the syntax, although it's not common for other engines to support it.
@@ -2910,6 +2984,310 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
#[tokio::test] | |||
async fn join_right_anti_one() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what one means here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meaning is "a column." This naming has already been updated.
@@ -1920,6 +1939,10 @@ impl SortMergeJoinStream { | |||
let output_column_indices = (0..left_columns_length).collect::<Vec<_>>(); | |||
filtered_record_batch = | |||
filtered_record_batch.project(&output_column_indices)?; | |||
} else if matches!(self.join_type, JoinType::RightAnti) { | |||
let output_column_indices = (0..right_columns_length).collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can factor out this part into separate method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @irenjj I think it is very very close
Thanks @comphead . I've made the changes based on your suggestions. Could you please take another look? |
@@ -2910,6 +2992,310 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
#[tokio::test] | |||
async fn join_right_anti_one_one() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @irenjj sorry I still not very sure what is one_one?
is it join by one column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @irenjj
Which issue does this PR close?
Closes #13472
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?