-
Notifications
You must be signed in to change notification settings - Fork 847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eq
for struct
#5423
eq
for struct
#5423
Conversation
Signed-off-by: jayzhan211 <[email protected]>
arrow-ord/src/cmp.rs
Outdated
let col_l = l.column(i); | ||
let col_r = r.column(i); | ||
let eq_rows = compare_op(op, col_l, col_r)?; | ||
for (j, item) in res.iter_mut().enumerate().take(len) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be faster to compute the bitwise and of the underlying buffers, than to do this element wise
Something like
let nulls = NullBuffer::union(l.nulls(), r.nulls());
let vals = l.values () & r.values();
BooleanArray::new(vals, nulls)
In combination with Iterator::fold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clippy told me to use try_fold instead, but it is so difficult to work with try_fold.
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
arrow-ord/src/cmp.rs
Outdated
} | ||
return Ok(res); | ||
return (0..l.num_columns()).fold( | ||
Ok(BooleanArray::from(vec![true; len])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok(BooleanArray::from(vec![true; len])), | |
Ok(BooleanArray::new(BooleanBuffer::new_set(len), None)), |
This will be significantly faster
Signed-off-by: jayzhan211 <[email protected]>
arrow-ord/src/cmp.rs
Outdated
return (0..l.num_columns()).fold( | ||
Ok(BooleanArray::new(BooleanBuffer::new_set(len), None)), | ||
|res, i| { | ||
let res = res?; | ||
let col_l = l.column(i); | ||
let col_r = r.column(i); | ||
let eq_rows = compare_op(op, col_l, col_r)?; | ||
|
||
let nulls = NullBuffer::union(res.nulls(), eq_rows.nulls()); | ||
let vals = res.values() & eq_rows.values(); | ||
Ok(BooleanArray::new(vals, nulls)) | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (0..l.num_columns()).fold( | |
Ok(BooleanArray::new(BooleanBuffer::new_set(len), None)), | |
|res, i| { | |
let res = res?; | |
let col_l = l.column(i); | |
let col_r = r.column(i); | |
let eq_rows = compare_op(op, col_l, col_r)?; | |
let nulls = NullBuffer::union(res.nulls(), eq_rows.nulls()); | |
let vals = res.values() & eq_rows.values(); | |
Ok(BooleanArray::new(vals, nulls)) | |
}, | |
) | |
return (0..l.num_columns()).try_fold( | |
BooleanArray::new(BooleanBuffer::new_set(len), None), | |
|res, i| { | |
let col_l = l.column(i); | |
let col_r = r.column(i); | |
let eq_rows = compare_op(op, col_l, col_r)?; | |
let nulls = NullBuffer::union(res.nulls(), eq_rows.nulls()); | |
let vals = res.values() & eq_rows.values(); | |
Ok(BooleanArray::new(vals, nulls)) | |
}, | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I did not return with Result Ok(BooleanArray::new(vals, nulls))
, but BooleanArray::new(vals, nulls))
Signed-off-by: jayzhan211 <[email protected]>
See #5408 (comment) I think we should probably start by adding support to DynComparator. In the case of StructArray there is likely still merit to having a specialized impl as in this PR as it will be significantly faster, but I think we will always need the fallback of DynComparator in order to support ListArray and friends. |
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
Ref #5411
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?