Add Series.all?/1 and Series.any?/1 #754
@costaraphael Thanks this is looking good! RE the approaches: I think consistency with Also, I think we want to avoid as much as possible doing work on the Elixir side. It's significantly more costly.
100%! When I said "on the Explorer side", I meant in the Rust code. Something like:

    #[rustler::nif(schedule = "DirtyCpu")]
    pub fn s_all(s: ExSeries) -> Result<bool, ExplorerError> {
        let s = s.clone_inner();
        Ok(s.bool()?.fill_null_with_values(false)?.all())
    }
Yeah, I can definitely see that. This is also how PostgreSQL deals with Do you think it makes sense to support all the modes through an option, and default to
Ah I'm just looking at the Polars implementations now:
They both provide an
That also implies not supporting your
That's true for the expressions, but the two different approaches are behind different functions in the eager case, which is why I figured it wouldn't be that much more trouble to support a third option. But I see your point, a client can always do My only concern is that the I'll push a commit later with the changes!
This looks good to me and it is a reasonable default. We can add the proposed options in the future if requested. :)
Co-authored-by: José Valim <[email protected]>
💚 💙 💜 💛 ❤️
This PR introduces two new aggregations: Series.all?/1 and Series.any?/1.

The main thing to keep in mind with these functions is how to deal with nil. I could see three approaches:

1. Ignore nil values (all?([nil, nil, nil]) == all?([]))
2. Treat nil as unknown, and return nil whenever it makes the result also unknown (any?([nil, true]) == true and all?([nil, false]) == false, but any?([nil, false]) == nil and all?([nil, true]) == nil). This is how the Series.and/2 and Series.or/2 functions behave
3. Treat nil as false, just like in the Enum functions (all?([nil, true]) == false)

I went with the first option here as it seems to be the default behavior in Polars, but I can see the appeal of any of them. The one thing to consider about the third option above is that it doesn't have native support in Polars, so it would need to be implemented on the Explorer side (should be straightforward with a call to fill_missing(false) before aggregating).

Another route I can see is supporting all behaviors, and letting the caller decide through an option.

Let me know what your thoughts are!
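For reference, the three nil-handling approaches described above can be sketched in plain Rust over a slice of Option<bool>. This is only an illustration of the semantics, not the code in this PR: the NilMode enum and the all/any helpers are hypothetical names, and nothing here calls Polars or Explorer.

```rust
/// Hypothetical enum naming the three ways nil (None) could be handled.
#[derive(Clone, Copy, Debug, PartialEq)]
enum NilMode {
    /// Approach 1: skip nil values entirely.
    Ignore,
    /// Approach 2: nil means "unknown" (three-valued logic).
    Unknown,
    /// Approach 3: nil counts as false.
    AsFalse,
}

/// Returns Some(result), or None when the result itself is unknown.
fn all(values: &[Option<bool>], mode: NilMode) -> Option<bool> {
    match mode {
        // Only the non-nil values take part in the aggregation.
        NilMode::Ignore => Some(values.iter().flatten().all(|&v| v)),
        // Every nil is replaced by false before aggregating.
        NilMode::AsFalse => Some(values.iter().all(|v| v.unwrap_or(false))),
        NilMode::Unknown => {
            if values.contains(&Some(false)) {
                // A single false decides the result regardless of nils.
                Some(false)
            } else if values.contains(&None) {
                // No false seen, but a nil could have been one.
                None
            } else {
                Some(true)
            }
        }
    }
}

/// Returns Some(result), or None when the result itself is unknown.
fn any(values: &[Option<bool>], mode: NilMode) -> Option<bool> {
    match mode {
        // Ignoring nils and treating them as false agree for any.
        NilMode::Ignore | NilMode::AsFalse => {
            Some(values.iter().flatten().any(|&v| v))
        }
        NilMode::Unknown => {
            if values.contains(&Some(true)) {
                // A single true decides the result regardless of nils.
                Some(true)
            } else if values.contains(&None) {
                // No true seen, but a nil could have been one.
                None
            } else {
                Some(false)
            }
        }
    }
}
```

Under this sketch, all(&[None, Some(true)], NilMode::Ignore) is Some(true), the same input with NilMode::AsFalse is Some(false), and with NilMode::Unknown it is None, which lines up with the examples in the list above.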