-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve monotonicity api #10117
improve monotonicity api #10117
Changes from all commits
fc0db97
349e12f
cbc08c1
b77eabb
d47b896
1266ed8
8379e7c
ea21acd
e5f72bb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -346,13 +346,60 @@ impl Signature { | |
} | ||
} | ||
|
||
/// Monotonicity of the `ScalarFunctionExpr` with respect to its arguments. | ||
/// Each element of this vector corresponds to an argument and indicates whether | ||
/// the function's behavior is monotonic, or non-monotonic/unknown for that argument, namely: | ||
/// - `None` signifies unknown monotonicity or non-monotonicity. | ||
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question. | ||
/// - Some(false) indicates that the function is monotonically decreasing w.r.t. the argument in question. | ||
pub type FuncMonotonicity = Vec<Option<bool>>; | ||
/// Monotonicity of a function with respect to its arguments. | ||
/// | ||
/// A function is [monotonic] if it preserves the relative order of its inputs. | ||
/// | ||
/// [monotonic]: https://en.wikipedia.org/wiki/Monotonic_function | ||
#[derive(Debug, Clone)] | ||
pub enum FuncMonotonicity { | ||
/// not monotonic or unknown monotonicity | ||
None, | ||
/// Increasing with respect to all of its arguments | ||
Increasing, | ||
/// Decreasing with respect to all of its arguments | ||
Decreasing, | ||
/// Each element of this vector corresponds to an argument and indicates whether | ||
/// the function's behavior is monotonic, or non-monotonic/unknown for that argument, namely: | ||
/// - `None` signifies unknown monotonicity or non-monotonicity. | ||
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question. | ||
/// - Some(false) indicates that the function is monotonically decreasing w.r.t. the argument in question. | ||
Mixed(Vec<Option<bool>>), | ||
} | ||
|
||
impl PartialEq for FuncMonotonicity { | ||
fn eq(&self, other: &Self) -> bool { | ||
match (self, other) { | ||
(FuncMonotonicity::None, FuncMonotonicity::None) => true, | ||
(FuncMonotonicity::Increasing, FuncMonotonicity::Increasing) => true, | ||
(FuncMonotonicity::Decreasing, FuncMonotonicity::Decreasing) => true, | ||
(FuncMonotonicity::Mixed(vec1), FuncMonotonicity::Mixed(vec2)) => { | ||
vec1 == vec2 | ||
} | ||
_ => false, | ||
} | ||
} | ||
} | ||
|
||
impl FuncMonotonicity { | ||
pub fn matches(&self, other: &Self) -> bool { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note: The Also, I don't think we need the arity of the function anymore for comparison b/w For eg: |
||
match (self, other) { | ||
(FuncMonotonicity::None, FuncMonotonicity::Mixed(inner_vec)) | ||
| (FuncMonotonicity::Mixed(inner_vec), FuncMonotonicity::None) => { | ||
inner_vec.iter().all(|&x| x.is_none()) | ||
} | ||
(FuncMonotonicity::Increasing, FuncMonotonicity::Mixed(inner_vec)) | ||
| (FuncMonotonicity::Mixed(inner_vec), FuncMonotonicity::Increasing) => { | ||
inner_vec.iter().all(|&x| x == Some(true)) | ||
} | ||
(FuncMonotonicity::Decreasing, FuncMonotonicity::Mixed(inner_vec)) | ||
| (FuncMonotonicity::Mixed(inner_vec), FuncMonotonicity::Decreasing) => { | ||
inner_vec.iter().all(|&x| x == Some(false)) | ||
} | ||
_ => self == other, | ||
} | ||
} | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -251,10 +251,23 @@ pub fn out_ordering( | |
func: &FuncMonotonicity, | ||
arg_orderings: &[SortProperties], | ||
) -> SortProperties { | ||
func.iter().zip(arg_orderings).fold( | ||
arg_orderings.iter().enumerate().fold( | ||
SortProperties::Singleton, | ||
|prev_sort, (item, arg)| { | ||
let current_sort = func_order_in_one_dimension(item, arg); | ||
|prev_sort, (index, arg)| { | ||
let arg_monotonicity: Option<bool> = match func { | ||
FuncMonotonicity::None => None, | ||
FuncMonotonicity::Increasing => Some(true), | ||
FuncMonotonicity::Decreasing => Some(false), | ||
FuncMonotonicity::Mixed(inner_vec) => { | ||
Comment on lines
+257
to
+261
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The earlier I do think that the conversion from If the conversion still feels confusing, we can leave a note here mentioning this for clarity:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree the conversion feels confusing --
I would be interested in trying to help here as I think the major benefit of this change is to encapsulate the monotonicity calculations. I'll give it a look There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we could add something like enum Monotonicity {
Increasing,
Decreasing,
Unknown,
} And then represent I would love to get a chance to pla around with it, but may not have time for a while |
||
if inner_vec.len() > index { | ||
inner_vec[index] | ||
} else { | ||
None | ||
} | ||
} | ||
}; | ||
|
||
let current_sort = func_order_in_one_dimension(&arg_monotonicity, arg); | ||
|
||
match (prev_sort, current_sort) { | ||
(_, SortProperties::Unordered) => SortProperties::Unordered, | ||
|
@@ -299,3 +312,39 @@ fn func_order_in_one_dimension( | |
} | ||
} | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use arrow_schema::Schema; | ||
|
||
use datafusion_common::{DFSchema, Result}; | ||
use datafusion_expr::{FuncMonotonicity, ScalarUDF}; | ||
|
||
use crate::utils::tests::TestScalarUDF; | ||
use crate::ScalarFunctionExpr; | ||
|
||
use super::create_physical_expr; | ||
|
||
#[test] | ||
fn test_function_expr() -> Result<()> { | ||
let udf = ScalarUDF::from(TestScalarUDF::new()); | ||
|
||
let e = crate::expressions::lit(1.1); | ||
let p_expr = | ||
create_physical_expr(&udf, &[e], &Schema::empty(), &[], &DFSchema::empty())?; | ||
let expr_monotonicity = p_expr | ||
.as_any() | ||
.downcast_ref::<ScalarFunctionExpr>() | ||
.unwrap() | ||
.monotonicity(); | ||
|
||
assert_eq!(expr_monotonicity, &Some(FuncMonotonicity::Increasing)); | ||
|
||
assert!(expr_monotonicity | ||
.as_ref() | ||
.unwrap() | ||
.matches(&FuncMonotonicity::Mixed(vec![Some(true)]))); | ||
|
||
Ok(()) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These public API's don't check the size of the inner vector. Giving a larger index would panic the code. We can wrap it with a result here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These APIs aren't being used anywhere. I think we can just remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update:
I've removed the unused
arg_increasing
&arg_decreasing
public APIs.For reference, the only place we're currently using FuncMonotonicity is this:
https://github.com/tinfoil-knight/arrow-datafusion/blob/53d9e30a7561c97492e47e3ee1679885b6c510e6/datafusion/physical-expr/src/scalar_function.rs#L250-L263