-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve monotonicity api #10117
improve monotonicity api #10117
Conversation
The problem I see with this representation is that I think we can find a refactor that increases readability but doesn't introduce such traps and preserves uniqueness. Maybe a struct with an inner |
This comment was marked as outdated.
This comment was marked as outdated.
I think this is an excellent point
Another potential solution that would avoid allocating Vec so much would be to make the enum private and only allow construction through APIs Maybe something like // Non pub to ensure can only construct valid representations
pub enum FuncMonotonicityInner {
None,
Increasing,
Decreasing,
Mixed(Vec<Option<bool>>),
}
/// Describes function monotonicity
pub struct FuncMonotonicity(FuncMonotonicityInner);
impl FuncMontonicity {
pub fn new_none() -> Self {..}
pub fn new_increasing() -> Self { .. }
pub fn new_decreasing() -> Self { .. }
pub fn new_mixed(inner: Vec<bool>) -> { .. }
/// returns true if this function is monotonically increasing with respect to argument number arg
pub fn arg_increasing(&self, arg: usize) -> bool { .. }
/// returns true if this function is monotonically decreasing with respect to argument number arg
pub fn arg_decreasing(&self, arg: usize) -> bool { .. }
} In my opinion the real value of a struct over a typedef is that it reduces the cognative load of remembering "what does |
I like your new proposal. In this one, Instead of |
@tinfoil-knight -- are you interested in completing this PR with the proposed API updates? |
@alamb Yeah. I just have some college exams this week so might get delayed a bit but I'll pick this up as soon as possible. |
No worries -- good luck with the exams! |
Converting to draft as it is not waiting on feedback |
datafusion/expr/src/signature.rs
Outdated
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question. | ||
/// - Some(false) indicates that the function is monotonically decreasing w.r.t. the argument in question. | ||
pub type FuncMonotonicity = Vec<Option<bool>>; | ||
#[derive(Debug, Clone, PartialEq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tinfoil-knight. I can just say this PartialEq
derivation could be dangerous here. One can expect to see the equality of Increasing and vec![Some(true)].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Makes sense. I've fixed this in d47b896
(#10117)
FuncMonotonicityPartial::Increasing => false, | ||
FuncMonotonicityPartial::Decreasing => true, | ||
FuncMonotonicityPartial::Mixed(inner) => inner[arg].unwrap_or(false), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These public API's don't check the size of the inner vector. Giving a larger index would panic the code. We can wrap it with a result here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These APIs aren't being used anywhere. I think we can just remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update:
I've removed the unused arg_increasing
& arg_decreasing
public APIs.
For reference, the only place we're currently using FuncMonotonicity is this:
53d9e30
to
8379e7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left my comments for things to fix/improve. Thanks @tinfoil-knight
datafusion/expr/src/signature.rs
Outdated
impl From<&FuncMonotonicity> for Vec<Option<bool>> { | ||
fn from(val: &FuncMonotonicity) -> Self { | ||
match &val.0 { | ||
FuncMonotonicityPartial::None => vec![None], | ||
FuncMonotonicityPartial::Increasing => vec![Some(true)], | ||
FuncMonotonicityPartial::Decreasing => vec![Some(false)], | ||
FuncMonotonicityPartial::Mixed(inner) => inner.to_vec(), | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this conversion is possible to do. Given only Increasing
, you can not produce the vector without knowing the arity of the function. Assuming an arity of one (which seems to be the case here) can lead to subtle bugs and/or compile-time errors in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @ozankabak it is confusing to have a way to convert back to Vec<Option<bool>>
-- I think it would be easier to understand if all comparisons are done directly on FuncMonotonicity
rather than converting to Vec<Option>
datafusion/expr/src/signature.rs
Outdated
|
||
impl PartialEq for FuncMonotonicity { | ||
fn eq(&self, other: &Self) -> bool { | ||
Into::<Vec<Option<bool>>>::into(self) == Into::<Vec<Option<bool>>>::into(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given my comment on From
, you will need explicit logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this perhaps:
match (self, other) {
(Self::None, None) => true,
(Self::Increasing, Self::Increasing) => true,
(Self::Decreasing, Self::Decreasing) => true,
(Self::Partial(self_inner), Self::Partial(other_inner)) => self_inner == other_inner,
_ => false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We also need to do configs like (Self::Partial(inner), Self::Increasing)
but this is the idea 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that type of comparsion belongs in something other than equal
-- perhaps
impl FuncMonotonicity {
fn matches(&self, &other) -> bool {
...
}
}
So we can separate the equality check from the both sides match 🤔
let monotonicity_vec: Vec<Option<bool>> = func.into(); | ||
|
||
monotonicity_vec.iter().zip(arg_orderings).fold( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic will unfortunately get a little complicated given the issue with From
.
datafusion/expr/src/signature.rs
Outdated
#[derive(Debug, Clone)] | ||
enum FuncMonotonicityPartial { | ||
/// not monotonic or unknown monotonicity | ||
None, | ||
/// Increasing with respect to all of its arguments | ||
Increasing, | ||
/// Decreasing with respect to all of its arguments | ||
Decreasing, | ||
/// Each element of this vector corresponds to an argument and indicates whether | ||
/// the function's behavior is monotonic, or non-monotonic/unknown for that argument, namely: | ||
/// - `None` signifies unknown monotonicity or non-monotonicity. | ||
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question. | ||
/// - Some(false) indicates that the function is monotonically decreasing w.r.t. the argument in question. | ||
Mixed(Vec<Option<bool>>), | ||
} | ||
|
||
/// Monotonicity of a function with respect to its arguments. | ||
/// | ||
/// A function is [monotonic] if it preserves the relative order of its inputs. | ||
/// | ||
/// [monotonic]: https://en.wikipedia.org/wiki/Monotonic_function | ||
#[derive(Debug, Clone)] | ||
pub struct FuncMonotonicity(FuncMonotonicityPartial); | ||
|
||
impl FuncMonotonicity { | ||
pub fn new_none() -> Self { | ||
Self(FuncMonotonicityPartial::None) | ||
} | ||
pub fn new_increasing() -> Self { | ||
Self(FuncMonotonicityPartial::Increasing) | ||
} | ||
pub fn new_decreasing() -> Self { | ||
Self(FuncMonotonicityPartial::Decreasing) | ||
} | ||
pub fn new_mixed(inner: Vec<Option<bool>>) -> Self { | ||
Self(FuncMonotonicityPartial::Mixed(inner)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use just one data type? Consider:
/// Monotonicity of a function with respect to its arguments.
///
/// A function is [monotonic] if it preserves the relative order of its inputs.
///
/// [monotonic]: https://en.wikipedia.org/wiki/Monotonic_function
#[derive(Debug, Clone)]
pub enum FuncMonotonicity {
/// Not monotonic or unknown monotonicity.
None,
/// Increasing with respect to all of its arguments.
Increasing,
/// Decreasing with respect to all of its arguments.
Decreasing,
/// Each element of the vector corresponds to an argument and indicates whether
/// the function's behavior is monotonic or non-monotonic/unknown for that argument, namely:
/// - `None` signifies unknown monotonicity or non-monotonicity.
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question.
/// - `Some(false)` indicates that the function is monotonically decreasing w.r.t. the argument in question.
Partial(Vec<Option<bool>>),
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tinfoil-knight and @ozankabak -- the PR is making progress (and it is a good example of iterative API design). Thank you for sticking with it
datafusion/expr/src/signature.rs
Outdated
impl From<&FuncMonotonicity> for Vec<Option<bool>> { | ||
fn from(val: &FuncMonotonicity) -> Self { | ||
match &val.0 { | ||
FuncMonotonicityPartial::None => vec![None], | ||
FuncMonotonicityPartial::Increasing => vec![Some(true)], | ||
FuncMonotonicityPartial::Decreasing => vec![Some(false)], | ||
FuncMonotonicityPartial::Mixed(inner) => inner.to_vec(), | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @ozankabak it is confusing to have a way to convert back to Vec<Option<bool>>
-- I think it would be easier to understand if all comparisons are done directly on FuncMonotonicity
rather than converting to Vec<Option>
datafusion/expr/src/signature.rs
Outdated
|
||
impl PartialEq for FuncMonotonicity { | ||
fn eq(&self, other: &Self) -> bool { | ||
Into::<Vec<Option<bool>>>::into(self) == Into::<Vec<Option<bool>>>::into(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this perhaps:
match (self, other) {
(Self::None, None) => true,
(Self::Increasing, Self::Increasing) => true,
(Self::Decreasing, Self::Decreasing) => true,
(Self::Partial(self_inner), Self::Partial(other_inner)) => self_inner == other_inner,
_ => false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments to clarify things.
let arg_monotonicity: Option<bool> = match func { | ||
FuncMonotonicity::None => None, | ||
FuncMonotonicity::Increasing => Some(true), | ||
FuncMonotonicity::Decreasing => Some(false), | ||
FuncMonotonicity::Mixed(inner_vec) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The earlier impl From<&FuncMonotonicity> for Vec<Option<bool>>
was confusing & incorrect so it has been removed now.
I do think that the conversion from FuncMonotonicity
to Option<bool>
is required in this function.
I tried a few different ways (without conversion) but it was resulting in really messy code.
If the conversion still feels confusing, we can leave a note here mentioning this for clarity:
/// - `None` signifies unknown monotonicity or non-monotonicity.
/// - `Some(true)` indicates that the function is monotonically increasing w.r.t. the argument in question.
/// - `Some(false)` indicates that the function is monotonically decreasing w.r.t. the argument in question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the conversion feels confusing --
I do think that the conversion from FuncMonotonicity to Option is required in this function.
I tried a few different ways (without conversion) but it was resulting in really messy code.
I would be interested in trying to help here as I think the major benefit of this change is to encapsulate the monotonicity calculations. I'll give it a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could add something like
enum Monotonicity {
Increasing,
Decreasing,
Unknown,
}
And then represent FuncMonotonicity::Mixed
with a Vec<Monotonicity>
I would love to get a chance to pla around with it, but may not have time for a while
} | ||
|
||
impl FuncMonotonicity { | ||
pub fn matches(&self, other: &Self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: The matches
method isn't currently being used anywhere in the codebase. It has been added to make comparisons easier for any future purposes.
Also, I don't think we need the arity of the function anymore for comparison b/w FuncMonotonicity::Mixed
and other types.
For eg: FuncMonotonicity::Increasing
is "Increasing with respect to all of its arguments" which is the same as the FuncMonotonicity::Mixed
's inner vector being Some(true)
for all elements from what I understand.
Thanks @tinfoil-knight -- I will check this out later today |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tinfoil-knight
let arg_monotonicity: Option<bool> = match func { | ||
FuncMonotonicity::None => None, | ||
FuncMonotonicity::Increasing => Some(true), | ||
FuncMonotonicity::Decreasing => Some(false), | ||
FuncMonotonicity::Mixed(inner_vec) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the conversion feels confusing --
I do think that the conversion from FuncMonotonicity to Option is required in this function.
I tried a few different ways (without conversion) but it was resulting in really messy code.
I would be interested in trying to help here as I think the major benefit of this change is to encapsulate the monotonicity calculations. I'll give it a look
let arg_monotonicity: Option<bool> = match func { | ||
FuncMonotonicity::None => None, | ||
FuncMonotonicity::Increasing => Some(true), | ||
FuncMonotonicity::Decreasing => Some(false), | ||
FuncMonotonicity::Mixed(inner_vec) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could add something like
enum Monotonicity {
Increasing,
Decreasing,
Unknown,
}
And then represent FuncMonotonicity::Mixed
with a Vec<Monotonicity>
I would love to get a chance to pla around with it, but may not have time for a while
I believe this PR has been superceded by #10504 which removed the montonicity apu in favor of a more expressive bounds analysis, so closing this pR Thanks for the work on this @tinfoil-knight , and sorry we went another way. BTW @berkaysynnada in the future if you are working on a PR that might make another become obselete it might be nice to drop a note on the other PR to given the author(s) a heads up. |
Sorry for the inconvenience, @tinfoil-knight. My main motivation for #10504 was not specifically related to the monotonicity API, but I had to make changes in those areas to achieve a better overall design. I apologize for not informing you earlier. |
No worries @berkaysynnada and @alamb. It's difficult to keep track of everyone's work in such a large project. It was fun to work on this and I learnt a few things from everyone's review. |
Thanks everyone for the communication. I just want everyone working on this project to have a good experience -- I am glad to hear that it was. Thanks again @tinfoil-knight and @berkaysynnada |
Which issue does this PR close?
Closes #9879 .
Rationale for this change
The
Vec<Option<bool>>
type used to express the monotonicity of scalar functions wrt their arguments was really difficult to understand.Note: Currently only mathematical and date-time functions are considered for monotonicity.
What changes are included in this PR?
FuncMonotonicity
has been changed from the typeVec<Option<bool>>
to an enum.Are these changes tested?
By existing tests.
Are there any user-facing changes?
No.