In the lead/lag user-defined window functions, which accept 1-3 arguments, the shift_offset (2nd argument) and default_value (3rd argument) are saved as fields of the WindowShiftEvaluator struct, which implements PartitionEvaluator.
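struct WindowShiftEvaluator {
    // ... (earlier fields, including shift_offset and default_value, not shown)
    // VecDeque contains offset values that between non-null entries
    non_null_offsets: VecDeque<usize>,
}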
The arguments are parsed once and cached when the partition evaluator executes. This allows correct operation even though expressions() returns only the first input expression to the partition evaluator.
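A minimal sketch of that caching pattern, using simplified stand-in types rather than DataFusion's actual WindowShiftEvaluator or PartitionEvaluator. The names ShiftEvaluator, new, and evaluate_all, the i64 value type, and the default offset of 1 are all assumptions made for illustration:

```rust
// Hypothetical, simplified stand-in for a lag-style evaluator.
// The offset and default value are stored as fields, so only the
// value column needs to be passed in at evaluation time.
#[derive(Debug)]
struct ShiftEvaluator {
    shift_offset: i64,
    default_value: Option<i64>,
}

impl ShiftEvaluator {
    /// Parse the optional 2nd and 3rd arguments once and cache them.
    /// Falling back to an offset of 1 is an assumption of this sketch.
    fn new(offset: Option<i64>, default_value: Option<i64>) -> Self {
        Self {
            shift_offset: offset.unwrap_or(1),
            default_value,
        }
    }

    /// Evaluate over a single value column; rows whose shifted index
    /// falls outside the partition get the cached default value.
    fn evaluate_all(&self, values: &[i64]) -> Vec<Option<i64>> {
        (0..values.len() as i64)
            .map(|i| {
                let j = i - self.shift_offset;
                if j >= 0 && (j as usize) < values.len() {
                    Some(values[j as usize])
                } else {
                    self.default_value
                }
            })
            .collect()
    }
}
```

Because the evaluator only ever reads one column here, receiving just the first input expression is enough. A UDF that reads two columns at evaluation time has no equivalent workaround.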
Based on your downstream example, I now see that this API is overly restrictive. I agree that the default behavior should be for expressions() to return all the input expressions.
Apologies for the time you spent investigating this issue 🙏
Is your feature request related to a problem or challenge?
#12857 added the expressions method to the WindowUDFImpl trait. The default impl currently only takes the first input_expr and discards the rest.
datafusion/datafusion/expr/src/udwf.rs, lines 314 to 319 in 444a673
This altered behavior caused a regression in a two column Window UDF used in datafusion-python tests. See this comment detailing the error and this commit that fixes it.
Describe the solution you'd like
The default behavior should just return all the input expressions.
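To make the difference concrete, here is a rough sketch that contrasts the two defaults. It uses simplified stand-in types, not the real DataFusion API: Expr, ExpressionArgs, WindowUdf, first_only, and TwoColumnUdf are hypothetical names chosen for illustration only.

```rust
// Hypothetical stand-in for an input expression.
#[derive(Debug, Clone, PartialEq)]
struct Expr(String);

// Hypothetical stand-in for the arguments handed to `expressions`.
struct ExpressionArgs<'a> {
    input_exprs: &'a [Expr],
}

// Simplified stand-in for the UDF trait.
trait WindowUdf {
    /// Proposed default: forward every input expression unchanged.
    fn expressions(&self, args: &ExpressionArgs) -> Vec<Expr> {
        args.input_exprs.to_vec()
    }
}

/// The first-only behavior described in this issue: keep the first
/// input expression and discard the rest.
fn first_only(args: &ExpressionArgs) -> Vec<Expr> {
    args.input_exprs
        .first()
        .map_or(vec![], |e| vec![e.clone()])
}

// A two-column UDF that relies on the default implementation.
struct TwoColumnUdf;
impl WindowUdf for TwoColumnUdf {}
```

With two input expressions, first_only yields a single expression, while the proposed default hands both to the evaluator. Implementations that need only a subset could still override expressions.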
Describe alternatives you've considered
I didn't find any justification for only taking the first input_expr as the default behavior, so probably I'm missing something.
@jcsherin - please let me know if there's a reason that I'm missing for this behavior.
Additional context
No response