-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Extend the simple UDAF interface with function-level variables #12067
base: main
Are you sure you want to change the base?
feat: Extend the simple UDAF interface with function-level variables #12067
Conversation
✅ Deploy Preview for meta-velox canceled.
|
9013ee9
to
9775661
Compare
@kagamiori Some time ago, we attempted to add function-level states to the simple UDAF interface in #9167 and we have discussed some details in #8711, but @mbasmanova suggested that we implement it in a manner similar to the simple scalar functions interface. You had previously provided a specific implementation plan, and this PR is implemented according to the plan you proposed at that time. However, I did not add support for constantInput, as we currently only need to use rawInputType and resultType. Adding constantInput would require introducing new variables in the AggregateFunctionFactory, and I believe we can expand it when we actually need it. We currently need this capability to implement the decimal average aggregate function in Spark. Could you help to take a look. cc @rui-mo. |
4ad7ba7
to
94af18b
Compare
94af18b
to
8d84155
Compare
Sure, let me take a look. This was the second prototype after getting @mbasmanova's suggestions: #9775. It looks like this PR has the same design 👍. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @liujiayi771, this looks mostly good to me except a few nits. Thank you for extending the simple UDAF interface to improve its usability!
velox/exec/SimpleAggregateAdapter.h
Outdated
TypePtr resultType) | ||
: Aggregate(std::move(resultType)), fn_{std::make_unique<FUNC>()} { | ||
if constexpr (support_initialize_) { | ||
FUNC::initialize(fn_.get(), step, argTypes, resultType_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Just fn_->initialize(step, argTypes, resultType_)
;
velox/exec/SimpleAggregateAdapter.h
Outdated
@@ -103,6 +110,7 @@ class SimpleAggregateAdapter : public Aggregate { | |||
// These functions are called on groups of both non-null and null | |||
// accumulators. These functions also return a bool indicating whether the | |||
// current group should be a NULL in the result vector. | |||
std::unique_ptr<FUNC> fn_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Move the member variable to the bottom of the class.
using IntermediateType = Row<int64_t, double>; | ||
using OutputType = double; | ||
|
||
TypePtr inputType_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Add a comment above to say explicitly that these are the function level variable we're going to test, and explain that they are set during the creation of the aggregation function and checked in addInput().
resultType, | ||
name, | ||
valueType); | ||
valueType, | ||
step, | ||
argTypes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: reorder the arguments to be name, step, argType, resultType, valueType
to be more consistent with the lambda's parameter list.
const TypePtr& inputType, | ||
const TypePtr& sumType, | ||
const TypePtr& resultType) { | ||
const TypePtr& resultType, | ||
core::AggregationNode::Step step, | ||
const std::vector<TypePtr>& argTypes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Reorder the parameter list to be step, argTypes, resultType, inputType, sumType
.
TypePtr inputType; | ||
TypePtr resultType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inputType -> inputType_, resultType -> resultType_
Some aggregate functions implemented based on the simple UDAF interface need to obtain information such as step, argTypes, and resultType in the AccumulatorType struct. This PR adds the capability for user-defined variables to the simple UDAF interface, supporting the passing of step, argTypes, and resultType to AccumulatorType struct when registering the SimpleAggregateAdapter.