[CH] Got Exception: The order of aggregation result columns is invalid #8142

Closed
lgbo-ustc opened this issue Dec 4, 2024 · 4 comments · Fixed by #8164
Labels
bug Something isn't working triage

Comments

@lgbo-ustc
Contributor

Backend

CH (ClickHouse)

Bug description

Job aborted due to stage failure: Task 0 in stage 323.0 failed 2 times, most recent failure: Lost task 0.1 in stage 323.0 (TID 46601) (sg-dn3538.bigdata.bigo.inner executor 2054): org.apache.gluten.exception.GlutenException: The order of aggregation result columns is invalid
0. ../contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000014a82559
1. ./build_new/../src/Common/Exception.cpp:109: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x00000000069dfc39
2. ../src/Common/Exception.h:111: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000689598c
3. ../src/Common/Exception.h:129: DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x00000000068879eb
4. ./build_new/../utils/extern-local-engine/Parser/RelParsers/AggregateRelParser.cpp:98: local_engine::AggregateRelParser::parse(std::unique_ptr<DB::QueryPlan, std::default_delete<DB::QueryPlan>>, substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x0000000006df0284
5. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:277: local_engine::SerializedPlanParser::parseOp(substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x0000000006dac92e
6. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:212: local_engine::SerializedPlanParser::parse(substrait::Plan const&) @ 0x0000000006dabe6f
7. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:226: local_engine::SerializedPlanParser::createExecutor(substrait::Plan const&) @ 0x0000000006dad30f
8. ./build_new/../utils/extern-local-engine/local_engine_jni.cpp:270: 

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

@lgbo-ustc lgbo-ustc added bug Something isn't working triage labels Dec 4, 2024
@lgbo-ustc
Contributor Author

The grouping keys are

coalesce(col_0,col_15,right_3032.col_0),
coalesce(col_1,col_16,right_3032.col_1),
coalesce(col_2,col_17,right_3032.col_2),
coalesce(col_3,col_18,all_3033),
coalesce(col_4,col_19,right_3032.col_3),
coalesce(col_5,right_3032.col_4,all_3034),
coalesce(col_20,0_3035),
coalesce(col_21,0_3044),
coalesce(col_6,0_3036),
coalesce(col_7,0_3037),
coalesce(col_22,0_3045),
coalesce(col_24,0_3038),
coalesce(col_13,0_3039),
coalesce(col_23,0_3046),
coalesce(sparkDivide(CAST(coalesce(col_13,0_3051),Float64_3052),CAST(coalesce(col_6,0_3053),Float64_3054)),0_3055),
coalesce(col_8,0_3047),
coalesce(col_9,0_3048),
coalesce(col_10,0_3040),
coalesce(col_14,0_3041),
coalesce(sparkDivide(CAST(coalesce(col_14,0_3056),Float64_3057),CAST(coalesce(col_10,0_3058),Float64_3059)),0_3060),
coalesce(col_11,0_3049),
coalesce(col_12,0_3050),
coalesce(right_3032.col_5,0_3042),
coalesce(right_3032.col_6,0_3043),
coalesce(sparkDivide(CAST(coalesce(right_3032.col_6,0_3061),Float64_3062),CAST(coalesce(right_3032.col_5,0_3063),Float64_3064)),0_3065),
coalesce(col_0,col_15,right_3032.col_0)

The expression coalesce(col_0,col_15,right_3032.col_0) appears twice in the grouping keys (as the first and the last entry), so the header also contains a duplicated column.
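A minimal sketch of how such duplicates could be spotted by name before the aggregate step is built. The helper findDuplicatedNames is hypothetical and is not part of Gluten or ClickHouse; it only assumes that grouping keys are compared by their result column names, as in the list above.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not a Gluten/ClickHouse API): returns every name that
// occurs more than once, each reported a single time.
std::vector<std::string> findDuplicatedNames(const std::vector<std::string> & names)
{
    std::unordered_set<std::string> seen;
    std::unordered_set<std::string> reported;
    std::vector<std::string> duplicates;
    for (const auto & name : names)
    {
        // insert().second is false when the name was already present.
        const bool first_time = seen.insert(name).second;
        if (!first_time && reported.insert(name).second)
            duplicates.push_back(name);
    }
    return duplicates;
}
```

Called with the grouping keys listed above, this would report coalesce(col_0,col_15,right_3032.col_0) once.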

@lgbo-ustc
Contributor Author

Doesn't the following distinct work?

CHHashAggregateExecTransformer(
  requiredChildDistributionExpressions,
  groupingExpressions.distinct,
  aggregateExpressions,
  aggregateAttributes,
  initialInputBufferOffset,
  replacedResultExpressions.distinct,
  child
)

@lgbo-ustc
Contributor Author

lgbo-ustc commented Dec 4, 2024

Some related PRs: #7368, #7101, #5619.

@lgbo-ustc
Contributor Author

lgbo-ustc commented Dec 5, 2024

Problems:

1

0: jdbc:hive2://localhost:10000> explain select days, rtime, uid, owner, day1 from(select day1 as days, rtime, uid, owner, day1 from (select distinct coalesce(day, "today") as day1, rtime, uid, owner from test_7096 where day = '2024-09-01')) group by days, rtime, uid, owner, day1;
+----------------------------------------------------+
|                        plan                        |
+----------------------------------------------------+
| == Physical Plan ==
CHNativeColumnarToRow
+- ^(2) HashAggregateTransformer(keys=[day1#0, rtime#8, uid#9, owner#10], functions=[], isStreamingAgg=false)
   +- ^(2) InputIteratorTransformer[day1#0, rtime#8, uid#9, owner#10]
      +- ColumnarExchange hashpartitioning(day1#0, rtime#8, uid#9, owner#10, day1#0, 5), ENSURE_REQUIREMENTS, [plan_id=155], [shuffle_writer_type=hash], [OUTPUT] ArrayBuffer(day1:StringType, rtime:IntegerType, uid:StringType, owner:StringType)
         +- ^(1) HashAggregateTransformer(keys=[day1#0, rtime#8, uid#9, owner#10], functions=[], isStreamingAgg=false)
            +- ^(1) ProjectExecTransformer [coalesce(day#7, today) AS day1#0, rtime#8, uid#9, owner#10]
               +- ^(1) FilterExecTransformer (isnotnull(day#7) AND (day#7 = 2024-09-01))
                  +- ^(1) NativeScan hive default.test_7096 [day#7, owner#10, rtime#8, uid#9], HiveTableRelation [`default`.`test_7096`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [day#7, rtime#8, uid#9, owner#10], Partition Cols: []]

 |
+----------------------------------------------------+

If we make the columns unique in the grouping keys and aggregate results, there is a mismatch between the aggregate result and the output.
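To picture the mismatch: the query groups by five names (days, rtime, uid, owner, day1), but days and day1 refer to the same column, so deduplication leaves four keys while the output still expects five columns. The sketch below is an illustration only, not the change made in #8164. The struct DedupedKeys and the function dedupKeys are hypothetical names; they show one way to keep a position map so that a deduplicated result could be expanded back to the expected output width.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration (not Gluten/ClickHouse code).
struct DedupedKeys
{
    std::vector<std::string> unique_keys;   // first occurrence of each key, in order
    std::vector<std::size_t> output_to_key; // original position -> index in unique_keys
};

DedupedKeys dedupKeys(const std::vector<std::string> & keys)
{
    DedupedKeys result;
    std::unordered_map<std::string, std::size_t> index_of;
    for (const auto & key : keys)
    {
        // emplace leaves the map unchanged if the key is already present.
        auto res = index_of.emplace(key, result.unique_keys.size());
        if (res.second)
            result.unique_keys.push_back(key);
        result.output_to_key.push_back(res.first->second);
    }
    return result;
}
```

For the keys day1, rtime, uid, owner, day1 this yields four unique keys and a five-entry map whose last entry points back to the first key.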

2

If we remove distinct from the grouping keys and aggregate results, the above query also fails:

Caused by: org.apache.gluten.exception.GlutenException: Missmatch result columns size. plan column size 6, subtrait plan output schema size 5, subtrait plan name size 5.
2024-12-05 10:49:26.655 <Error> SerializedPlanParser: clickhouse plan(0) =>
Expression (Rename Output)
Header: day1#13 String
        rtime#16 Nullable(Int32)
        uid#17 Nullable(String)
        owner#18 Nullable(String)
        day1#13 String
        coalesce(day,today_1) String
Actions: INPUT : 0 -> coalesce(day,today_1) String : 0
         INPUT : 1 -> rtime Nullable(Int32) : 1
         INPUT : 2 -> uid Nullable(String) : 2
         INPUT : 3 -> owner Nullable(String) : 3
         ALIAS coalesce(day,today_1) : 0 -> day1#13 String : 4
         ALIAS coalesce(day,today_1) :: 0 -> day1#13 String : 5
         ALIAS rtime :: 1 -> rtime#16 Nullable(Int32) : 0
         ALIAS uid :: 2 -> uid#17 Nullable(String) : 1
         ALIAS owner :: 3 -> owner#18 Nullable(String) : 2
Positions: 4 0 1 2 5
  StreamingAggregating
  Header: coalesce(day,today_1) String
          rtime Nullable(Int32)
          uid Nullable(String)
          owner Nullable(String)
          coalesce(day,today_1) String
  Keys: coalesce(day,today_1), rtime, uid, owner, coalesce(day,today_1)
    Expression (Project)
    Header: coalesce(day,today_1) String
            rtime Nullable(Int32)
            uid Nullable(String)
            owner Nullable(String)
    Actions: INPUT : 0 -> day String : 0
             INPUT :: 1 -> owner Nullable(String) : 1
             INPUT :: 2 -> rtime Nullable(Int32) : 2
             INPUT :: 3 -> uid Nullable(String) : 3
             COLUMN Const(String) -> today_1 String : 4
             FUNCTION coalesce(day :: 0, today_1 :: 4) -> coalesce(day,today_1) String : 5
             ALIAS coalesce(day,today_1) :: 5 -> coalesce(day,today_1) String : 4
    Positions: 4 2 3 1

This is caused by ActionsDAG::updateHeader. If the input header contains duplicated columns, its result will differ from the expected outputs.
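The column count in the log can be reproduced with a toy model. It is not ClickHouse code. It assumes that each DAG input consumes at most one header column with a matching name and that unconsumed header columns are passed through to the result, which is consistent with the extra coalesce(day,today_1) column at the end of the Rename Output header above. The function resultColumnCount is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model (assumption, not the real ActionsDAG::updateHeader):
// result width = DAG outputs + header columns that no input consumed.
std::size_t resultColumnCount(
    const std::vector<std::string> & header,
    const std::vector<std::string> & dag_inputs,
    std::size_t dag_output_count)
{
    std::vector<bool> consumed(header.size(), false);
    for (const auto & input : dag_inputs)
    {
        // Each input takes the first not-yet-consumed column with its name.
        for (std::size_t i = 0; i < header.size(); ++i)
        {
            if (!consumed[i] && header[i] == input)
            {
                consumed[i] = true;
                break;
            }
        }
    }
    std::size_t passed_through = 0;
    for (bool c : consumed)
        if (!c)
            ++passed_through;
    return dag_output_count + passed_through;
}
```

With the five-column StreamingAggregating header from the log, the four INPUT nodes, and five DAG outputs, the model gives 6 columns, matching "plan column size 6" against an expected schema size of 5.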
