Question about existence join semantics #3555

marin-ma · 2022-12-20T09:02:15Z

marin-ma
Dec 20, 2022

I'm debugging Spark's unit tests with Gluten + Velox and found a query whose results differ from Spark's.
The query is select * from l where not exists (select * from r where l.a = r.c and l.b < r.d) or not exists (select * from r where l.a = r.c), and every column of both input tables contains nulls.
In Gluten, the final execution plan is like:

== Physical Plan ==
GlutenColumnarToRowExec
+- *(107) ProjectExecTransformer [_1#220 AS a#225, _2#221 AS b#226]
   +- *(107) GlutenFilterExecTransformer (NOT exists#703 OR NOT exists#704)
      +- *(107) GlutenBroadcastHashJoinExecTransformer [_1#220], [c#236], ExistenceJoin(exists#704), BuildRight, false
         :- *(107) GlutenBroadcastHashJoinExecTransformer [_1#220], [c#236], ExistenceJoin(exists#703), BuildRight, (_2#221 < d#237), false
         :  :- GlutenRowToArrowColumnar
         :  :  +- LocalTableScan [_1#220, _2#221]
         :  +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=4382]
         :     +- *(105) ProjectExecTransformer [_1#231 AS c#236, _2#232 AS d#237]
         :        +- GlutenRowToArrowColumnar
         :           +- LocalTableScan [_1#231, _2#232]
         +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=4388]
            +- *(106) ProjectExecTransformer [_1#699 AS c#236]
               +- GlutenRowToArrowColumnar
                  +- LocalTableScan [_1#699, _2#700]

I added this UT to HashJoinTest.cpp to reproduce the query. The plan returns an empty result, so the test fails.

TEST_F(HashJoinTest, subquery) {
  auto l = makeRowVector(
      {"a", "b"},
      {
          makeNullableFlatVector<int32_t>({1, 1, 2, 2, 3, std::nullopt, std::nullopt, 6}),
          makeNullableFlatVector<double>({2.0, 2.0, 1.0, 1.0, 3.0, std::nullopt, 5.0, std::nullopt}),
      });

  auto r = makeRowVector(
      {"c", "d"},
      {
          makeNullableFlatVector<int32_t>({2, 2, 3, 4, std::nullopt, std::nullopt, 6}),
          makeNullableFlatVector<double>({3.0, 3.0, 2.0, 1.0, std::nullopt, 5.0, std::nullopt}),
      });

  createDuckDbTable("l", {l});
  createDuckDbTable("r", {r});

  // Plan two existence joins with a filter.
  auto planNodeIdGenerator = std::make_shared<core::PlanNodeIdGenerator>();
  auto plan = PlanBuilder(planNodeIdGenerator)
                  .values({l})
                  .hashJoin(
                      {"a"},
                      {"c"},
                      PlanBuilder(planNodeIdGenerator).values({r}).planNode(),
                      "b < d",
                      {"a", "b", "match"},
                      core::JoinType::kLeftSemiProject)
                  .project({"a", "b", "match as m1"})
                  .hashJoin(
                      {"a"},
                      {"c"},
                      PlanBuilder(planNodeIdGenerator).values({r}).planNode(),
                      "",
                      {"a", "b", "m1", "match"},
                      core::JoinType::kLeftSemiProject)
                  .filter("not m1 or not match")
                  .project({"a", "b"})
                  .planNode();

  HashJoinBuilder(*pool_, duckDbQueryRunner_, driverExecutor_.get())
      .planNode(std::move(plan))
      .referenceQuery(
          "select * from l where not exists (select * from r where l.a = r.c and l.b < r.d) or not exists (select * from r where l.a = r.c)")
      .checkSpillStats(false)
      .run();
}

The reason is that the "match" column is null when the probe-side join key is null or when the build side contains null keys, which makes the last filter("not m1 or not match") produce an empty output. Extra conditions are needed to pass this test, for example changing the last filter to filter("not m1 or not match or m1 is null or match is null").

There's a fix, #3275, that returns false when the build side is empty. Why does a non-empty build side use different semantics?
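
The empty result described above follows from SQL three-valued logic: NOT NULL is NULL, NULL OR NULL is NULL, and a filter keeps a row only when its predicate is exactly TRUE. Below is a minimal standalone sketch of that evaluation. It is not Velox code; Tri, sqlNot, sqlOr and passesFilter are hypothetical names used only for illustration, with std::nullopt standing in for NULL.

```cpp
#include <optional>

// Hypothetical stand-in for a nullable boolean; std::nullopt means SQL NULL.
using Tri = std::optional<bool>;

// SQL NOT: NULL stays NULL.
Tri sqlNot(Tri x) {
  if (!x) {
    return std::nullopt;
  }
  return !*x;
}

// SQL OR: TRUE on either side wins; otherwise any NULL makes the result NULL.
Tri sqlOr(Tri x, Tri y) {
  if ((x && *x) || (y && *y)) {
    return true;
  }
  if (!x || !y) {
    return std::nullopt;
  }
  return false;
}

// A filter keeps a row only when the predicate is exactly TRUE.
bool passesFilter(Tri predicate) {
  return predicate.value_or(false);
}
```

With both match flags NULL, passesFilter(sqlOr(sqlNot(m1), sqlNot(match))) is false, so the row is dropped. With both flags FALSE, which is what EXISTS semantics would give for a null probe key, the row is kept.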

Answered by mbasmanova

Jan 5, 2023

@marin-ma @rui-mo Happy New Year! PR #3599 adds support for null-aware semi join project with filter. Would you try it out to see whether it works for your use cases?


marin-ma · 2022-12-20T09:02:50Z

marin-ma
Dec 20, 2022
Author

@mbasmanova Could you take a look?


marin-ma · 2022-12-20T09:03:30Z

marin-ma
Dec 20, 2022
Author

CC: @rui-mo


mbasmanova · 2022-12-20T16:32:32Z

mbasmanova
Dec 20, 2022
Collaborator

@rui-mo Rui, this might be a duplicate of #3343. CC: @Yuhta


mbasmanova · 2022-12-20T23:06:25Z

mbasmanova
Dec 20, 2022
Collaborator

The logic in HashProbe::fillOutput for semi join project w/ filter is incorrect. If the build side is not empty, the current logic emits NULL for all rows where the probe-side join key is null. Instead, it needs to apply the filter and return NULL if at least one row passes the filter and false if no row passes. Similarly, when there is no match and the build side has rows with null join keys, we need to evaluate the filter on those null-key rows and return NULL if at least one row passes and false if none pass. This logic is similar to what we do for anti join. We need to refactor and unify the semi join project and anti join code paths to avoid code duplication.

void HashProbe::fillOutput(vector_size_t size) {
...
  if (isLeftSemiProjectJoin(joinType_)) {
    // Populate 'match' column.
    if (emptyBuildSide()) {
      // Build side is empty. All rows should return 'match = false', even ones
      // with a null join key.
      matchColumn() = BaseVector::createConstant(false, size, pool());
    } else {
      auto flatMatch = matchColumn()->as<FlatVector<bool>>();
      flatMatch->resize(size);
      auto rawValues = flatMatch->mutableRawValues<uint64_t>();
      for (auto i = 0; i < size; ++i) {
        if (!nonNullInputRows_.isValid(i)) {
          flatMatch->setNull(i, true);
        } else {
          bool hasMatch = outputTableRows_[i] != nullptr;
          if (!hasMatch && buildSideHasNullKeys_) {
            flatMatch->setNull(i, true);
          } else {
            bits::setBit(rawValues, i, hasMatch);
          }
        }
      }
    }
  }


mbasmanova · 2022-12-21T00:08:27Z

mbasmanova
Dec 21, 2022
Collaborator

The semantics of IN and EXISTS subqueries are different, hence, we need 2 types of semi join project: null-aware (IN) and regular (EXISTS).

Null-aware (IN) left semi join project w/o extra filter:

SELECT * FROM t where t0 IN (SELECT u0 FROM u)

null in the probe key: match is FALSE if build side is empty; match is NULL otherwise
non-null in the probe key: match is TRUE if build has matching row; match is NULL if build side has no matching row, but has rows with NULL join keys; match is FALSE otherwise.

Null-aware (IN) left semi join project w/ extra filter:

SELECT * FROM t where t0 IN (SELECT u0 FROM u WHERE <filter that uses columns from both t and u>)

null in the probe key: match is FALSE if build side is empty or no row passes the extra filter; match is NULL otherwise
non-null in the probe key: match is TRUE if build has matching row and extra filter passes; match is NULL if build side has no matching row, but has rows with NULL join keys and at least one of these rows passes the extra filter; match is FALSE otherwise.

Regular (EXISTS) left semi join project w/o extra filter:

SELECT * FROM t where EXISTS (SELECT * FROM u WHERE u0 = t0)

null in the probe key: match is FALSE
non-null in the probe key: match is TRUE if build has matching row; match is FALSE otherwise

Regular (EXISTS) left semi join project w/ extra filter:

SELECT * FROM t where EXISTS (SELECT * FROM u WHERE u0 = t0 AND <filter that uses columns from both t and u>)

null in the probe key: match is FALSE
non-null in the probe key: match is TRUE if build has matching row and extra filter passes; match is FALSE otherwise

The value of 'match' column in the output of regular semi join is either TRUE or FALSE, but never NULL. The value of 'match' column in the output of null-aware semi join can be TRUE, FALSE, or NULL.

When evaluating regular semi join, build side can ignore rows with null join keys.
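
The four cases above can be condensed into one small function. This is a standalone sketch, not the HashProbe implementation; BuildRow, Filter and semiProjectMatch are hypothetical names. It assumes the extra filter is already bound to the current probe row and yields a plain boolean, and it reads "matching row" in the filtered cases as "a row with an equal key that also passes the filter".

```cpp
#include <functional>
#include <optional>
#include <vector>

using Key = std::optional<int>;   // std::nullopt is a NULL join key
using Tri = std::optional<bool>;  // std::nullopt is a NULL 'match' value

// Hypothetical build-side row: a join key plus one payload column.
struct BuildRow {
  Key key;
  int value;
};

// Hypothetical extra filter, already bound to the current probe row.
using Filter = std::function<bool(const BuildRow&)>;

Tri semiProjectMatch(
    Key probeKey,
    const std::vector<BuildRow>& build,
    bool nullAware,
    const Filter& filter = [](const BuildRow&) { return true; }) {
  bool matched = false;        // equal non-null keys and the filter passes
  bool anyPasses = false;      // some build row passes the filter
  bool nullKeyPasses = false;  // some NULL-key build row passes the filter
  for (const auto& row : build) {
    if (!filter(row)) {
      continue;
    }
    anyPasses = true;
    if (!row.key) {
      nullKeyPasses = true;
    } else if (probeKey && *row.key == *probeKey) {
      matched = true;
    }
  }
  if (!nullAware) {
    // Regular (EXISTS): TRUE or FALSE, never NULL. NULL keys never match.
    return matched;
  }
  if (!probeKey) {
    // Null-aware (IN), NULL probe key: FALSE only if nothing passes.
    if (anyPasses) {
      return std::nullopt;
    }
    return false;
  }
  if (matched) {
    return true;
  }
  if (nullKeyPasses) {
    return std::nullopt;
  }
  return false;
}
```

For a build side holding keys {1, NULL}, probe key 2 yields NULL with nullAware = true and FALSE with nullAware = false, which is the difference that makes the NOT EXISTS query in this thread return no rows when the null-aware variant is used.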

LeftSemiProjectJoin(a, b) = RightSemiProjectJoin(b, a)
LeftSemiFilterJoin(a, b) = LeftSemiProjectJoin(a, b) -> Filter(match)
AntiJoin(a, b) = LeftSemiProjectJoin(a, b) -> Filter(NOT match)
NullAwareAntiJoin(a, b) = LeftNullAwareSemiProjectJoin(a, b) -> Filter(NOT match)
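
The identities above treat the semi-filter and anti joins as a plain Filter over the 'match' column. A minimal sketch of that reading follows, assuming a filter keeps a row only when its predicate is TRUE; semiFilter and antiFilter are hypothetical helper names, not Velox functions.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Tri = std::optional<bool>;  // std::nullopt is a NULL 'match' value

// Filter(match): indices of rows whose match is exactly TRUE.
std::vector<std::size_t> semiFilter(const std::vector<Tri>& match) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < match.size(); ++i) {
    if (match[i].has_value() && *match[i]) {
      kept.push_back(i);
    }
  }
  return kept;
}

// Filter(NOT match): indices of rows whose match is exactly FALSE.
// NOT NULL is NULL, so rows with a NULL match are dropped here too.
std::vector<std::size_t> antiFilter(const std::vector<Tri>& match) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < match.size(); ++i) {
    if (match[i].has_value() && !*match[i]) {
      kept.push_back(i);
    }
  }
  return kept;
}
```

For match = {TRUE, FALSE, NULL}, semiFilter keeps row 0 and antiFilter keeps row 1; row 2 appears in neither. With the regular (EXISTS) variant 'match' is never NULL, so the two filters partition the probe rows.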

1 reply

rui-mo Dec 21, 2022
Collaborator

Thank you Masha, this explanation is very clear. Is it written down anywhere, such as in the docs or code comments?

marin-ma · 2022-12-21T05:01:38Z

marin-ma
Dec 21, 2022
Author

Thank you @mbasmanova. Based on your explanation, I think the semantics of Spark's ExistenceJoin should match Velox's regular semi join. As a workaround, we added a projection in Gluten that maps null to false; it can be removed once regular semi join is supported in Velox.
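
The null-to-false workaround amounts to a coalesce over the 'match' column before the final filter. Here is a standalone sketch of that idea, assuming std::nullopt represents NULL; nullToFalse is a hypothetical helper, and how the projection is actually expressed in Gluten is not shown in this thread.

```cpp
#include <optional>
#include <vector>

// Map NULL 'match' values to FALSE so the column behaves like the output of
// a regular (EXISTS) semi join project: TRUE or FALSE, never NULL.
std::vector<bool> nullToFalse(const std::vector<std::optional<bool>>& match) {
  std::vector<bool> result;
  result.reserve(match.size());
  for (const auto& m : match) {
    result.push_back(m.value_or(false));
  }
  return result;
}
```

After this mapping, "not m1 or not match" is ordinary two-valued logic, so a row whose flags were both NULL now passes the filter.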

I also ran into another query with mismatched results, which should be an AntiJoin w/ filter. Is this the same bug as #3343?
SQL: select * from l where a not in (select c from r where d = b + 10.0)
Although the query has a NOT IN clause, the join key is d = b + 10.0 rather than a = c, so it's planned as an AntiJoin.

== Physical Plan ==
GlutenColumnarToRowExec
+- *(6) ProjectExecTransformer [_1#220 AS a#225, _2#221 AS b#226]
   +- *(6) GlutenBroadcastHashJoinExecTransformer [knownfloatingpointnormalized(normalizenanandzero((_2#221 + 10.0)))], [knownfloatingpointnormalized(normalizenanandzero(d#237))], LeftAnti, BuildRight, ((_1#220 = c#236) OR isnull((_1#220 = c#236))), false
      :- GlutenRowToArrowColumnar
      :  +- LocalTableScan [_1#220, _2#221]
      +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(knownfloatingpointnormalized(normalizenanandzero(input[1, double, true]))),false), [plan_id=282]
         +- *(5) ProjectExecTransformer [_1#231 AS c#236, _2#232 AS d#237]
            +- GlutenRowToArrowColumnar
               +- LocalTableScan [_1#231, _2#232]

UT to reproduce:

TEST_F(HashJoinTest, subqueryAnti) {
  auto l = makeRowVector(
      {"a", "b"},
      {
          makeNullableFlatVector<int32_t>({1, 1, 2, 2, 3, std::nullopt, std::nullopt, 6}),
          makeNullableFlatVector<double>({2.0, 2.0, 1.0, 1.0, 3.0, std::nullopt, 5.0, std::nullopt}),
      });

  auto r = makeRowVector(
      {"c", "d"},
      {
          makeNullableFlatVector<int32_t>({2, 2, 3, 4, std::nullopt, std::nullopt, 6}),
          makeNullableFlatVector<double>({3.0, 3.0, 2.0, 1.0, std::nullopt, 5.0, std::nullopt}),
      });

  createDuckDbTable("l", {l});
  createDuckDbTable("r", {r});

  // Plan anti join with a filter.
  auto planNodeIdGenerator = std::make_shared<core::PlanNodeIdGenerator>();
  auto plan = PlanBuilder(planNodeIdGenerator)
      .values({l})
      .project({"a", "b", "b + 10.0 as b1"})
      .hashJoin(
          {"b1"},
          {"d"},
          PlanBuilder(planNodeIdGenerator).values({r}).planNode(),
          "a = c or (a = c) is null",
          {"a", "b"},
          core::JoinType::kAnti)
      .planNode();

  HashJoinBuilder(*pool_, duckDbQueryRunner_, driverExecutor_.get())
      .planNode(std::move(plan))
      .referenceQuery(
          "select * from l where a not in (select c from r where d = b + 10.0)")
      .checkSpillStats(false)
      .run();
}

The plan outputs only the 2 rows whose join key is null and omits the other rows.
Output:

DuckDB query: select * from l where a not in (select c from r where d = b + 10.0)
Google Test trace:
../velox/exec/tests/HashJoinTest.cpp:607: With Max Spill Level: 0
../velox/exec/tests/utils/QueryAssertions.cpp:734: Failure
Value of: false
  Actual: false
Expected: true
Expected 8, got 2
0 extra rows, 6 missing rows
0 of extra rows:

6 of missing rows:
	null | 5
	1 | 2
	1 | 2
	2 | 1
	2 | 1
	3 | 3


mbasmanova · 2022-12-21T07:49:17Z

mbasmanova
Dec 21, 2022
Collaborator

@marin-ma For the anti join, NOT IN would map to kNullAwareAnti join type, not kAnti. See https://facebookincubator.github.io/velox/develop/anti-join.html#null-aware-anti-join

Would you try to use kNullAwareAnti join type and let us know if you still see incorrect results?

3 replies

marin-ma Dec 21, 2022
Author

I changed the join type to kNullAwareAnti in the above test case. The plan returned an empty result and the test threw this exception:

Testing started at 3:57 PM ...
unknown file: Failure
C++ exception with description "Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (0 vs. 2) Wrong number of columns
Retriable: False
Expression: rowType->size() == dataChunk->GetTypes().size()
Function: materialize
File: ../velox/exec/tests/utils/QueryAssertions.cpp
E1221 07:41:13.124799 1241761 Exceptions.h:68] Line: ../velox/exec/tests/utils/QueryAssertions.cpp:362, Function:materialize, Expression: rowType->size() == dataChunk->GetTypes().size() (0 vs. 2) Wrong number of columns, Source: RUNTIME, ErrorCode: INVALID_STATE
Line: 362
Stack trace:
# 0 through # 26 (frame contents not captured)
" thrown in the test body.

rui-mo Dec 21, 2022
Collaborator

Spark plans a BHJ with isNullAwareAntiJoin == true only when the conditions below are met.

  if (isNullAwareAntiJoin) {
    require(leftKeys.length == 1, "leftKeys length should be 1")
    require(rightKeys.length == 1, "rightKeys length should be 1")
    require(joinType == LeftAnti, "joinType must be LeftAnti.")
    require(buildSide == BuildRight, "buildSide must be BuildRight.")
    require(condition.isEmpty, "null aware anti join optimize condition should be empty.")
  }

@marin-ma In the test you mentioned, there is some filter, and that's why we got a BHJ with isNullAwareAntiJoin == false. In this case, Velox kAnti was used.

marin-ma Dec 21, 2022
Author

The execution plan is generated by Spark, so the join type is not our decision. According to Spark's execution plan, the join type is LeftAnti, the join key is d = b + 10.0, and the join filter is (a = c or (a = c) is null). So I tried the DuckDB referenceQuery select * from l where not exists (select c from r where d = b + 10.0 and (a = c or (a = c) is null)), and the output is the same as for the original SQL.
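
The equivalence between the NOT IN query and the NOT EXISTS rewrite can be checked row by row. Below is a standalone sketch under standard SQL null semantics; LRow, RRow, keptByNotIn and keptByNotExists are hypothetical names, and this is not how Velox or Spark evaluates the join.

```cpp
#include <optional>
#include <vector>

struct LRow {
  std::optional<int> a;
  std::optional<double> b;
};

struct RRow {
  std::optional<int> c;
  std::optional<double> d;
};

// True when the correlated predicate d = b + 10.0 evaluates to TRUE.
bool keyMatches(const LRow& l, const RRow& r) {
  return l.b && r.d && *r.d == *l.b + 10.0;
}

// a NOT IN (select c from r where d = b + 10.0): keep the row only when the
// predicate is TRUE, not when it is FALSE or NULL.
bool keptByNotIn(const LRow& l, const std::vector<RRow>& r) {
  bool subqueryEmpty = true;
  bool sawNullC = false;
  for (const auto& row : r) {
    if (!keyMatches(l, row)) {
      continue;
    }
    subqueryEmpty = false;
    if (!row.c) {
      sawNullC = true;
    } else if (l.a && *row.c == *l.a) {
      return false;  // a IN (...) is TRUE
    }
  }
  if (subqueryEmpty) {
    return true;  // NOT IN over an empty set is TRUE, even for NULL a
  }
  return l.a.has_value() && !sawNullC;  // otherwise NULL means dropped
}

// NOT EXISTS (select c from r where d = b + 10.0
//             and (a = c or (a = c) is null)).
bool keptByNotExists(const LRow& l, const std::vector<RRow>& r) {
  for (const auto& row : r) {
    if (!keyMatches(l, row)) {
      continue;
    }
    bool equalityIsNull = !l.a || !row.c;
    if (equalityIsNull || *l.a == *row.c) {
      return false;
    }
  }
  return true;
}
```

On the l and r data from the test above, no d equals any b + 10.0, so both predicates keep all 8 rows, which agrees with the "Expected 8" in the DuckDB comparison.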

mbasmanova · 2022-12-21T17:59:25Z

mbasmanova
Dec 21, 2022
Collaborator

@marin-ma @rui-mo Rong, Rui, I do see a bug in anti join with filter. Here is a fix: #3571.

1 reply

marin-ma Dec 22, 2022
Author

This fixed our case. Thank you!

mbasmanova · 2022-12-22T19:57:29Z

mbasmanova
Dec 22, 2022
Collaborator

Here is a side-by-side comparison of null-aware (IN) and regular (EXISTS) semantics.


mbasmanova · 2022-12-22T23:57:20Z

mbasmanova
Dec 22, 2022
Collaborator

@marin-ma @rui-mo I have a draft PR for adding support for null-aware semi join project w/ filter: #3599

I'll be out till the end of the year. Will pick up this work when I'm back.

1 reply

rui-mo Dec 23, 2022
Collaborator

Masha, thank you! Merry Christmas and happy new year!

mbasmanova · 2023-01-05T00:47:06Z

mbasmanova
Jan 5, 2023
Collaborator

@marin-ma @rui-mo Happy New Year! PR #3599 adds support for null-aware semi join project with filter. Would you try it out to see whether it works for your use cases?

1 reply

marin-ma Jan 5, 2023
Author

Thank you, Masha. Happy New Year! This fixes our use case.
