Nightly Join fuzzer consistent failure 'duckdb::InternalException' values[i].type() == values[0].type() #7943

Closed
laithsakka opened this issue Dec 8, 2023 · 13 comments
Labels: bug (Something isn't working), fuzzer (Issues related to the Velox fuzzer test components)

Comments

@laithsakka
Contributor

Description

-- HashJoin[ANTI t2=u2 AND t0=u0 AND t1=u1] -> tp5:ARRAY<BOOLEAN>, t0:BOOLEAN, t2:BIGINT
  -- Values[500 rows in 5 vectors] -> t0:BOOLEAN, t1:BIGINT, t2:BIGINT, tp3:MAP<INTEGER,DATE>, tp4:ARRAY<SMALLINT>, tp5:ARRAY<BOOLEAN>
  -- Values[0 rows in 1 vectors] -> u0:BOOLEAN, u1:BIGINT, u2:BIGINT
I1207 03:43:03.596397 33595 Task.cpp:1062] All drivers (2) finished for task test_cursor 1051 after running for 2 ms.
I1207 03:43:03.596410 33595 Task.cpp:1746] Terminating task test_cursor 1051 with state Finished after running for 2 ms.
I1207 03:43:03.596906 28229 JoinFuzzer.cpp:299] Results: [ROW ROW<tp5:ARRAY<BOOLEAN>,t0:BOOLEAN,t2:BIGINT>: 500 elements, no nulls]
terminate called after throwing an instance of 'duckdb::InternalException'
  what():  INTERNAL Error: Assertion triggered in file "/home/runner/work/velox/velox/_build/debug/_deps/duckdb-src/src/common/types/value.cpp" on line 702: values[i].type() == values[0].type()
*** Aborted at 1701920583 (Unix time, try 'date -d @1701920583') ***
*** Signal 6 (SIGABRT) (0x3e800006e45) received by PID 28229 (pthread TID 0x7fea659cbbc0) (linux TID 28229) (maybe from PID 28229, UID 1000) (code: -6), stack trace: ***

Error Reproduction

https://github.com/facebookincubator/velox/actions/runs/7123232991/job/19395529656

Relevant logs

No response

@laithsakka added the bug, fuzzer, and fuzzer-found labels Dec 8, 2023
@mbasmanova
Contributor

CC: @majetideepak @pedroerp

@mbasmanova
Contributor

Tried to reproduce using seed on my Mac. The query plan looks the same, but it didn't fail.

I20231209 05:21:00.261693 7814016 JoinFuzzer.cpp:867] ==============================> Started iteration 0 (seed: 3802709900)
I20231209 05:21:00.272934 7814016 JoinFuzzer.cpp:281] Executing query plan: 
-- HashJoin[ANTI t0=u0 AND t1=u1 AND t2=u2] -> t0:BOOLEAN, tp3:MAP<DATE,INTEGER>
  -- Values[500 rows in 5 vectors] -> t0:BOOLEAN, t1:BIGINT, t2:BIGINT, tp3:MAP<DATE,INTEGER>, tp4:ARRAY<SMALLINT>, tp5:ARRAY<BOOLEAN>
  -- Values[0 rows in 1 vectors] -> u0:BOOLEAN, u1:BIGINT, u2:BIGINT
I20231209 05:21:00.278057 7814025 Task.cpp:1062] All drivers (2) finished for task test_cursor 1 after running for 4 ms.
I20231209 05:21:00.278100 7814025 Task.cpp:1746] Terminating task test_cursor 1 with state Finished after running for 4 ms.
I20231209 05:21:00.279109 7814016 JoinFuzzer.cpp:299] Results: [ROW ROW<t0:BOOLEAN,tp3:MAP<DATE,INTEGER>>: 500 elements, no nulls]

@mbasmanova
Contributor

Also tried on a Linux box (devserver). Still no error. Wondering if the DuckDB version installed in CI is somehow different. CC: @kgpai

I1209 02:22:57.681602 2065943 JoinFuzzer.cpp:867] ==============================> Started iteration 0 (seed: 3802709900)
I1209 02:22:57.717593 2065943 JoinFuzzer.cpp:281] Executing query plan:
-- HashJoin[ANTI t2=u2 AND t0=u0 AND t1=u1] -> tp5:ARRAY<BOOLEAN>, t0:BOOLEAN, t2:BIGINT
  -- Values[500 rows in 5 vectors] -> t0:BOOLEAN, t1:BIGINT, t2:BIGINT, tp3:MAP<DATE,INTEGER>, tp4:ARRAY<SMALLINT>, tp5:ARRAY<BOOLEAN>
  -- Values[0 rows in 1 vectors] -> u0:BOOLEAN, u1:BIGINT, u2:BIGINT
I1209 02:22:57.733266 2146540 Task.cpp:1062] All drivers (2) finished for task test_cursor 1 after running for 13 ms.
I1209 02:22:57.733325 2146540 Task.cpp:1746] Terminating task test_cursor 1 with state Finished after running for 13 ms.
I1209 02:22:57.736086 2065943 JoinFuzzer.cpp:299] Results: [ROW ROW<tp5:ARRAY<BOOLEAN>,t0:BOOLEAN,t2:BIGINT>: 500 elements, no nulls]

@mbasmanova
Contributor

Yesterday's failure:

https://github.com/facebookincubator/velox/actions/runs/7456109693/job/20287244934

I0109 04:09:26.973651  5227 JoinFuzzer.cpp:867] ==============================> Started iteration 64 (seed: 2398499686)
I0109 04:09:26.975841  5227 JoinFuzzer.cpp:281] Executing query plan: 
-- HashJoin[LEFT t0=u0 AND t1=u1] -> u0:DOUBLE, u1:BIGINT, t1:BIGINT, tp2:ARRAY<BOOLEAN>
  -- Values[500 rows in 5 vectors] -> t0:DOUBLE, t1:BIGINT, tp2:ARRAY<BOOLEAN>
  -- Values[55 rows in 5 vectors] -> u0:DOUBLE, u1:BIGINT, bp2:ARRAY<BOOLEAN>
I0109 04:09:26.978204  8093 Task.cpp:1082] All drivers (2) finished for task test_cursor 904 after running for 2 ms.
I0109 04:09:26.978214  8093 Task.cpp:1766] Terminating task test_cursor 904 with state Finished after running for 2 ms.
I0109 04:09:26.978935  5227 JoinFuzzer.cpp:299] Results: [ROW ROW<u0:DOUBLE,u1:BIGINT,t1:BIGINT,tp2:ARRAY<BOOLEAN>>: 503 elements, no nulls]
terminate called after throwing an instance of 'duckdb::InternalException'
  what():  INTERNAL Error: Assertion triggered in file "/home/runner/work/velox/velox/velox/_build/debug/_deps/duckdb-src/src/common/types/value.cpp" on line 702: values[i].type() == values[0].type()
*** Aborted at 1704773366 (Unix time, try 'date -d @1704773366') ***
*** Signal 6 (SIGABRT) (0x3e90000146b) received by PID 5227 (pthread TID 0x7f00686932c0) (linux TID 5227) (maybe from PID 5227, UID 1001) (code: -6), stack trace: ***

@mbasmanova
Contributor

mbasmanova commented Jan 10, 2024

The error happens while creating a DuckDB table with data that includes an ARRAY(BOOLEAN).

  ::duckdb::vector<::duckdb::Value> array;
  array.reserve(size);
  for (auto i = 0; i < size; i++) {
    auto innerRow = offset + i;
    if (elements->isNullAt(innerRow)) {
      array.emplace_back(
          ::duckdb::Value(duckdb::fromVeloxType(elements->type())));
    } else {
      array.emplace_back(VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH(
          duckValueAt, elements->typeKind(), elements, innerRow));
    }
  }

  return ::duckdb::Value::LIST(array); // The error happens here. 

@mbasmanova
Contributor

template <TypeKind kind>
::duckdb::Value duckValueAt(const VectorPtr& vector, vector_size_t index) {
  using T = typename KindToFlatVector<kind>::WrapperType;
  return ::duckdb::Value(vector->as<SimpleVector<T>>()->valueAt(index));
}

I do not see a duckdb::Value(bool) constructor and wonder if we need to use duckdb::Value::BOOLEAN(true|false) instead.

@mbasmanova
Contributor

@kgpai Krishna is adding Join Fuzzer to experimental jobs in #8319. This should then make it possible to experiment with this failure by running ad hoc jobs from https://github.com/facebookincubator/velox/actions/workflows/experimental.yml

@mbasmanova
Contributor

Instructions to reproduce CI failures using Docker: #8371

@mbasmanova
Contributor

mbasmanova commented Jan 17, 2024

The DuckDB failure here comes from a debug-only check. Wondering if there is any particular reason for building DuckDB in debug mode. CC: @kgpai

@mbasmanova
Contributor

mbasmanova commented Jan 17, 2024

The error comes from a debug check in DuckDB: src/common/types/value.cpp

Value Value::LIST(vector<Value> values) {
	if (values.empty()) {
		throw InternalException("Value::LIST without providing a child-type requires a non-empty list of values. Use "
		                        "Value::LIST(child_type, list) instead.");
	}
#ifdef DEBUG
	for (idx_t i = 1; i < values.size(); i++) {
		D_ASSERT(values[i].type() == values[0].type());
	}
#endif
	Value result;
	result.type_ = LogicalType::LIST(values[0].type());
	result.value_info_ = make_shared<NestedValueInfo>(std::move(values));
	result.is_null = false;
	return result;
}
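A minimal sketch of why such a check can fire in one build and stay silent in another, assuming the only difference is whether DEBUG is defined at compile time. The names listChildType and kDebugChecks are hypothetical and are not part of DuckDB; type names are plain strings here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical flag mirroring whether the build defines DEBUG.
#ifdef DEBUG
constexpr bool kDebugChecks = true;
#else
constexpr bool kDebugChecks = false;
#endif

// Hypothetical stand-in for a list factory: the child type is taken from the
// first element, and the homogeneity check exists only in DEBUG builds.
std::string listChildType(const std::vector<std::string>& types) {
  if (types.empty()) {
    throw std::invalid_argument("a non-empty list of values is required");
  }
#ifdef DEBUG
  for (std::size_t i = 1; i < types.size(); i++) {
    if (types[i] != types[0]) {
      throw std::runtime_error("values[i].type() == values[0].type()");
    }
  }
#endif
  return types[0];
}
```

Compiled without -DDEBUG, a mixed list such as {"BOOLEAN", "INTEGER"} is accepted and the first element's type is used. Compiled with -DDEBUG, the same call throws.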

The check failure is caused by duckValueAt<TypeKind::ARRAY> in QueryAssertions.cpp calling duckdb::Value::LIST with a vector of values of different types.

template <>
::duckdb::Value duckValueAt<TypeKind::ARRAY>(
    const VectorPtr& vector,
    int32_t row) {
  auto arrayVector = vector->wrappedVector()->as<ArrayVector>();
  auto arrayRow = vector->wrappedIndex(row);
  auto& elements = arrayVector->elements();
  auto offset = arrayVector->offsetAt(arrayRow);
  auto size = arrayVector->sizeAt(arrayRow);

  if (size == 0) {
    return ::duckdb::Value::EMPTYLIST(duckdb::fromVeloxType(elements->type()));
  }

  ::duckdb::vector<::duckdb::Value> array;
  array.reserve(size);
  for (auto i = 0; i < size; i++) {
    auto innerRow = offset + i;
    if (elements->isNullAt(innerRow)) {
      array.emplace_back(
          ::duckdb::Value(duckdb::fromVeloxType(elements->type())));
    } else {
      array.emplace_back(VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH(
          duckValueAt, elements->typeKind(), elements, innerRow));
    }
  }

  return ::duckdb::Value::LIST(array);
}

The duckValueAt template, when called with BOOLEAN, TINYINT, or SMALLINT values, ends up creating a value of type INTEGER. It uses the duckdb::Value constructor, which is not defined for bool, int8_t, and int16_t, but is defined for int32_t and other types. Hence, the compiler silently promotes bool, int8_t, and int16_t to int32_t and calls duckdb::Value(int32_t), which returns a value of type INTEGER.

template <TypeKind kind>
::duckdb::Value duckValueAt(const VectorPtr& vector, vector_size_t index) {
  using T = typename KindToFlatVector<kind>::WrapperType;
  return ::duckdb::Value(vector->as<SimpleVector<T>>()->valueAt(index));
}
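The silent promotion can be reproduced without DuckDB. The sketch below uses a hypothetical MockValue class that, like the API described in this comment, has constructors for int32_t, int64_t, and double but none for bool, int8_t, or int16_t. It is an illustration of C++ overload resolution only, not DuckDB code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a value class with no bool, int8_t, or int16_t
// constructors. Each constructor records the logical type it would produce.
struct MockValue {
  std::string type;
  MockValue(int32_t) : type("INTEGER") {}
  MockValue(int64_t) : type("BIGINT") {}
  MockValue(double) : type("DOUBLE") {}
};

// Reports which constructor overload resolution selects for an argument of
// type T.
template <typename T>
std::string constructedType(T value) {
  return MockValue(value).type;
}
```

On platforms where int32_t is int, integral promotion ranks above the other conversions, so constructedType(true), constructedType(int8_t{1}), and constructedType(int16_t{1}) all select the int32_t constructor and report INTEGER.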

However, null values are converted using ::duckdb::Value(duckdb::fromVeloxType(elements->type())), which takes the type explicitly. Hence, when there is a mix of null and non-null values of type BOOLEAN, TINYINT, or SMALLINT, we end up with a list of values of different types and hit the debug check in DuckDB.
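A small model of that mix, assuming the two conversion paths behave as described above: nulls carry the declared element type, and non-null booleans come out as INTEGER. MockValue, typedNull, fromInt32, convertBooleanElements, and sameTypes are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a typed value that may be null.
struct MockValue {
  std::string type;
  bool isNull;
};

// Null path: the declared element type is passed explicitly.
MockValue typedNull(const std::string& type) { return {type, true}; }

// Non-null path: the only available factory takes int32_t, so the result is
// tagged INTEGER regardless of the source type.
MockValue fromInt32(int32_t) { return {"INTEGER", false}; }

// Converts BOOLEAN elements the way the generic path is described to behave.
std::vector<MockValue> convertBooleanElements(
    const std::vector<std::optional<bool>>& elements) {
  std::vector<MockValue> result;
  result.reserve(elements.size());
  for (const auto& element : elements) {
    if (!element.has_value()) {
      result.push_back(typedNull("BOOLEAN"));
    } else {
      result.push_back(fromInt32(*element)); // bool promoted to int32_t
    }
  }
  return result;
}

// The homogeneity condition that the debug check enforces.
bool sameTypes(const std::vector<MockValue>& values) {
  for (std::size_t i = 1; i < values.size(); i++) {
    if (values[i].type != values[0].type) {
      return false;
    }
  }
  return true;
}
```

In this model an all-non-null array passes the check (every element is INTEGER, which is still the wrong type), an all-null array passes (every element is BOOLEAN), and only a mixed array fails.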

A fix could be to provide explicit specializations for these types:

template <>
::duckdb::Value duckValueAt<TypeKind::TINYINT>(
    const VectorPtr& vector,
    vector_size_t index) {
  return ::duckdb::Value::TINYINT(
      vector->as<SimpleVector<int8_t>>()->valueAt(index));
}

template <>
::duckdb::Value duckValueAt<TypeKind::SMALLINT>(
    const VectorPtr& vector,
    vector_size_t index) {
  return ::duckdb::Value::SMALLINT(
      vector->as<SimpleVector<int16_t>>()->valueAt(index));
}

template <>
::duckdb::Value duckValueAt<TypeKind::BOOLEAN>(
    const VectorPtr& vector,
    vector_size_t index) {
  return ::duckdb::Value::BOOLEAN(
      vector->as<SimpleVector<bool>>()->valueAt(index));
}

or use a version of duckdb::Value::LIST that takes an additional 'type' parameter and casts values to that type. I verified that this option allows the join fuzzer in CI to run successfully.

::duckdb::Value::LIST(duckdb::fromVeloxType(elements->type()), array);
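To illustrate why passing the child type removes the mismatch, here is a sketch of a list factory that casts every element to the requested type before building the list, which is the behavior this comment attributes to the two-argument LIST. MockValue, castTo, makeList, and sameTypes are hypothetical and only model that behavior; they are not the DuckDB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a typed value with an integer payload.
struct MockValue {
  std::string type;
  bool isNull;
  int64_t payload;
};

// Hypothetical cast: keeps the payload and nullness, changes the type tag.
MockValue castTo(const MockValue& value, const std::string& type) {
  return {type, value.isNull, value.payload};
}

// Casts every element to childType so the resulting list is homogeneous.
std::vector<MockValue> makeList(
    const std::string& childType,
    std::vector<MockValue> values) {
  for (auto& value : values) {
    value = castTo(value, childType);
  }
  return values;
}

// True when all elements share the type of the first element.
bool sameTypes(const std::vector<MockValue>& values) {
  for (std::size_t i = 1; i < values.size(); i++) {
    if (values[i].type != values[0].type) {
      return false;
    }
  }
  return true;
}
```

With this shape, a mixed input such as an INTEGER-tagged true next to a BOOLEAN-tagged null comes out as two BOOLEAN elements, so the homogeneity check passes regardless of how each element was constructed.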

@mbasmanova
Contributor

I believe I tracked this issue down to #6725, which introduced it: https://github.com/facebookincubator/velox/pull/6725/files#r1455417084

CC: @majetideepak

@majetideepak
Collaborator

@mbasmanova thanks for investigating this tricky issue.
@kgpai, we should include DuckDB in the system since we are building a new image here #8270
