Skip to content

Commit

Permalink
Fix nullable bug of Arrow MapVector in Bridge.cpp (#11214)
Browse files Browse the repository at this point in the history
Summary:
There is a constraint defined in [arrow/format/Schema.fbs](https://github.com/apache/arrow/blob/a153dcb7ae1f61e705472574a5890b33db8cb76a/format/Schema.fbs#L138), which requires the MapVector itself and the key of MapVector to be not nullable.

> /// Neither the "entries" field nor the "key" field may be nullable.

There are also two not-nullable constraints in the Java implementation of MapVector:
1. Map data should be a non-nullable struct type;
https://github.com/apache/arrow/blob/d4516c5386f84619dfdf2a9f72fed6d7df89704c/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java#L98
2. Map data key type should be a non-nullable;
https://github.com/apache/arrow/blob/d4516c5386f84619dfdf2a9f72fed6d7df89704c/java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java#L105

However, When the velox convert the MapVector to Arrow MapVector, velox will set the flag of MapVector to nullable, which violates these constraints: https://github.com/facebookincubator/velox/blob/acd57170b6d98206def1ef02b74c467e3ba18061/velox/vector/arrow/Bridge.cpp#L1388

Pull Request resolved: #11214

Reviewed By: xiaoxmeng

Differential Revision: D64520267

Pulled By: pedroerp

fbshipit-source-id: adcb140e17e422850e3d3fcc61ead31db8f152e6
  • Loading branch information
whutjs authored and facebook-github-bot committed Oct 24, 2024
1 parent c14040f commit c5dc89d
Show file tree
Hide file tree
Showing 2 changed files with 13 additions and 0 deletions.
8 changes: 8 additions & 0 deletions velox/vector/arrow/Bridge.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,10 @@ namespace {
// and one for offsets (2).
static constexpr size_t kMaxBuffers{3};

void clearNullableFlag(int64_t& flags) {
flags = flags & (~ARROW_FLAG_NULLABLE);
}

// Structure that will hold the buffers needed by ArrowArray. This is opaquely
// carried by ArrowArray.private_data
class VeloxToArrowBridgeHolder {
Expand Down Expand Up @@ -1446,6 +1450,10 @@ void exportToArrow(
maps.getNullCount());
exportToArrow(rows, *child, options);
child->name = "entries";
// Map data should be a non-nullable struct type.
clearNullableFlag(child->flags);
// Map data key type should be non-nullable.
clearNullableFlag(child->children[0]->flags);
bridgeHolder->setChildAtIndex(0, std::move(child), arrowSchema);

} else if (type->kind() == TypeKind::ARRAY) {
Expand Down
5 changes: 5 additions & 0 deletions velox/vector/arrow/tests/ArrowBridgeSchemaTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,11 @@ class ArrowBridgeSchemaExportTest : public testing::Test {
EXPECT_STREQ("+m", schema->format);
ASSERT_EQ(schema->n_children, 1);
schema = schema->children[0];
// Map data should be a non-nullable struct type
ASSERT_EQ(schema->flags & ARROW_FLAG_NULLABLE, 0);
ASSERT_EQ(schema->n_children, 2);
// Map data key type should be a non-nullable
ASSERT_EQ(schema->children[0]->flags & ARROW_FLAG_NULLABLE, 0);
} else if (type->kind() == TypeKind::ROW) {
EXPECT_STREQ("+s", schema->format);
}
Expand Down

0 comments on commit c5dc89d

Please sign in to comment.