allow map key to be optional
taiyang-li authored and Avogar committed Mar 31, 2023
1 parent 9ef654d commit c8cfbe2
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions cpp/src/parquet/arrow/schema.cc
@@ -558,10 +558,23 @@ Status MapToSchemaField(const GroupNode& group, LevelInfo current_levels,
return Status::Invalid("Key-value map node must have 1 or 2 child elements. Found: ",
key_value.field_count());
}

/*
* If the Parquet file was written by Flink, the key type of a map column may be optional, like this:
* optional group event_info (MAP) {
* repeated group key_value {
* optional binary key (UTF8);
* optional binary value (UTF8);
* }
* }
* Refer to: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/types/#constructured-data-types
* The required-key check below is disabled to allow this:
const Node& key_node = *key_value.field(0);
if (!key_node.is_required()) {
return Status::Invalid("Map keys must be annotated as required.");
}
*/

// Arrow doesn't support 1 column maps (i.e. Sets). The options are to either
// make the values column nullable, or process the map as a list. We choose the latter
// as it is simpler.
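
The effect of commenting out the key check can be sketched in isolation. The following is a minimal, self-contained model and not the actual Arrow code: `KeyNode`, `Repetition`, `CheckResult`, `CheckMapKeyStrict` and `CheckMapKeyLenient` are hypothetical names introduced only for illustration. Only the error string is taken from the diff above.

```cpp
#include <string>

// Hypothetical stand-ins for the Parquet schema types. These are NOT the
// real parquet::schema::Node API; they model just enough to show the check.
enum class Repetition { kRequired, kOptional, kRepeated };

struct KeyNode {
  std::string name;
  Repetition repetition;
  bool is_required() const { return repetition == Repetition::kRequired; }
};

// Simplified result type (assumption; the real code returns arrow::Status).
struct CheckResult {
  bool ok;
  std::string message;
};

// Behaviour before the commit: a map key that is not `required` is rejected.
inline CheckResult CheckMapKeyStrict(const KeyNode& key) {
  if (!key.is_required()) {
    return {false, "Map keys must be annotated as required."};
  }
  return {true, ""};
}

// Behaviour after the commit: the check is disabled, so the key's
// repetition is no longer inspected at this point.
inline CheckResult CheckMapKeyLenient(const KeyNode& /*key*/) {
  return {true, ""};
}
```

With this model, an `optional binary key (UTF8)` field like the one in the Flink schema fails `CheckMapKeyStrict` and passes `CheckMapKeyLenient`. Note that disabling the check outright also stops rejecting other non-required repetitions; a narrower variant that admits only optional keys would be a different design choice from the one made in this commit.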