Parquet reader error: schemaElement.repetition_type == thrift::FieldRepetitionType::REPEATED #7777
Comments
qqibrow added the bug (Something isn't working) and triage (Newly created issue that needs attention) labels on Nov 28, 2023
Looking into this
facebook-github-bot pushed a commit that referenced this issue on Dec 22, 2023
Summary: This PR fixes issue #7777.

In Parquet, the map type is normally annotated with the MAP converted type. It should contain a repeated group annotated with MAP_KEY_VALUE, which in turn contains two children, key and value:

    <map-repetition> group <name> (MAP) {
      repeated group key_value (MAP_KEY_VALUE) {
        required <key-type> key;
        <value-repetition> <value-type> value;
      }
    }

But sometimes a group annotated with MAP_KEY_VALUE was incorrectly used in place of MAP:

    <map-repetition> group my_map (MAP_KEY_VALUE) {
      repeated group map {
        required binary key (UTF8);
        optional int32 value;
      }
    }

For backward compatibility, a MAP_KEY_VALUE group that is not contained by a MAP group should be treated as MAP.

This commit makes the following changes:
1. Adds a parentSchemaIdx parameter to the Parquet reader's getParquetColumnInfo() function to pass the parent schema.
2. Differentiates between the cases where a MAP_KEY_VALUE group's parent is or is not a MAP. If it is, the group is the repeated group that contains the key and value. If it is not, the group is treated the same as MAP.

For more information, see https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps

Pull Request resolved: #7966
Reviewed By: pedroerp
Differential Revision: D52263936
Pulled By: Yuhta
fbshipit-source-id: 486b6167c76e613c604b309c02b785834ab050ac
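The parent-aware rule described in the commit summary can be sketched roughly as follows. This is only an illustration, not the Velox implementation: SchemaElement, ConvertedType, GroupRole, classifyGroup() and the two schema builders are hypothetical, simplified stand-ins for the Thrift schema types, and the optional repetition chosen for my_map is an assumption. The only idea taken from the commit is that the parent's index is passed down so a MAP_KEY_VALUE group can be interpreted according to whether its parent is a MAP.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the Parquet Thrift schema types.
enum class ConvertedType { kNone, kMap, kMapKeyValue, kUtf8 };
enum class Repetition { kRequired, kOptional, kRepeated };

struct SchemaElement {
  std::string name;
  Repetition repetition;
  ConvertedType convertedType;
  int numChildren;
};

// How a group node should be interpreted by a reader.
enum class GroupRole { kMap, kKeyValueGroup, kOther };

// Classifies the element at idx. parentIdx < 0 means "no parent".
// A MAP_KEY_VALUE group is the inner key/value group only when its parent
// is annotated with MAP; otherwise it is treated as a MAP itself.
GroupRole classifyGroup(
    const std::vector<SchemaElement>& schema, int idx, int parentIdx) {
  const SchemaElement& element = schema[static_cast<std::size_t>(idx)];
  if (element.convertedType == ConvertedType::kMap) {
    return GroupRole::kMap;
  }
  if (element.convertedType == ConvertedType::kMapKeyValue) {
    const bool parentIsMap = parentIdx >= 0 &&
        schema[static_cast<std::size_t>(parentIdx)].convertedType ==
            ConvertedType::kMap;
    return parentIsMap ? GroupRole::kKeyValueGroup : GroupRole::kMap;
  }
  return GroupRole::kOther;
}

// Standard layout, flattened depth-first: MAP -> MAP_KEY_VALUE -> key, value.
std::vector<SchemaElement> standardMapSchema() {
  return {
      {"root", Repetition::kRequired, ConvertedType::kNone, 1},
      {"my_map", Repetition::kOptional, ConvertedType::kMap, 1},
      {"key_value", Repetition::kRepeated, ConvertedType::kMapKeyValue, 2},
      {"key", Repetition::kRequired, ConvertedType::kUtf8, 0},
      {"value", Repetition::kOptional, ConvertedType::kNone, 0},
  };
}

// Legacy layout: MAP_KEY_VALUE used in place of MAP on the outer group.
std::vector<SchemaElement> legacyMapSchema() {
  return {
      {"root", Repetition::kRequired, ConvertedType::kNone, 1},
      {"my_map", Repetition::kOptional, ConvertedType::kMapKeyValue, 1},
      {"map", Repetition::kRepeated, ConvertedType::kNone, 2},
      {"key", Repetition::kRequired, ConvertedType::kUtf8, 0},
      {"value", Repetition::kOptional, ConvertedType::kNone, 0},
  };
}
```

In this sketch, my_map is classified as a map in both layouts, and only the key_value group nested under a MAP parent is classified as the inner key/value group. Without the parent index, the legacy my_map would be mistaken for the inner repeated group, which is the kind of situation where a check that the group's repetition type is REPEATED could fail.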
Fixed by #7966
Bug description
System information
Velox System Info v0.0.2
Commit: 1e186e548833750cdee4b95d829711ddad78aba1
CMake Version: 3.16.3
System: Linux-5.4.0-1063-aws
Arch: x86_64
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 9.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 9.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt
Relevant logs