Better E2EFilterTest for parquet in velox #7478
Comments
We've implemented testing of Velox Parquet using the Presto unit tests. Here is the mini design doc: https://gist.github.com/qqibrow/689ed97b91cc0b58337be96a86291301
Hi @qqibrow, we also encountered incorrect data when reading Map types. Did you run into such bugs as well? Thanks!
@qqibrow This is a very informative post, thanks! What's the status of Parquet support in Velox from your perspective?
@qqibrow Let's use this issue for enhancing the Velox e2eFilterTest. Adding some content to the description: testing with and without Varint (DWRF only), and different compression schemes.
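As a rough illustration of the test dimensions mentioned above, here is a minimal Python sketch that enumerates a test matrix. It is not Velox code: the function name `build_test_matrix`, the format names, and the specific compression schemes are all assumptions chosen for illustration. The only rule taken from the discussion is that the Varint option applies to DWRF only.

```python
from itertools import product

# Hypothetical dimensions; these names are illustrative, not Velox identifiers.
FORMATS = ["dwrf", "parquet"]
COMPRESSIONS = ["none", "zlib", "zstd", "snappy"]  # assumed set of schemes
VARINT_OPTIONS = [True, False]


def build_test_matrix():
    """Yield one config per combination; Varint is varied for DWRF only."""
    for fmt, compression in product(FORMATS, COMPRESSIONS):
        # For non-DWRF formats the Varint setting does not apply, so use None.
        varint_choices = VARINT_OPTIONS if fmt == "dwrf" else [None]
        for use_varint in varint_choices:
            yield {
                "format": fmt,
                "compression": compression,
                "varint": use_varint,
            }


if __name__ == "__main__":
    for config in build_test_matrix():
        print(config)
```

Each generated config could then drive one end-to-end run (write a file with those settings, read it back with filters, compare against the expected rows), so that adding a new dimension only means extending one list.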
Description
Hi, we recently started evaluating Velox and noticed issues in the Parquet reader. We generated some of the Parquet test files used in the Presto unit tests and tried to read them with the native Parquet reader in Velox. There are lots of test failures:
(The name testArrayOfMaps means the test files are generated from testArrayOfMaps() in AbstractTestParquetReader.)
There are two errors: one is #7002 and the other one is:
For the time being, we have only tested the metadata part and have not touched data correctness yet.
Can we think about building a better unit test infrastructure for Parquet? Are there unit test best practices we can learn from the ORC support in Velox? For example, leveraging the existing Parquet tests in the Presto project is a good direction. To better test complex types, there is even a customized Hive Parquet writer (e.g., SingleLevelArraySchemaConverter) that helps surface issues (this is why #7002 was blocked at first: it was hard to reproduce).