Releases: jorgecarleitao/parquet2
v0.17.0
A new release is out there!
Thanks to everyone for the fixes and improvements that resulted in this more stable, easier-to-use, and faster version of parquet2!
Breaking changes:
- Improved hybrid RLE decoding performance (~-40%) #203 (ritchie46)
- Improved API to read column chunks #195 (jorgecarleitao)
- Removed `EncodedPage` #191 (jorgecarleitao)
New features:
- Added `serde` support for `RowGroupMetaData`. #202 (youngsofun)
Fixed bugs:
- Removed unnecessary conversion #197 (jorgecarleitao)
- Avoid OOM on page streams #194 (jorgecarleitao)
- Fixed error requiring stats #193 (jorgecarleitao)
Enhancements:
- elide bound check in RLE decoder #201 (ritchie46)
- Replaced panics by errors on invalid pages #188 (evanrichter)
v0.16.0
Yet another release of parquet2, mostly focused on avoiding panics and OOM. It has no impact on performance, but improves reliability.
v0.16.1 (2022-08-17)
Fixed bugs:
- Fixed error in `FilteredHybridBitmapIter`'s trait bounds #187 (jorgecarleitao)
v0.16.0 (2022-08-17)
Breaking changes:
- Improved `Error` #181 (jorgecarleitao)
- Made decoding fallible #178 (jorgecarleitao)
- Improved bitpacking #176 (jorgecarleitao)
New features:
- Added DELTA_BYTE_ARRAY encoder #183 (jorgecarleitao)
Fixed bugs:
- FixedLenByteArray max_precision integer overflow #184 (evanrichter)
Documentation updates:
- enable `doc_cfg` feature #186 (ritchie46)
- Improved decoding documentation #180 (jorgecarleitao)
v0.15.0
We have a new release of parquet2 available!
Breaking changes:
- Add `max_size` to `get_page_stream` and `get_page_iterator` #173
- Optional `async` #174 (jorgecarleitao)
- Privatized `CompressionLevel` #170 (jorgecarleitao)
- Delay deserialization of dictionary pages #160 (jorgecarleitao)
New features:
Fixed bugs:
- Fixed OOM on malicious/malformed thrift #172 (jorgecarleitao)
Enhancements:
- Made `compute_page_row_intervals` public #171 (jorgecarleitao)
- Simplified internal code #168 (jorgecarleitao)
- cargo fmt #166 (ritchie46)
Testing updates:
- Improved coverage report #175 (jorgecarleitao)
v0.14.2
A couple of bug fixes, by @jhorstmann and @v0y4g3r
Fixed bugs:
- Fixed FileStreamer's end method to flush Parquet magic #163 (v0y4g3r)
- Fix compilation of parquet-tools #161 (jhorstmann)
Enhancements:
- Added `Compressor::into_inner` #158 (jorgecarleitao)
v0.14.1
A small but important release to support legacy lz4, by @dantengsky 🚀
New features:
- Added support for legacy lz4 decompression #151 (dantengsky)
Enhancements:
- Improved performance of reading #157 (jorgecarleitao)
v0.14.0
A new release is here and on crates.io! 🎉🎉🎉
Breaking changes:
- `split_buffer` should return `Result` #156
Fixed bugs:
- Removed panics on read #150 (jorgecarleitao)
Enhancements:
- Reduced reallocations #153 (jorgecarleitao)
- Removed `AsyncSeek` requirement from page stream #149 (medwards)
v0.13.0
Another release of parquet2 is here!
We can now control the compression level of both GZIP and BROTLI compression thanks to @TurnOfACard 🙇
Thank you to everyone that contributed to this release!
Breaking changes:
- Removed unused cargo feature #145 (jorgecarleitao)
- Fix potential misuse of FileWriter API's (sync + async) #138 (TurnOfACard)
New features:
- Added new_with_page_meta to PageReader #136 (ygf11)
- Added compression options/levels for GZIP and BROTLI codecs. #132 (TurnOfACard)
Fixed bugs:
- Async FileStreamer does not write statistics #139
- Fixed error in compressing lz4raw with large offsets #140 (jorgecarleitao)
Enhancements:
- Improved read of metadata #143 (jorgecarleitao)
- Simplified async metadata read #137 (jorgecarleitao)
Testing updates:
- Lifted duplicated code to a function #141 (jorgecarleitao)
- Improved Integration test documentation and expanded tests #133 (TurnOfACard)
v0.12.1
Fixed bugs:
- Fixed error in compressing lz4raw with large offsets #140 (jorgecarleitao)
v0.12.0
Breaking changes:
- Add `CompressionOptions`, which allows for zstd compression levels. #128 (TurnOfACard)
Enhancements:
- Improved performance of RLE decoding (-18%) #130 (jorgecarleitao)
- Improved perf of bitpacking decoding (3.5x) #129 (jorgecarleitao)
v0.11.0
Here we are for a new release of parquet2. This release has 3 main features:
- added optional support for LZ4 compression and decompression in WASM builds (via LZ4-flex by @PSeitz)
- added support to read bloom filters
- added support to read and write page indexes
A summary of the Full Changelog is available below.
Thank you to everyone who contributed to this release! (credits to individual PRs below)
Breaking changes:
- Renamed `ParquetError` to `Error` #109
- Made `.end` not consume the parquet `FileWriter` #127 (jorgecarleitao)
- Removed `compression` from `WriteOptions` #125 (kornholi)
- Simplified API and converted some panics on read to errors #112 (jorgecarleitao)
- Improved typing to reduce clones and use of unwraps #106 (jorgecarleitao)
- Simplified `PageIterator` #103 (jorgecarleitao)
New features:
- Added support for page-level filter pushdown (indexes) #102
- Added support for bloom filters #98
- Added optional support for LZ4 via LZ4-flex crate (thus enabling wasm) #124 (jorgecarleitao)
- Added support for page-level filter pushdown (column and offset indexes) #107 (jorgecarleitao)
- Added support to read column and page indexes #100 (jorgecarleitao)
Fixed bugs:
- Fixed minimum version for LZ4 #122 (kornholi)
- Fixed Lz4Raw compression error (if input is tiny) #118 (dantengsky)
- Fixed LZ4 #95 (jorgecarleitao)
Enhancements:
- Made offsets be always written #123 (jorgecarleitao)
- Added specialized deserialization of one-level filtered pages #120 (jorgecarleitao)
- Added support to read and use bloom filters #99 (jorgecarleitao)
- Added `ordinal` and `total_compressed_size` to column meta #96 (jorgecarleitao)
- Added non-consuming function to get values of delta-decoder #94 (jorgecarleitao)
- Disabled bitpacking default-features and upgraded to edition 2021 #93 (light4)
Documentation updates:
- Fix deployment of guide #115 (jorgecarleitao)
Testing updates:
- Added tests for reducing statistics #116 (jorgecarleitao)
- Simplified tests #104 (jorgecarleitao)