Commit
Temporary fix for Parquet metadata with empty value strings being ignored when writing (#14026)

When writing Parquet files, Spark needs to write key-value string pairs into the files' metadata. Sometimes the value is an empty string. Such an empty string is skipped during writing, so other applications (such as Spark) that read the value interpret it as `null` instead of the empty string in the original input, as described in #14024. This is wrong and led to data corruption in my testing.

This PR intentionally replaces an empty value string with a single space character to work around the bug. It is a temporary fix until a better one is available.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Robert (Bobby) Evans (https://github.com/revans2)

URL: #14026
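The workaround described above can be illustrated with a minimal Python sketch. This is a toy model, not the code changed in this PR: `write_metadata`, `read_metadata`, and `patch_empty_values`, as well as the metadata keys, are hypothetical names chosen for illustration. The toy writer assumes the bug behaves as described, i.e. an empty value is not written and therefore reads back as `None`.

```python
from typing import Dict, Optional


def write_metadata(pairs: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Toy model of the buggy writer (hypothetical): an empty value
    string is not written, so the key ends up with no value."""
    return {key: (value if value != "" else None) for key, value in pairs.items()}


def read_metadata(stored: Dict[str, Optional[str]], key: str) -> Optional[str]:
    """Toy reader (hypothetical): a missing value is reported as None."""
    return stored.get(key)


def patch_empty_values(pairs: Dict[str, str]) -> Dict[str, str]:
    """The workaround: replace each empty value string with one space
    so the writer no longer skips it."""
    return {key: (" " if value == "" else value) for key, value in pairs.items()}


# Hypothetical metadata with one empty value.
original = {"writer.version": "1.0", "comment": ""}

# Without the workaround, the empty string is read back as None.
unpatched = write_metadata(original)

# With the workaround, the value survives as a single space.
patched = write_metadata(patch_empty_values(original))
```

Note that under this sketch the value reads back as a single space rather than an empty string, so the round trip is still not exact; it only avoids the `null`.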