fix: Parquet format output not working in CLI for show commands #25997
BREAKING CHANGE: The short option -o, previously used for order-by in the table-list command, has been replaced and is now used for the output option.
Hi @Karribalu - I have a couple of suggestions inline. I think the breaking change is okay given the prior discussion about it, and because it makes this CLI consistent with other CLIs that output parquet.
    if let Some(path) = output_file_path {
        let mut f = OpenOptions::new()
            .write(true)
            .create(true)
            .truncate(true)
            .open(path)
            .await?;
        f.write_all_buf(&mut bs).await?;
    } else {
        if output_format.is_parquet() {
            Err(Error::NoOutputFileForParquet)?
        }
        println!("{}", String::from_utf8(bs.as_ref().to_vec()).unwrap());
    }
Could you move this code into a helper function, since it is reused in several places?
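To illustrate the shape such a helper could take, here is a minimal sketch. It is synchronous and uses only the standard library so that it stands alone; the code in this PR is async and uses tokio's `OpenOptions`, so a real version would be an `async fn` with `.await` on the file calls. The names `write_output` and `OutputError` are hypothetical, and the `is_parquet` flag stands in for `output_format.is_parquet()`.

```rust
use std::fmt;
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical error type standing in for the CLI's own `Error` enum.
#[derive(Debug)]
enum OutputError {
    /// Parquet is a binary format, so it needs an output file.
    NoOutputFileForParquet,
    /// Opening or writing the output file failed.
    Io(io::Error),
    /// Text output was not valid UTF-8.
    InvalidUtf8(std::str::Utf8Error),
}

impl fmt::Display for OutputError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            OutputError::NoOutputFileForParquet => {
                write!(f, "an output file path is required for parquet output")
            }
            OutputError::Io(e) => write!(f, "failed to write output: {}", e),
            OutputError::InvalidUtf8(e) => write!(f, "output is not valid UTF-8: {}", e),
        }
    }
}

impl std::error::Error for OutputError {}

impl From<io::Error> for OutputError {
    fn from(e: io::Error) -> Self {
        OutputError::Io(e)
    }
}

/// Writes `bytes` to `path` when one is given, otherwise prints them to stdout.
/// Printing is refused for parquet, mirroring the check in the snippet above.
fn write_output(bytes: &[u8], path: Option<&Path>, is_parquet: bool) -> Result<(), OutputError> {
    match path {
        Some(path) => {
            // Create the file if missing and truncate any previous contents.
            let mut f = OpenOptions::new()
                .write(true)
                .create(true)
                .truncate(true)
                .open(path)?;
            f.write_all(bytes)?;
            Ok(())
        }
        None if is_parquet => Err(OutputError::NoOutputFileForParquet),
        None => {
            // Return an error instead of panicking on invalid UTF-8.
            let text = std::str::from_utf8(bytes).map_err(OutputError::InvalidUtf8)?;
            println!("{}", text);
            Ok(())
        }
    }
}
```

Each call site would then reduce to a single call such as `write_output(&bs, output_file_path.as_deref(), output_format.is_parquet())?`, assuming the path is held as an `Option<PathBuf>`. Returning an error for invalid UTF-8 rather than calling `unwrap()` is a design choice of this sketch, not something the PR currently does.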
It would be good to add a success test case that writes the parquet to a temp file, then reads it, and validates its contents.
Some helpful APIs that would enable that:
- We use the `tempfile` crate for temporary files in tests
- There are APIs for reading parquet files into Arrow `RecordBatch`s in the `parquet` crate
- There are helpers for visually asserting on the contents of those record batches in DataFusion, e.g., `assert_batches_sorted_eq`
Closes #25941