CSV writer improvements #5604
rcaudy commented on Jun 11, 2024
- Change CSV writing code to use chunk-based reading code and stop allocating boxed primitives.
- Correct some trivial warnings.
- Fix column header separator-escaping bug.
Unit tests have been updated to provide full coverage for the new/changed code.
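The description does not show the header escaping logic itself. As a rough illustration only, here is a minimal sketch of separator-aware escaping, assuming RFC 4180 style quoting. The class `CsvHeaderEscaping` and the methods `escapeField` and `headerLine` are hypothetical names for this example and are not taken from the PR.

```java
// Hypothetical sketch; not the code changed in this PR.
final class CsvHeaderEscaping {

    /**
     * Quote a field when it contains the separator, a double quote, or a line break.
     * Embedded double quotes are doubled, as in RFC 4180 style CSV.
     */
    static String escapeField(String field, char separator) {
        boolean needsQuoting = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Build a header line, escaping each column name against the chosen separator. */
    static String headerLine(String[] columnNames, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columnNames.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(escapeField(columnNames[i], separator));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The second name contains the separator, so only it gets quoted.
        System.out.println(headerLine(new String[] {"Foo", "Bar|Baz"}, '|'));
    }
}
```

The point of passing the separator in is that the decision to quote depends on the separator actually in use, not on a hard-coded comma.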
I see that using Table.columnIterator instances would be more expensive because they don't use a shared context and have individual keys, but that makes me wish we had an iterator-like construction that could be as efficient as this.
ColumnsIterator it = table.columnsIterator();
OfByte bytes = it.byteColumn("Foo");
OfDouble doubles = it.doubleColumn("Bar");
while (it.hasNext()) {
    byte myByte = bytes.getByte();
    double myDouble = doubles.getDouble();
    it.advance();
}
it.close();
Having something like this would make it easier IMO to splay out to these sorts of row-oriented formats without having to worry as much about low-level chunking details.
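To make the idea concrete, here is a minimal, self-contained sketch of how such a construction might hide chunking behind a row-at-a-time cursor. It is only an illustration under stated assumptions: columns are plain primitive arrays rather than Deephaven column sources, and `ChunkedColumnsCursor`, `ByteGetter`, `DoubleGetter`, and `toCsv` are hypothetical names, not existing APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; plain arrays stand in for real column sources.
final class ChunkedColumnsCursor implements AutoCloseable {
    interface ByteGetter { byte getByte(); }
    interface DoubleGetter { double getDouble(); }

    private final int size;
    private final int chunkCapacity;
    private final List<Runnable> refills = new ArrayList<>();
    private int position = 0;   // absolute row index of the current row
    private int chunkStart = 0; // absolute row index of the first row in the loaded chunk
    private int chunkEnd = 0;   // exclusive end of the loaded chunk (0 means nothing loaded yet)

    ChunkedColumnsCursor(int size, int chunkCapacity) {
        this.size = size;
        this.chunkCapacity = chunkCapacity;
    }

    /** Register a byte column; values are read through a reusable primitive chunk. */
    ByteGetter byteColumn(byte[] source) {
        byte[] chunk = new byte[chunkCapacity];
        refills.add(() -> System.arraycopy(source, chunkStart, chunk, 0, chunkEnd - chunkStart));
        return () -> chunk[position - chunkStart];
    }

    /** Register a double column; no boxed Double is ever allocated. */
    DoubleGetter doubleColumn(double[] source) {
        double[] chunk = new double[chunkCapacity];
        refills.add(() -> System.arraycopy(source, chunkStart, chunk, 0, chunkEnd - chunkStart));
        return () -> chunk[position - chunkStart];
    }

    /** True if the current row is valid; loads the next chunk for all columns when needed. */
    boolean hasNext() {
        if (position >= size) {
            return false;
        }
        if (position >= chunkEnd) {
            chunkStart = position;
            chunkEnd = Math.min(size, chunkStart + chunkCapacity);
            refills.forEach(Runnable::run); // all columns advance together
        }
        return true;
    }

    void advance() {
        position++;
    }

    @Override
    public void close() {
        refills.clear();
    }

    /** Usage example: splay two columns out to a row-oriented CSV string. */
    static String toCsv(byte[] foo, double[] bar, int chunkCapacity) {
        StringBuilder sb = new StringBuilder();
        try (ChunkedColumnsCursor it = new ChunkedColumnsCursor(foo.length, chunkCapacity)) {
            ByteGetter bytes = it.byteColumn(foo);
            DoubleGetter doubles = it.doubleColumn(bar);
            while (it.hasNext()) {
                sb.append(bytes.getByte()).append(',').append(doubles.getDouble()).append('\n');
                it.advance();
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toCsv(new byte[] {1, 2, 3}, new double[] {0.5, 1.5, 2.5}, 2));
    }
}
```

In this sketch every registered column is refilled in the same `hasNext()` call, which is the stand-in for the shared context: the caller sees one row at a time while reads still happen a chunk at a time.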
Happy to approve PR otherwise, but had a few Qs.
That's a clever idea. It might be a lot easier to use for many use cases. It obviously wouldn't be a Java |