Bug fixes and improved support for Parquet TIMESTAMP #4801

Merged
merged 5 commits into from
Nov 10, 2023

Contributor

@malhotrashivam malhotrashivam commented Nov 9, 2023

As part of #4421, we started throwing an exception when reading Parquet TIMESTAMP fields with isAdjustedToUTC set to false. After this change:

  • Such fields will be read as java.time.LocalDateTime
  • java.time.LocalDateTime columns will be written as Parquet TIMESTAMP fields with isAdjustedToUTC=false. Earlier they were written as binary data with a codec.
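For readers unfamiliar with the distinction, here is a minimal sketch of the two interpretations using only `java.time`. It assumes nanosecond precision and that an isAdjustedToUTC=false value counts units from 1970-01-01T00:00:00 on a zone-less local timeline, as the Parquet logical type spec describes. The class and method names are hypothetical and are not the code in this PR.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical illustration only; not the Deephaven reader or writer.
public class TimestampSemantics {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // isAdjustedToUTC=true: the raw value identifies a single instant on the global timeline.
    static Instant instantFromNanos(long epochNanos) {
        return Instant.ofEpochSecond(
                Math.floorDiv(epochNanos, NANOS_PER_SECOND),
                Math.floorMod(epochNanos, NANOS_PER_SECOND));
    }

    // isAdjustedToUTC=false: the raw value is a wall-clock reading with no zone attached.
    // ZoneOffset.UTC is used here purely as arithmetic to recover the date and time fields;
    // the result does not claim the value is in UTC.
    static LocalDateTime localFromNanos(long localNanos) {
        return LocalDateTime.ofEpochSecond(
                Math.floorDiv(localNanos, NANOS_PER_SECOND),
                (int) Math.floorMod(localNanos, NANOS_PER_SECOND),
                ZoneOffset.UTC);
    }
}
```

The same raw long therefore yields either an `Instant` or a `LocalDateTime` depending on the flag, which is why the two cases need different column types.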

This PR also fixes bugs introduced in #4775 and #4755 that can cause a ClassCastException in some cases when reading Parquet DATE and TIME columns.

Related to #976

@malhotrashivam malhotrashivam added the feature request, parquet, NoDocumentationNeeded, and NoReleaseNotesNeeded labels Nov 9, 2023
@malhotrashivam malhotrashivam added this to the November 2023 milestone Nov 9, 2023
@malhotrashivam malhotrashivam self-assigned this Nov 9, 2023
jmao-denver
jmao-denver previously approved these changes Nov 9, 2023
Contributor

@jmao-denver jmao-denver left a comment

The Python changes LGTM.

@malhotrashivam malhotrashivam changed the title from "Added support to read and write Parquet TIMESTAMP fields with isAdjustedToUTC=false" to "Bug fixes and adding support for Parquet TIMESTAMP with isAdjustedToUTC=false" Nov 10, 2023
@malhotrashivam malhotrashivam changed the title from "Bug fixes and adding support for Parquet TIMESTAMP with isAdjustedToUTC=false" to "Bug fixes and improved support for Parquet TIMESTAMP" Nov 10, 2023
@malhotrashivam malhotrashivam added the bug Something isn't working label Nov 10, 2023
@malhotrashivam malhotrashivam merged commit d162c89 into deephaven:main Nov 10, 2023
12 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Nov 10, 2023
@@ -981,6 +982,21 @@ public static long epochNanos(@Nullable final ZonedDateTime dateTime) {
return safeComputeNanos(dateTime.toEpochSecond(), dateTime.getNano());
}

/**
Member

In case my comments seem strong: this library is very exposed to users, so it needs to be carefully curated. Functions should only be added when there is a compelling reason.

  1. I am not a fan of having hard-coded methods for any specific timezone.
  2. The methods should accept a time zone as an input.
  3. If methods are added for LocalDateTime, LocalDateTime signatures should be added to all relevant methods.
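As a rough illustration of point 2, a conversion that takes the zone as a parameter might look like the sketch below. The class name and this `epochNanos` overload are hypothetical, not the signature in this PR or in the library; null inputs are not handled, and `Math.multiplyExact`/`Math.addExact` stand in for whatever overflow handling the library actually uses.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch; not the library's API.
public class LocalDateTimeEpoch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Interprets the wall-clock value in the caller-supplied zone instead of a hard-coded one,
    // then returns nanoseconds since the epoch. Throws ArithmeticException on long overflow.
    static long epochNanos(LocalDateTime dateTime, ZoneId zone) {
        ZonedDateTime zoned = dateTime.atZone(zone);
        return Math.addExact(
                Math.multiplyExact(zoned.toEpochSecond(), NANOS_PER_SECOND),
                (long) zoned.getNano());
    }
}
```

With this shape, a caller who wants UTC passes `ZoneOffset.UTC` explicitly, so no zone-specific method is needed.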

4 participants