change default dtype_backend for to_pandas #4815

Merged on Nov 14, 2023 (3 commits)

@jmao-denver commented Nov 12, 2023

Fixes #4810
Fixes #3149

This is a breaking change:

  1. The new default value for dtype_backend is numpy_nullable, which directs the conversion to use pandas nullable extension types automatically.
  2. When dtype_backend is numpy_nullable, the conversion first exports the table data to a pyarrow table and then converts that pyarrow table to a pandas DataFrame. This is a more efficient and deterministic way to convert DH data and DH nulls. However, the DH server still has gaps in its mapping between DH data types and Arrow types (e.g. a column of PyObject is converted to strings, and java.time.LocalDate/LocalTime to bytes). This isn't a problem yet because users don't normally use the unsupported data types, and there have been no complaints so far.
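
For readers unfamiliar with the nullable extension types mentioned in item 1, here is a minimal pandas-only sketch of why they matter for nulls. It does not use the Deephaven API or the pyarrow step from item 2; the column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an integer column and a string column, each with one missing value.
raw = {"qty": [1, None, 3], "sym": ["AAPL", None, "MSFT"]}

# Plain NumPy-backed construction: the integer column is widened to float64
# so that NaN can stand in for the missing value.
numpy_df = pd.DataFrame(raw)

# Nullable extension types: the column keeps an integer dtype and the
# missing value is represented by pd.NA.
nullable_df = pd.DataFrame(
    {
        "qty": pd.array(raw["qty"], dtype="Int64"),
        "sym": pd.array(raw["sym"], dtype="string"),
    }
)

print(numpy_df.dtypes)
print(nullable_df.dtypes)

# In a Deephaven session, the corresponding request would presumably be
# to_pandas(table, dtype_backend="numpy_nullable"); that call is not run here.
```

With the nullable types, an integer column containing nulls stays an integer column, which is the behavior the new default aims for.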

@jmao-denver requested a review from chipkent on November 14, 2023, 17:41
@jmao-denver merged commit 02c9deb into deephaven:main on Nov 14, 2023
10 checks passed
@github-actions bot locked and limited conversation to collaborators on Nov 14, 2023
@deephaven-internal

Labels indicate documentation is required. Issues for documentation have been opened:

How-to: https://github.com/deephaven/deephaven.io/issues/3429
Reference: https://github.com/deephaven/deephaven.io/issues/3428

@jmao-denver deleted the 4810-string-round-trip-pandas branch on December 18, 2023, 17:26