change default dtype_backend for to_pandas #4815

jmao-denver · 2023-11-12T17:27:58Z

Fixes #4810
Fixes #3149

This is a breaking change:

The new default value for dtype_backend is numpy_nullable, which directs the conversion to use pandas nullable extension types automatically.
When dtype_backend is numpy_nullable, the conversion first exports the table data into a pyarrow table as an intermediate step and then converts the pyarrow table to a pandas DataFrame. This is a more efficient and deterministic way to convert DH data and DH nulls. However, the DH server still has some gaps in its mapping between DH data types and Arrow types (e.g. a column of PyObject will be converted to strings, and java.time.LocalDate/LocalTime to bytes). This isn't yet a problem, because the unsupported data types aren't ones users would normally use, and there have been no complaints about them so far.

py/server/deephaven/pandas.py

py/server/tests/test_parquet.py

deephaven-internal · 2023-11-14T20:03:30Z

Labels indicate documentation is required. Issues for documentation have been opened:

How-to: https://github.com/deephaven/deephaven.io/issues/3429
Reference: https://github.com/deephaven/deephaven.io/issues/3428

change default dtype_backend for to_pandas

9df7c02

jmao-denver added the python-server-side, breaking, and ReleaseNotesNeeded labels Nov 12, 2023

jmao-denver added this to the November 2023 milestone Nov 12, 2023

jmao-denver self-assigned this Nov 12, 2023

jmao-denver requested review from chipkent and rcaudy as code owners November 12, 2023 17:27

jmao-denver added the NoDocumentationNeeded label Nov 13, 2023

chipkent added DocumentationNeeded and removed NoDocumentationNeeded labels Nov 14, 2023

chipkent reviewed Nov 14, 2023

py/server/deephaven/pandas.py (outdated, resolved)

py/server/tests/test_parquet.py (outdated, resolved)

py/server/tests/test_parquet.py (resolved)

Improve docstrings and add comments in test code

7d468f4

jmao-denver requested a review from chipkent November 14, 2023 17:41

Accept suggested changes to docstrings

040028c

chipkent approved these changes Nov 14, 2023

jmao-denver merged commit 02c9deb into deephaven:main Nov 14, 2023
10 checks passed

github-actions bot locked and limited conversation to collaborators Nov 14, 2023

jmao-denver deleted the 4810-string-round-trip-pandas branch December 18, 2023 17:26
