change default dtype_backend for to_pandas #4815
Merged
Fixes #4810
Fixes #3149
This is a breaking change:
- The default for `dtype_backend` is now `numpy_nullable`, which directs the conversion to automatically use pandas nullable extension types.
- When `dtype_backend` is `numpy_nullable`, the conversion exports the table data into a pyarrow table as an intermediate step and then converts the pyarrow table to a pandas DataFrame. This is a more efficient and deterministic way to convert DH data and DH nulls. However, the DH server still has some gaps in mapping between DH data types and Arrow types (e.g. a column of PyObject is converted to strings, and java.time.LocalDate/LocalTime to bytes). This isn't yet a problem because users don't normally work with the unsupported data types, and there have been no complaints so far.