Describe the enhancement requested
The type inference used for schema detection does not distinguish between signed and unsigned integer types.
This leads to the following behaviour; I think it would be nice if it were more consistent.
import pandas as pd
import pyarrow as pa

# 2**64 - 1 is the uint64 maximum, so it does not fit in int64.
data_1 = [{"a": pow(2, 64) - 1}]
schema_1 = pa.Schema.from_pandas(pd.DataFrame(data_1))
print(schema_1)  # takes a different codepath, correctly infers uint64

# The same value nested in a list raises the OverflowError shown in the backtrace.
data_2 = [{"a": [pow(2, 64) - 1]}]
schema_2 = pa.Schema.from_pandas(pd.DataFrame(data_2))  # crashes
Here's the backtrace that you get when trying to compute schema_2.
Traceback (most recent call last):
  File "/work/arrow/foo.py", line 5, in <module>
    schema = pa.Schema.from_pandas(pd.DataFrame(data))
  File "pyarrow/types.pxi", line 3104, in pyarrow.lib.Schema.from_pandas
  File "/work/arrow/pyarrow-dev/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 562, in dataframe_to_types
    type_ = pa.array(c, from_pandas=True).type
  File "pyarrow/array.pxi", line 360, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 87, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
OverflowError: Python int too large to convert to C long
Is that something that can be changed or would that likely have too many unintended consequences?
I've tested this with pyarrow version 19.0.0 on Ubuntu 24.04.
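To illustrate what I mean by distinguishing signed from unsigned, here is a rough sketch in plain Python. It is not the actual logic in inference.cc: `infer_int_type` and the type names it returns are hypothetical and only show the range check I have in mind.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def infer_int_type(values):
    """Hypothetical helper: name a 64-bit integer type that can hold all values.

    Assumes `values` is a non-empty sequence of Python ints.
    """
    lo, hi = min(values), max(values)
    # Prefer the signed type whenever every value fits in it.
    if INT64_MIN <= lo and hi <= INT64_MAX:
        return "int64"
    # Fall back to unsigned only if there are no negative values.
    if lo >= 0 and hi <= UINT64_MAX:
        return "uint64"
    raise OverflowError("values do not fit in a single 64-bit integer type")


print(infer_int_type([1, -5]))
print(infer_int_type([2**64 - 1]))
```

In the meantime, passing an explicit type such as `pa.list_(pa.uint64())` to `pa.array` should sidestep the inference step, though that only helps when the type is known up front.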
Component(s)
C++
The inference code in question is arrow/python/pyarrow/src/arrow/python/inference.cc, line 493 in 9801801.