Describe the bug
When converting data between Arrow and cudf, `from_arrow` converts dictionary indices to unsigned types. libcudf dictionary indices become unsigned integers even if the Arrow array uses signed integers.
The Arrow columnar format documentation recommends signed indices: "Since unsigned integers can be more difficult to work with in some cases (e.g. in the JVM), we recommend preferring signed integers over unsigned integers for representing dictionary indices. Additionally, we recommend avoiding using 64-bit unsigned integer indices unless they are required by an application."
Starting a discussion here to update this preference.
Additional context
Velox does not support unsigned indices in dictionary columns. This issue was found while round-tripping a cudf table.
Honestly, I believe we could change the libcudf dictionary type to support only signed indices without breaking anyone.
Or we could consider dropping dictionary support (throwing an exception) in the arrow interop since we cannot guarantee the type even works appropriately with all of libcudf at this point.
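To make the signed-only option concrete, here is a minimal sketch of what a checked conversion from unsigned to signed indices could look like on the consumer side. `to_signed_indices` is a hypothetical helper written for illustration; it is not part of libcudf, Arrow, or Velox, and it assumes 32-bit indices held in host memory.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical helper (not a libcudf or Arrow API): convert unsigned 32-bit
// dictionary indices to signed 32-bit indices of the same width.
// Returns std::nullopt if any index does not fit in int32_t, since a signed
// type of the same width can represent only half the unsigned range.
std::optional<std::vector<std::int32_t>> to_signed_indices(
    const std::vector<std::uint32_t>& indices) {
  constexpr auto max_signed =
      static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
  std::vector<std::int32_t> out;
  out.reserve(indices.size());
  for (std::uint32_t idx : indices) {
    if (idx > max_signed) {
      return std::nullopt;  // index too large for the signed type
    }
    out.push_back(static_cast<std::int32_t>(idx));
  }
  return out;
}
```

The range check is the only place the conversion can fail. Indices into a dictionary that has at most `INT32_MAX` entries always pass it, which is why restricting to signed indices seems unlikely to affect realistic dictionaries.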
Relevant code: cudf/cpp/src/interop/from_arrow_host.cu, lines 269 to 280 in 4cd40ee.
This causes issues when round-tripping to exact types. The Arrow docs also prefer signed types:
https://arrow.apache.org/docs/format/Columnar.html#dictionary-encoded-layout:~:text=Since%20unsigned%20integers,by%20an%20application.