Support decoding Union column with up to and including 256 variants #27

Open
Jefffrey opened this issue Apr 1, 2024 · 1 comment
Labels
enhancement New feature or request medium Medium priority

Comments

@Jefffrey
Collaborator

Jefffrey commented Apr 1, 2024

According to https://orc.apache.org/specification/ORCv1/

Currently ORC union types are limited to 256 variants, which matches the Hive type model.

However in Arrow, UnionArrays are limited to 127 variants: https://arrow.apache.org/docs/format/Columnar.html#union-layout

A union with more than 127 possible types can be modeled as a union of unions.

To support ORC unions with more than 127 variants, we would need to follow the Arrow suggestion quoted above and decode them into a union of unions.
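A rough sketch of how the tag mapping for a union of unions might look, assuming ORC union tags are flat 0-based bytes and taking the 127-variant cap per Arrow union from the figure quoted above. The names `MAX_ARROW_VARIANTS`, `inner_union_count`, `split_tag` and `join_tag` are hypothetical and not part of any existing crate; this only illustrates the index arithmetic, not the actual array construction.

```rust
/// Assumed cap on variants per Arrow union, taken from the figure quoted in this issue.
const MAX_ARROW_VARIANTS: usize = 127;

/// Number of inner unions needed to hold `variant_count` ORC variants
/// (ceiling division by the per-union cap).
fn inner_union_count(variant_count: usize) -> usize {
    (variant_count + MAX_ARROW_VARIANTS - 1) / MAX_ARROW_VARIANTS
}

/// Split a flat ORC union tag (0..=255) into a pair of
/// (outer type id, inner type id) for a two-level union of unions.
fn split_tag(tag: u8) -> (i8, i8) {
    let tag = tag as usize;
    let outer = tag / MAX_ARROW_VARIANTS; // which inner union holds the variant
    let inner = tag % MAX_ARROW_VARIANTS; // position within that inner union
    (outer as i8, inner as i8)
}

/// Inverse of `split_tag`. Returns `None` when the ids are negative,
/// the inner id exceeds the cap, or the flat tag would not fit in a byte.
fn join_tag(outer: i8, inner: i8) -> Option<u8> {
    if outer < 0 || inner < 0 || inner as usize >= MAX_ARROW_VARIANTS {
        return None;
    }
    let flat = outer as usize * MAX_ARROW_VARIANTS + inner as usize;
    u8::try_from(flat).ok()
}
```

Under these assumptions a 256-variant ORC union would need three inner unions, and tag 127 would be the first variant of the second inner union. Chunking by the cap keeps the mapping a plain division and remainder, so it is cheap to apply per row while decoding.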

See initial Union support here: datafusion-contrib/datafusion-orc@ee69b91

@Jefffrey Jefffrey added enhancement New feature or request medium Medium priority labels Apr 1, 2024
@waynexia waynexia transferred this issue from datafusion-contrib/datafusion-orc Oct 30, 2024
@suxiaogang223
Contributor

I'd like to pick this up :)
