feat: remove DataFusion pyarrow feat #1000

Open
wants to merge 2 commits into
base: main

timsaucer
Contributor

Which issue does this PR close?

This addresses part of apache/datafusion#14197

Rationale for this change

By removing DataFusion's pyarrow dependency, we can update pyo3 without requiring corresponding updates to the DataFusion core repository. This does add a few pieces, such as a wrapper around ScalarValue, but it simplifies the core DataFusion repo because pyo3 no longer needs to be in it.

What changes are included in this PR?

  • Removes use of the pyarrow feature of the DataFusion core repo.
  • Adds PyScalarValue, a simple wrapper around ScalarValue, so that we can implement traits on it that are currently implemented upstream in DataFusion.
  • Renames DataFusionError to PyDataFusionError so there is no confusion with the enum defined upstream.
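
For readers unfamiliar with the wrapper approach, here is a minimal sketch of the newtype pattern that PyScalarValue follows. It is not the code in this PR: the `ScalarValue` enum below is a two-variant stand-in for the upstream DataFusion type, and `ToPythonRepr` is a hypothetical trait standing in for the Python conversion traits that the wrapper would actually implement.

```rust
// Stand-in for DataFusion's `ScalarValue` (assumption: the real enum has
// many more variants and lives in the upstream crate).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Int64(Option<i64>),
    Utf8(Option<String>),
}

// Newtype wrapper owned by this crate. Because the wrapper is a local type,
// this crate may implement foreign traits on it, which Rust's orphan rule
// would forbid for the upstream `ScalarValue` itself.
#[derive(Debug, Clone, PartialEq)]
pub struct PyScalarValue(pub ScalarValue);

// Conversions in both directions keep call sites short (`.into()`).
impl From<ScalarValue> for PyScalarValue {
    fn from(value: ScalarValue) -> Self {
        PyScalarValue(value)
    }
}

impl From<PyScalarValue> for ScalarValue {
    fn from(value: PyScalarValue) -> Self {
        value.0
    }
}

// Hypothetical trait, used only for illustration. In the real bindings this
// role would be played by the Python conversion traits.
pub trait ToPythonRepr {
    fn to_python_repr(&self) -> String;
}

impl ToPythonRepr for PyScalarValue {
    fn to_python_repr(&self) -> String {
        match &self.0 {
            ScalarValue::Int64(Some(v)) => v.to_string(),
            // Debug formatting adds surrounding quotes to the string.
            ScalarValue::Utf8(Some(s)) => format!("{:?}", s),
            // Null scalars map to Python's `None` in this sketch.
            ScalarValue::Int64(None) | ScalarValue::Utf8(None) => "None".to_string(),
        }
    }
}
```

With this shape, code that receives an upstream `ScalarValue` can call `PyScalarValue::from(value)` (or `value.into()`) at the Python boundary and convert back the same way when handing the value to DataFusion.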

Are there any user-facing changes?

No user-facing changes.

@kylebarron
Contributor

> By removing the pyarrow dependency of DataFusion we can update pyo3 without requiring corresponding updates to the DataFusion core repository.

FWIW this is also one of the reasons why I created pyo3-arrow: https://docs.rs/pyo3-arrow/latest/pyo3_arrow/#why-not-use-arrow-rss-python-integration
