-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: reads using global ctx #982
base: main
Are you sure you want to change the base?
Conversation
a29f103
to
162c65a
Compare
a6fd8e4
to
f7af294
Compare
|
||
from datafusion.dataframe import DataFrame | ||
from datafusion.expr import Expr | ||
import pyarrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: it would be great to use ruff (https://stackoverflow.com/a/77876298) or isort to deterministically and programmatically sort python imports, and validate that in CI. I think isort/ruff would have a newline here between the third-party and first-party imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there a pre-commit config for ruff linter and formatter
datafusion-python/.pre-commit-config.yaml
Lines 23 to 30 in 79c22d6
- repo: https://github.com/astral-sh/ruff-pre-commit | |
# Ruff version. | |
rev: v0.3.0 | |
hooks: | |
# Run the linter. | |
- id: ruff | |
# Run the formatter. | |
- id: ruff-format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the SO answer above explains, import sorting isn't currently part of the default ruff-format
behavior. We'd need to opt-in by adding an I
element here:
datafusion-python/pyproject.toml
Line 66 in 79c22d6
select = ["E4", "E7", "E9", "F", "D", "W"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opposed to this addition, but there is a potential source of confusion that we can mitigate with documentation. If a new user creates a session context themself and registers functions, and then creates a dataframe using this method, the functions they registered will not be available. I think it could lead to a fair amount of confusion.
I think this is easily mitigated by adding documentation to these functions that describes that it uses a default global session context and if the user needs a custom context they need to use the functions .
Which issue does this PR close?