-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): sync DataFusion for the latest SQL syntax #1017
feat(core): sync DataFusion for the latest SQL syntax #1017
Conversation
WalkthroughThis pull request introduces several modifications across the Wren Core project, focusing on SQL expression handling, dependency updates, and refinements in function signatures. The changes primarily involve updating the Changes
Sequence DiagramsequenceDiagram
participant Parser as SQL Parser
participant Dialect as WrenDialect
participant MDL as WrenMDL
Parser->>Dialect: Parse SQL Expression
Dialect->>Dialect: Process Array Access
Dialect->>Dialect: Handle Identifiers
Dialect->>MDL: Transform Expression
MDL->>MDL: Infer Source Column
MDL-->>Parser: Return Transformed Expression with Alias
Possibly Related PRs
Suggested Reviewers
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command 📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (3)
📒 Files selected for processing (3)
🔇 Additional comments (4)ibis-server/tests/conftest.py (1)
This increment aligns with the additional function(s) introduced in the codebase, based on the PR description. Ensure that related tests referencing this constant remain consistent with this updated count. ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1)
This change suggests that the function list now yields one less function when using the remote function list path. Verify that this is indeed the intended logic and that all references to this offset are appropriately updated in downstream tests or related modules. wren-core-py/tests/test_modeling_core.py (2)
This increment reflects an additional function being recognized by the system. Confirm that the relevant code accurately registers and handles all newly added functions.
This additional function count should remain consistent across the entire test suite. Double-check any references to 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
wren-core/Cargo.toml (1)
18-18
: Fix typo in branch name: "foward" should be "forward"The branch name contains a typo:
wren/foward-to-compund-access
should bewren/forward-to-compound-access
.-datafusion = { git = "https://github.com/goldmedal/datafusion.git", branch = "wren/foward-to-compund-access" } +datafusion = { git = "https://github.com/goldmedal/datafusion.git", branch = "wren/forward-to-compound-access" }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
wren-core/Cargo.toml
(1 hunks)wren-core/core/src/logical_plan/analyze/plan.rs
(1 hunks)wren-core/core/src/mdl/dialect/utils.rs
(3 hunks)wren-core/core/src/mdl/dialect/wren_dialect.rs
(3 hunks)wren-core/core/src/mdl/function.rs
(3 hunks)wren-core/core/src/mdl/mod.rs
(3 hunks)
🔇 Additional comments (14)
wren-core/core/src/mdl/mod.rs (3)
20-20
: Include ExprWithAlias
for advanced SQL expression handling
This import is necessary for parsing expressions with potential aliases. The change looks correct and aligns with the updated usage in the codebase.
217-221
: Utilize column alias if available during source column inference
Extracting expr
and alias
from ExprWithAlias
and defaulting to the column's original name if the alias is missing is a sensible approach. This logic ensures that user-defined aliases are preserved whenever possible.
233-233
: Confirm updated return type for SQL parsing
Changing the function signature to return ExprWithAlias
is a breaking change for callers expecting a simple Expr
. Please verify that all callers and downstream usages now accommodate the alias structure.
wren-core/core/src/mdl/dialect/utils.rs (3)
23-23
: Import of Span
from tokenizer
This addition appears necessary to populate the function or identifier span info. No issues detected.
59-59
: Set function name span to empty
Using Span::empty()
on the function name is appropriate for consistent construction of the Function
AST node when no precise span data is available.
71-71
: Explicitly disable ODBC syntax usage
Setting uses_odbc_syntax
to false
clarifies the function definition in contexts that need to avoid ODBC function parsing differences. Looks good.
wren-core/core/src/mdl/dialect/wren_dialect.rs (3)
26-27
: Expanded imports for advanced SQL constructs
Importing AccessExpr
, Array
, Value
, and Span
broadens handling of complex expressions in the dialect. This is consistent with new expression parsing requirements.
122-124
: Switch to CompoundFieldAccess
for array elements
The move from Subscript
to a CompoundFieldAccess
containing a subscript index is well-structured. It improves clarity for expressions accessing array elements and ensures future extensibility for nested access patterns.
188-188
: Provide empty span for quoted identifiers
Using Span::empty()
for identifier creation is consistent with the approach used elsewhere, ensuring that quoting logic remains unaffected by uninitialized span data.
wren-core/core/src/mdl/function.rs (3)
74-74
: Correct usage of TypeSignature::Nullary
Replacing NullAry
with Nullary
aligns with the revised DataFusion type signature enumeration. This fix prevents potential runtime mismatches.
118-118
: Match updated DataFusion signature conventions
As above, adding TypeSignature::Nullary
ensures the function signature is correctly recognized for no-argument function variants.
162-162
: Unified UDF signature definitions
Consistently applying TypeSignature::Nullary
in all custom UDF structs promotes uniformity and avoids unexpected errors during query planning.
wren-core/core/src/logical_plan/analyze/plan.rs (1)
296-296
: Initialize wildcard options as a boxed struct
Converting WildcardOptions::default()
into a boxed value supports recent DataFusion changes that expect a pointer-like structure. This update is safe and compatible with future expansions to wildcard behaviors.
wren-core/Cargo.toml (1)
18-18
: Verify fork's sync status with upstream DataFusion
The dependency points to a fork of DataFusion. Let's verify its sync status with the upstream repository to ensure we're not diverging significantly.
✅ Verification successful
Fork is relatively in sync with upstream DataFusion
The fork appears to be well-maintained:
- Created recently (May 2024) and last updated recently (Dec 2024)
- Only 5 commits behind the upstream main branch
- Contains 5 custom commits ahead of upstream, which is reasonable for a feature branch
The divergence is minimal and the fork is actively maintained, suggesting no immediate concerns with using this dependency.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the fork's sync status with upstream DataFusion
# Get the latest commit from the specified branch in the fork
echo "Latest commit in the fork:"
gh api repos/goldmedal/datafusion/branches/wren/foward-to-compund-access --jq '.commit.sha'
# Get the fork's creation date and last update
echo "Fork details:"
gh api repos/goldmedal/datafusion --jq '[.created_at, .updated_at, .parent.full_name]'
# Check if the fork is ahead/behind the upstream main branch
gh api repos/goldmedal/datafusion/compare/apache:datafusion:main...goldmedal:datafusion:wren/foward-to-compund-access --jq '[.ahead_by, .behind_by]'
Length of output: 546
Description
struct_col.f1.f2[1].f3