Stop copying LogicalPlan and Exprs in EliminateNestedUnion #10319

Merged: 4 commits merged into apache:main on May 2, 2024

Conversation

emgeee
Contributor

@emgeee emgeee commented Apr 30, 2024

Which issue does this PR close?

Closes #10296.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Existing tests pass

Are there any user-facing changes?

@github-actions github-actions bot added the optimizer Optimizer rules label Apr 30, 2024
@emgeee
Contributor Author

emgeee commented Apr 30, 2024

In addition to updating the interface for this optimizer, I tried to remove as many copies as possible, but I'm still new to Rust and DataFusion, so I'm very much open to suggestions and feedback!

Contributor

@alamb alamb left a comment


Nice! Thank you @emgeee

I think this is a step forward for sure

One thing I noticed is that

fn extract_plans_from_union(plan: &Arc<LogicalPlan>) -> Vec<Arc<LogicalPlan>> {
    match plan.as_ref() {
        LogicalPlan::Union(Union { inputs, schema }) => inputs
            .iter()
            .map(|plan| Arc::new(coerce_plan_expr_for_schema(plan, schema).unwrap()))
            .collect::<Vec<_>>(),
        _ => vec![plan.clone()],
    }
}

calls coerce_plan_expr_for_schema, which looks like it may still clone the input and exprs.

Maybe we can try to get rid of those clones too (but as a follow-on PR).
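To illustrate the idea of avoiding those clones, here is a minimal sketch using only the standard library. It assumes a toy `Plan` enum as a stand-in for `LogicalPlan` (the type and the helper names `unwrap_or_clone` and `flatten_union` are hypothetical, not DataFusion's API), and it leaves out schema coercion entirely. The point is only that taking the `Arc` by value lets the union's children be moved out instead of copied:

```rust
use std::sync::Arc;

// Hypothetical stand-in for DataFusion's LogicalPlan; not the real type.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Union(Vec<Arc<Plan>>),
}

/// Take the plan out of the Arc without copying when this is the only
/// reference; fall back to a clone only when the Arc is shared.
fn unwrap_or_clone(plan: Arc<Plan>) -> Plan {
    Arc::try_unwrap(plan).unwrap_or_else(|shared| (*shared).clone())
}

/// Owned counterpart of the borrowed `extract_plans_from_union` quoted above:
/// it consumes its input and moves a union's children out of it.
fn extract_plans_from_union(plan: Arc<Plan>) -> Vec<Arc<Plan>> {
    // Non-union plans are passed through untouched (no clone, no new Arc).
    if !matches!(&*plan, Plan::Union(_)) {
        return vec![plan];
    }
    match unwrap_or_clone(plan) {
        Plan::Union(inputs) => inputs,
        // Not reached because of the check above; kept so the match is total.
        other => vec![Arc::new(other)],
    }
}

/// Flatten one level of nested unions by consuming the inputs.
fn flatten_union(inputs: Vec<Arc<Plan>>) -> Plan {
    Plan::Union(
        inputs
            .into_iter()
            .flat_map(extract_plans_from_union)
            .collect(),
    )
}
```

When the nested union is uniquely owned, its children end up in the flattened union as the same allocations (`Arc::ptr_eq` holds), so no plan node is deep-copied. If the `Arc` is shared, the sketch still clones, which is the same trade-off any owned-API rewrite would face.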

datafusion/optimizer/src/eliminate_nested_union.rs (outdated review thread)
@alamb alamb marked this pull request as draft April 30, 2024 21:11
@alamb
Contributor

alamb commented Apr 30, 2024

Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look.

@alamb
Contributor

alamb commented Apr 30, 2024

(also looks like there are some tests that need to be fixed)

@github-actions github-actions bot added the logical-expr Logical plan and expressions label May 1, 2024
@emgeee
Contributor Author

emgeee commented May 1, 2024

I went ahead and fixed the formatting so all tests pass, and managed to remove one clone() call from within the coerce_plan_expr_for_schema() call stack.

If we want to optimize coerce_plan_expr_for_schema further, I'd definitely agree a separate PR makes sense, as it is called from a number of other locations as well.
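As a rough illustration of how a clone() can drop out of a coercion helper, the sketch below takes the expressions by value, so an expression whose type already matches is moved through unchanged rather than cloned. The `Expr` and `DataType` types and the `coerce_exprs` function are simplified, hypothetical stand-ins, not DataFusion's real definitions or signatures:

```rust
// Hypothetical, simplified stand-ins; not DataFusion's real Expr or DataType.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, data_type: DataType },
    Cast { expr: Box<Expr>, to: DataType },
}

impl Expr {
    /// The type this expression evaluates to.
    fn data_type(&self) -> &DataType {
        match self {
            Expr::Column { data_type, .. } => data_type,
            Expr::Cast { to, .. } => to,
        }
    }
}

/// Coerce each expression to the corresponding target type.
/// Because `exprs` is owned, a matching expression is returned as-is and a
/// mismatching one is moved into the cast; neither path calls `clone()` on
/// the expression itself.
fn coerce_exprs(exprs: Vec<Expr>, target: &[DataType]) -> Vec<Expr> {
    exprs
        .into_iter()
        .zip(target)
        .map(|(expr, want)| {
            if expr.data_type() == want {
                expr
            } else {
                Expr::Cast {
                    expr: Box::new(expr),
                    to: want.clone(),
                }
            }
        })
        .collect()
}
```

A borrowed signature (`&[Expr]`) would force an `expr.clone()` in the matching branch, which is the kind of copy this PR is trying to remove. The cost of the owned version is that every caller must be able to give up its expressions, which is why changing a widely used helper is better done in its own PR.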

@emgeee emgeee marked this pull request as ready for review May 1, 2024 22:24
Contributor

@alamb alamb left a comment


Thank you for this contribution @emgeee -- a very nice first contribution :bowtie:.

> If we want to optimize coerce_plan_expr_for_schema further, I'd definitely agree a separate PR makes sense as it is called from a number of other locations as well

I agree -- it would be great to remove additional copies. If you have the time, PRs would be most appreciated 🙏

@@ -250,7 +250,7 @@ fn coerce_exprs_for_schema(
_ => expr.cast_to(new_type, src_schema),
}
} else {
Ok(expr.clone())
Contributor


🚀

@alamb alamb merged commit d4da80b into apache:main May 2, 2024
23 checks passed
@alamb
Contributor

alamb commented May 2, 2024

Thanks again @emgeee

Labels
logical-expr Logical plan and expressions optimizer Optimizer rules
Development

Successfully merging this pull request may close these issues.

Stop copying LogicalPlan and Exprs in EliminateNestedUnion
2 participants