Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move create_physical_expr to phy-expr-common #1 #10144

Closed
wants to merge 10 commits into from

Conversation

jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Apr 20, 2024

Which issue does this PR close?

Part 1 of #10074

I plan to move it in several PRs

Most of the code is for binary.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the physical-expr Physical Expressions label Apr 20, 2024
@jayzhan211 jayzhan211 changed the title Move create_physical_expr to phy-expr-common Move create_physical_expr to phy-expr-common #1 Apr 20, 2024
@jayzhan211 jayzhan211 marked this pull request as ready for review April 20, 2024 08:29
alamb
alamb previously approved these changes Apr 21, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jayzhan211 -- this looks like nice progress to me. I had a few questions / comments but in general very nice. It will be amazing to see the aggregate / window functions pulled into their own crates and the APIs clarified

pub mod column;
pub mod datum;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it makes sense to have the datum / scalars / cast stuff in physical expr common

@@ -0,0 +1,362 @@
// Licensed to the Apache Software Foundation (ASF) under one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise I think it makes sense to have the interval arithmetic in physical-expr-common too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that so, can Expr's still access interval arithmetic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I didn't miss any pub use to make it backward compatible, there is no change from others

fn reverse_tuple<T, U>((first, second): (T, U)) -> (U, T) {
(second, first)
}

#[cfg(test)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason this entire file wasn't migrated over?

Copy link
Contributor Author

@jayzhan211 jayzhan211 Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I need all of them, so I only move what I need for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExprIntervalGraphNode is not used for creating expr, but in execution plan

@@ -106,6 +106,14 @@ pub fn create_physical_expr(
input_dfschema: &DFSchema,
execution_props: &ExecutionProps,
) -> Result<Arc<dyn PhysicalExpr>> {
use datafusion_physical_expr_common::physical_expr::create_physical_expr as create_physical_expr_common;

// Temporary solution, after all the logic is moved to common, we can remove this function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a ticket reference in this comment too so anyone else who stumbles on this can review the backstory / reference?

if res.is_ok() {
return res;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the cases below (like Expr::Alias) that were moved, can we please remove them from this list? I think it would make it clearer that that code is now dead (and make it easier to track what remains to port

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not possible yet since create_physical_expr_common failed most of the time because many other expression is not yet supported, so we fall back to the original one.

Copy link
Contributor Author

@jayzhan211 jayzhan211 Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

easier to track what remains to port

I think we can check what is still commented out in common

@@ -23,8 +23,8 @@ use std::{any::Any, sync::Arc};
use crate::expressions::datum::{apply, apply_cmp};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hope / dream for the binary operators are that over time we would be able to extract them into functions (e.g so we would have a + b --> add(a, b)

However, I can't think if any good justification for this yet and I think we have bigger things to focus on than the semantics of operators

/// * `input_dfschema` - The DataFusion schema for the input, used to resolve `Column` references
/// to qualified or unqualified fields by name.
#[allow(clippy::only_used_in_recursion)]
pub fn create_physical_expr(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you move all the the code that is commented out in this function to physical-expr-common what will be left in physical-expr?

Copy link
Contributor Author

@jayzhan211 jayzhan211 Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you move all the the code that is commented out in this function to physical-expr-common what will be left in physical-expr?

code for physical-plan are not. For example, ExprIntervalGraph in cp_solver is not moved, it is used in physical-plan

Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 force-pushed the move-phy-expr-to-common branch from d79775e to 6908732 Compare April 23, 2024 00:11
@jayzhan211
Copy link
Contributor Author

jayzhan211 commented Apr 23, 2024

I created an ALL-in-one PR in #10188 to know what it is like after moving all the create-expr function

#10176 is the second one that includes this PR.
#10188 includes #10176

@alamb alamb dismissed their stale review April 25, 2024 14:34

I think I am missing something -- see #10188 (review) for details

@jayzhan211 jayzhan211 marked this pull request as draft April 29, 2024 08:16
@jayzhan211 jayzhan211 closed this May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants