Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move Log2, Log10, Ln to datafusion-functions #9869

Merged
merged 4 commits into from
Mar 30, 2024

Conversation

tinfoil-knight
Copy link
Contributor

Which issue does this PR close?

Closes #9868.

Rationale for this change

As part of #9285, log10, log2 & ln functions should be migrated to the datafusion-functions crate.

Sidenote:
Didn't migrate log because some parts (eg: simplifying expressions: see simpl_log, simpl_power) of log & power functions are dependent on each other & IMO these should be migrated together.

What changes are included in this PR?

Extend the make_math_unary_udf to accept a parameter for monotonicity & migrate log10, log2, ln from built-in functions to the datafusion-functions crate.

Are these changes tested?

Covered by existing tests.

Are there any user-facing changes?

No.

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Mar 29, 2024
@tinfoil-knight tinfoil-knight changed the title move log2 move Log2, Log10, Ln to datafusion-functions Mar 29, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @tinfoil-knight -- 🚀

Didn't migrate log because some parts (eg: simplifying expressions: see simpl_log, simpl_power) of log & power functions are dependent on each other & IMO these should be migrated together.

I agree -- thank you, this makes sense

Note I think we can implement simpl_log using ScalarUDFImpl::simplify (for example https://github.com/apache/arrow-datafusion/blob/9d47dcaab85294f517c899464c442f9a4518b6f2/datafusion/functions/src/datetime/current_date.rs#L76-L91)

It might be a fun little PR if you wanted to try

make_math_unary_udf!(AcosFunc, ACOS, acos, acos);
make_math_unary_udf!(AsinFunc, ASIN, asin, asin);
make_math_unary_udf!(TanFunc, TAN, tan, tan);
make_math_unary_udf!(Log2Func, LOG2, log2, log2, Some(vec![Some(true)]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could make a constant or something that would make this easier to read

For example

/// Single monotonic argument
const MONOTONIC_ONE: Option<Vec<bool>> = Some(vec![Some(true)])
const NON_MONOTONIC: Option<Vec<bool>> = None;

And then

Suggested change
make_math_unary_udf!(Log2Func, LOG2, log2, log2, Some(vec![Some(true)]));
make_math_unary_udf!(Log2Func, LOG2, log2, log2, MONOTONIC_ONE);

Perhaps as a follow on PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb
I tried this readability fix but Rust doesn't allow allocation (which vec![] needs) in const or static items. Global let is also not allowed.

The compiler suggested this with static:

consider wrapping this expression in `Lazy::new(|| ...)` from the `once_cell` crate: https://crates.io/crates/once_cell

We use once_cell in the physical_plan crate but introducing it to datafusion functions just for this doesn't seem right to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use once_cell in the physical_plan crate but introducing it to datafusion functions just for this doesn't seem right to me.

That makes sense to me. We would probably have to throw in a clone() or something

I thought about this and wrote up another API that would be much nicer #9879 (both for this callsite and montonicity in general)

@alamb alamb merged commit 078aeb6 into apache:main Mar 30, 2024
23 checks passed
@tinfoil-knight tinfoil-knight deleted the 9868-log-port branch March 30, 2024 15:21
@tinfoil-knight
Copy link
Contributor Author

tinfoil-knight commented Mar 30, 2024

Note I think we can implement simpl_log using ScalarUDFImpl::simplify

Yeah. I'll try this out.

Edit:
I've opened #9875 to track the migration for Power & Log.

Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Apr 1, 2024
* move log2

* move log10, ln

* refactor log_b functions to use macro

* update proto
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

move the Log2, Log10, Ln functions to datafusion-functions crate
2 participants