Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow partial date parsing when simple datetime formatter is used #11386

Closed

Conversation

NEUpanning
Copy link
Contributor

@NEUpanning NEUpanning commented Oct 30, 2024

The Spark legacy datetime formatter allows parsing date from incomplete text,
seeing code link. This PR enables partial date parsing when the LENIENT_SIMPLE
or STRICT_SIMPLE datetime formatter is used.

Relates issues: #10354, gluten#6227

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 30, 2024
Copy link

netlify bot commented Oct 30, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 945864e
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6735de06d569060008746b1c

@NEUpanning
Copy link
Contributor Author

@rui-mo Could you please take a look? Thanks!

Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

velox/functions/lib/DateTimeFormatter.cpp Outdated Show resolved Hide resolved
velox/functions/lib/tests/DateTimeFormatterTest.cpp Outdated Show resolved Hide resolved
@NEUpanning NEUpanning changed the title Allow partial input date parsing when legacy datetime formatter is used Allow partial input date parsing when simple datetime formatter is used Nov 4, 2024
@NEUpanning NEUpanning requested a review from rui-mo November 4, 2024 06:20
Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

velox/functions/lib/DateTimeFormatter.cpp Outdated Show resolved Hide resolved
@rui-mo rui-mo changed the title Allow partial input date parsing when simple datetime formatter is used Allow partial date parsing when simple datetime formatter is used Nov 4, 2024
@NEUpanning
Copy link
Contributor Author

@pedroerp Could you please take a look? Thanks!

@NEUpanning
Copy link
Contributor Author

@mbasmanova Could you please take a look? Thanks!

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NEUpanning Thank you for the change. I assume this is a change in user-visible behavior. Would you update the docs?

@NEUpanning NEUpanning force-pushed the allow_not_use_entire_string branch from e3f74e6 to 4b2e9aa Compare November 12, 2024 11:49
@NEUpanning
Copy link
Contributor Author

@mbasmanova Thank you for your quick feedback. This change impacts the behavior of unix_timestamp and get_timestamp which relied on DatetimeFormmater::parse. I've added docs and rebased it onto the main branch.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NEUpanning Thank you for iterating. Looks great.

@@ -337,3 +325,33 @@ These functions support TIMESTAMP and DATE input types.
part of the 53rd week of year 2004, so the result is 2004. Only supports DATE type.

SELECT year_of_week('2005-01-02'); -- 2004

Different Behaviors Between Simple And Joda Date Formmaters
--------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think --- needs to go all the way to cover the length of the title.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed

@@ -337,3 +325,33 @@ These functions support TIMESTAMP and DATE input types.
part of the 53rd week of year 2004, so the result is 2004. Only supports DATE type.

SELECT year_of_week('2005-01-02'); -- 2004

Different Behaviors Between Simple And Joda Date Formmaters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Simple vs. Joda Date Formatter

--------------------------------

To align with Spark, Velox supports both `Simple <https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html>`_
and `Joda <https://www.joda.org/joda-time/>`_ date formmater to parse/format timestamp/date strings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formmater -> formmaters


To align with Spark, Velox supports both `Simple <https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html>`_
and `Joda <https://www.joda.org/joda-time/>`_ date formmater to parse/format timestamp/date strings
on the functions `from_unixtime`, `unix_timestamp`, `make_date` and `to_unix_timestamp`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the functions -> used in functions

for functions, use references

:spark:func:`from_unixtime`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed

@mbasmanova mbasmanova added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Nov 14, 2024
@facebook-github-bot
Copy link
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@Yuhta merged this pull request in 5b274e3.

Copy link

Conbench analyzed the 1 benchmark run on commit 5b274e3c.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants