-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for "week of month" in date parsing pattern #10511
Add support for "week of month" in date parsing pattern #10511
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@PHILO-HE @mbasmanova Could you help to review this PR please? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you add some unit tests? Thanks.
@@ -56,6 +56,13 @@ DateTimeFormatterBuilder& DateTimeFormatterBuilder::appendWeekOfWeekYear( | |||
return *this; | |||
} | |||
|
|||
DateTimeFormatterBuilder& DateTimeFormatterBuilder::appendWeekOfMonth( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this function be called in buildJodaDateTimeFormatter or buildMysqlDateTimeFormatter? Like https://github.com/facebookincubator/velox/blob/main/velox/functions/lib/DateTimeFormatter.cpp#L1463.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NEUpanning As @rui-mo mentioned, it would be nice to add some tests. Some specific suggestions are below. Thanks.
@@ -107,6 +107,15 @@ Status daysSinceEpochFromWeekDate( | |||
int32_t dayOfWeek, | |||
int64_t& out); | |||
|
|||
/// Computes the (signed) number of days since unix epoch (1970-01-01). | |||
/// Returns UserError status if the date is invalid. | |||
Status daysSinceEpochFromWeekOfMonthDate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, extract this function into a separate PR and add a unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DateTimeFormatter.parse function depends on this function. Do you mean to submit all changes to the DateTimeFormatter.parse function and this function as a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this comment suggest us opening another PR which contains the implementation and tests for daysSinceEpochFromWeekOfMonthDate
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a PR for this. #10604
int32_t weekOfMonth, | ||
int32_t dayOfWeek, | ||
int64_t& out) { | ||
if (!isValidWeekOfMonthDate(year, month, weekOfMonth, dayOfWeek)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isValidWeekOfMonthDate doesn't perform full validation. For example, it allows weekOfMonth = 5 and dayOfWeek = 6, but these may not exist for all months. Where / how do we handle these cases of invalid inputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tested SimpleDateFormat and I see it can parse the "invalid" weekOfMonth and dayOfWeek etc. Perhaps we should remove the validation to align with SimpleDateFormat
. Here is an example:
SimpleDateFormat sdf = new SimpleDateFormat("y W");
sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
Date date = sdf.parse("2024 1");
System.out.println(date);
date = sdf.parse("2024 5");
System.out.println(date);
// invalid week of month
date = sdf.parse("2024 99");
System.out.println(date);
-------------console output--------------
Sun Dec 31 00:00:00 CST 2023
Sun Jan 28 00:00:00 CST 2024
Sun Nov 16 00:00:00 CST 2025
@mbasmanova @rui-mo I've added some unit tests in 7ec5bf4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@@ -735,7 +740,6 @@ int32_t parseFromPattern( | |||
return -1; | |||
} | |||
cur += size; | |||
date.weekDateFormat = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we need this change, and could you clarify what is covered by weekDateFormat
which seems to be a larger scope than weekOfMonthDateFormat
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weekDateFormat
has different format with weekOfMonthDateFormat
.weekDateFormat
is composed of year, weekOfYear and dayOfWeek and weekOfMonthDateFormat
is composed of year, month, weekOfMonth and dayOfWeek. dayOfWeek is used by both weekDateFormat
and weekOfMonthDateFormat
, so dayOfWeek can't indicate the format is weekDateFormat
int32_t year, | ||
int32_t month, | ||
int32_t weekOfMonth, | ||
int32_t dayOfWeek) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some unit tests for this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added tests in e6d51ae
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NEUpanning I notice some comments seems to be unresolved. Would you take a look first? Thanks.
#10511 (review) and #10511 (review)
@rui-mo The two comments you mentioned, one said "added test" and the other said "thanks". I have already added the test. Are there any other comments I missed? |
@NEUpanning Could you check the specific comments below them, e.g #10511 (comment)? If they have been addressed, it would be nice to reply at the comment section. |
@rui-mo Sorry. I haven't submitted comments in a while. I think you can see it now. |
…ek-of-month date (#10604) Summary: This helper function is only used by Spark. To align with Spark's SimpleDateFormat behavior, this function offers two modes: lenient and non-lenient. For non-lenient mode, it returns an error status if the date is invalid. For lenient mode, it accepts a wider range of arguments. Part of #10511 Pull Request resolved: #10604 Reviewed By: xiaoxmeng Differential Revision: D62033315 Pulled By: bikramSingh91 fbshipit-source-id: f17e50c07272dc656038b221129aff6882bcf02c
Opened a new PR #11103 for this feature. |
In date parsing pattern, "week of month" is used by
java.text.SimpleDateFormat
. So we need to support it for supporting SimpleDateFormat date parsing mode in DateTimeFormatter. It's presented by number and the range is [1,5].relates issue:#6374