-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support WEEK_YEAR for date time formatter #10930
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for meta-velox canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
static_cast<int>(calDate.year()), | ||
static_cast<uint32_t>(calDate.month()), | ||
static_cast<uint32_t>(calDate.day()), | ||
2, // (ISO 8601) Monday = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering how we plan to support the legacy Spark behavior where this value should be 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it can be obtained from java.util.Calendar
in Java and then passed in through QueryConf
.
@rui-mo I have updated the code. Could you please review it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
velox/functions/lib/TimeUtils.h
Outdated
/// (1582-10-15) would yield mismatched results. | ||
/// | ||
/// The algorithm refers to the weekyear algorithm in jdk: | ||
/// https://github.com/openjdk/jdk8/blob/6a383433a9f4661a96a90b2a4c7b5b9a85720031/jdk/src/share/classes/java/util/GregorianCalendar.java#L2077 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link points to an implementation in JDK 8. Is there any difference in the implementations among JDK versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no difference. I will updated the URL to the most recent openjdk. https://github.com/openjdk/jdk/blob/d9c67443f7d7f03efb2837b63ee2acc6113f737f/src/java.base/share/classes/java/util/GregorianCalendar.java#L2058
032c621
to
31bb656
Compare
@rui-mo I have updated the code. Could you please review it? Presto test case is generated by the following query on presto 0.289: with
dates as (select
date_parse(cast(year as varchar) || '-' || cast(month as varchar) || '-' || cast(day as varchar), '%Y-%m-%d') as d
from
(select * from unnest(sequence(2017, 2022)) as years(year)),
(select * from (values
(1, 1),
(1, 2),
(1, 3),
(1, 4),
(1, 5),
(1, 6),
(1, 7),
(12, 25),
(12, 26),
(12, 27),
(12, 28),
(12, 29),
(12, 30),
(12, 31)
) as monthdays(month, day))
)
select date_format(d, '%Y-%m-%d'), date_format(d, '%x') from dates
where day(d) in (1, 31) or year(d) != year_of_week(d); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
/// `minimalDaysInFirstWeek` is 1. | ||
/// | ||
/// The algorithm refers to the getWeekYear algorithm in openjdk: | ||
/// https://github.com/openjdk/jdk/blob/d9c67443f7d7f03efb2837b63ee2acc6113f737f/src/java.base/share/classes/java/util/GregorianCalendar.java#L2058 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ccat3z @rui-mo is this different from the vendored date library we have at:
https://github.com/facebookincubator/velox/blob/main/velox/external/date/iso_week.h
we use it's iso_week functionality in a few places already. If it's the same logic, it would be nice to consolidate the usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iso_week only support ISO 8601 standard week year (iso_week.h#L1519). The purpose of this implementation is to support non-ISO standard week year, which is required by Spark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@assignUser @ccat3z @rui-mo I asked for guidance from license experts on this. Give me a day or two until they get back to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pedroerp Any updates?
5d7e994
to
174b4bc
Compare
@rui-mo I have updated the support for simple date time formatter. Could you please review it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating. Just some nits.
private: | ||
std::unique_ptr<char[]> literalBuf_; | ||
size_t bufSize_; | ||
std::vector<DateTimeToken> tokens_; | ||
DateTimeFormatterType type_; | ||
|
||
/// The first day-of-week varies by culture. | ||
/// firstDayOfWeek is is a 1-based weekday number starting with Sunday. It |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "is is" repeated
@rui-mo I have updated the code. Could you please review it again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks good to me overall % we might need community's feedback on #10930 (comment).
Support WEEK_YEAR for date time formatter.
getWeekYear()
is cloned from jdk 8 supporting both ISO8601 and Java SimpleDateFormat standards.Test case is generated by the scala script below.
Presto test case is generated by the following query on presto 0.289:
SparkSQL test case is generated by the following query on spark 3.5.2: