-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release timezone and calendar features in DatapointsAPI (beta) #1779
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1779 +/- ##
==========================================
- Coverage 92.95% 92.94% -0.02%
==========================================
Files 121 121
Lines 17547 17673 +126
==========================================
+ Hits 16311 16426 +115
- Misses 1236 1247 +11
|
8908885
to
a7286f4
Compare
dc1c766
to
3afe935
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main comment is that I vote for allowing string as input for input to API method.
Otherwise, there is some refactoring which seems unrelated to the new functionality (although I guess implicitly related) which makes it hard to get an overview. But the testing is good, so I trust the changes.
cognite/client/_api/datapoints.py
Outdated
>>> df = client.time_series.data.retrieve_dataframe_in_tz( | ||
... external_id=["foo", "bar"], | ||
... aggregates=["sum", "continuous_variance"], | ||
... granularity="1quarter", | ||
... start=datetime(2020, 1, 1, tzinfo=ZoneInfo("America/New_York")), | ||
... end=datetime(2022, 12, 31, tzinfo=ZoneInfo("America/New_York"))) | ||
""" | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see that this is no longer necessary :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to warn when this is still in beta? I'm not sure 🤔
@@ -295,7 +316,8 @@ def to_payload_item(self) -> _DatapointsPayloadItem: | |||
payload["ignoreBadDataPoints"] = self.ignore_bad_datapoints | |||
if self.treat_uncertain_as_bad is False: | |||
payload["treatUncertainAsBad"] = self.treat_uncertain_as_bad | |||
|
|||
if self.timezone: | |||
payload["timeZone"] = self.timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? My IDE complains that self.timezone is an object, while payload["timeZone"] expects str or None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could have made an api_friendly_timezone
parameter instead of overwriting the existing one, but then the fetching logic must use that one instead of the logical one (timezone
), so I vouch to keep it. Pleeeenty of tests that cover all possible inputs.
a337bd9
to
a5498ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of code here, but looks great to me. That said, I'm not into python, and I haven't run any of these, so my approval is partially a rubber stamp. Some nitty comments though, may act on them, or not :)
cognite/client/_api/datapoints.py
Outdated
aggregates (Aggregate | str | list[Aggregate | str] | None): Single aggregate or list of aggregates to retrieve. Default: None (raw datapoints returned) | ||
granularity (str | None): The granularity to fetch aggregates at. e.g. '15s', '2h', '10d'. Default: None. | ||
granularity (str | None): The granularity to fetch aggregates at. e.g. '15s', '2h', '10d'. The unit can be spelled out for clarity, e.g., second(s), minute(s), hour(s), day(s), week(s), month(s), quarter(s), or year(s). Default: None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you write the full name, minute(s) and month(s), and you imply that there is an abbreviation. The docs doesn't say what these are though (in particular m/mo as abbreviations for minute/month). Maybe it's easier if you add examples between s and h, and after d?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! I'll add a separate section on available granularities, then keep the docstring "short and simple"!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to:
granularity (str | None): The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s)
, m/minute(s)
, h/hour(s)
, d/day(s)
, w/week(s)
, mo/month(s)
,
q/quarter(s)
, or y/year(s)
. Examples: 30s
, 5m
, 1day
, 2weeks
. Default: None.
cognite/client/_api/datapoints.py
Outdated
@@ -1015,34 +1045,35 @@ def retrieve_dataframe( | |||
|
|||
if not uniform_index: | |||
return fetcher.fetch_all_datapoints_numpy().to_pandas( | |||
column_names, include_aggregate_name, include_granularity_name | |||
column_names, include_aggregate_name, include_granularity_name, include_status=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to be on the safe side, is it intentional to set include_status=True? Not include_status=include_status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just the conversion step to pandas which only converts when status codes are available. That said, I agree we can just pass include_status=include_status
instead
"sec": "s", | ||
"second": "s", | ||
"seconds": "s", | ||
"t": "m", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what is the "t" granularity stand for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In "pandas world", T has always been minutes (as m was month)
return timezone.utc | ||
elif re.match(r"^(-|\+)\d\d?$", tz) and abs(hours_offset := int(tz)) <= 18: | ||
return timezone(timedelta(hours=hours_offset)) | ||
return cast(timezone, datetime.strptime(tz, "%z").tzinfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't run this code in my head, so just want to check, we do support UTC+05:30 and similar? (Is that what datetime.strptime does?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is only called when we couldn't parse time zones like "Europe/Oslo", so that we don't have to handle those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, "%z"
parses any UTC offset, so I'm relying as much as possible on built-in datetime parsing
("", timedelta(0)), | ||
("01:15", timedelta(seconds=4500)), | ||
("01:15:12", timedelta(seconds=4512)), | ||
("23:59", timedelta(seconds=86340)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have disallowed (400 error) fixed offsets that are not multiple of 15 minutes. Not sure if this matters to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We parse it and just pass it on to the API (and let it decide/raise) - this way the SDK is future-proof in case you ever decide to be more lenient with offsets 😄
…t just converted ms)
a5498ff
to
7f37fd3
Compare
Builds upon #1774
[7.45.0] - 2024-05-28
Added
timezone
and new calendar-based granularities likemonth
,quarter
andyear
.These API features are in beta, and the SDK implementation in alpha, meaning breaking changes can
happen without warning. Set beta header to avoid warning. Users of
retrieve_dataframe_in_tz
shouldconsider upgrading as soon as possible.