Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release timezone and calendar features in DatapointsAPI (beta) #1779

Merged
merged 47 commits into from
May 31, 2024

Conversation

haakonvt
Copy link
Contributor

@haakonvt haakonvt commented May 23, 2024

Builds upon #1774

[7.45.0] - 2024-05-28

Added

  • DatapointsAPI now support timezone and new calendar-based granularities like month, quarter and year.
    These API features are in beta, and the SDK implementation in alpha, meaning breaking changes can
    happen without warning. Set beta header to avoid warning. Users of retrieve_dataframe_in_tz should
    consider upgrading as soon as possible.

Copy link

codecov bot commented May 23, 2024

Codecov Report

Attention: Patch coverage is 91.20879% with 24 lines in your changes are missing coverage. Please review.

Project coverage is 92.94%. Comparing base (00990e7) to head (756a4f6).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1779      +/-   ##
==========================================
- Coverage   92.95%   92.94%   -0.02%     
==========================================
  Files         121      121              
  Lines       17547    17673     +126     
==========================================
+ Hits        16311    16426     +115     
- Misses       1236     1247      +11     
Files Coverage Δ
cognite/client/_api/datapoints.py 97.74% <100.00%> (-0.14%) ⬇️
cognite/client/_version.py 100.00% <100.00%> (ø)
cognite/client/utils/__init__.py 100.00% <ø> (+11.11%) ⬆️
cognite/client/utils/_importing.py 91.66% <ø> (ø)
cognite/client/_api/datapoint_tasks.py 95.21% <96.55%> (-0.36%) ⬇️
cognite/client/utils/_pandas_helpers.py 85.48% <85.00%> (-0.88%) ⬇️
cognite/client/utils/_time.py 94.86% <87.50%> (+0.05%) ⬆️
cognite/client/data_classes/datapoints.py 91.72% <88.00%> (+0.30%) ⬆️

... and 3 files with indirect coverage changes

@haakonvt haakonvt force-pushed the swap-dps-tz-to-api branch 2 times, most recently from 8908885 to a7286f4 Compare May 24, 2024 23:47
@haakonvt haakonvt force-pushed the swap-dps-tz-to-api branch 2 times, most recently from dc1c766 to 3afe935 Compare May 28, 2024 10:42
@haakonvt haakonvt changed the title [DONT MERGE] Release timezone and calendar features in DatapointsAPI Release timezone and calendar features in DatapointsAPI (beta) May 28, 2024
@haakonvt haakonvt marked this pull request as ready for review May 28, 2024 10:43
@haakonvt haakonvt requested review from a team as code owners May 28, 2024 10:43
@haakonvt haakonvt requested review from mattmurph9 and removed request for a team May 28, 2024 10:43
Copy link
Contributor

@doctrino doctrino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main comment is that I vote for allowing string as input for input to API method.

Otherwise, there is some refactoring which seems unrelated to the new functionality (although I guess implicitly related) which makes it hard to get an overview. But the testing is good, so I trust the changes.

cognite/client/_api/datapoints.py Outdated Show resolved Hide resolved
>>> df = client.time_series.data.retrieve_dataframe_in_tz(
... external_id=["foo", "bar"],
... aggregates=["sum", "continuous_variance"],
... granularity="1quarter",
... start=datetime(2020, 1, 1, tzinfo=ZoneInfo("America/New_York")),
... end=datetime(2022, 12, 31, tzinfo=ZoneInfo("America/New_York")))
"""
warnings.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see that this is no longer necessary :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to warn when this is still in beta? I'm not sure 🤔

@@ -295,7 +316,8 @@ def to_payload_item(self) -> _DatapointsPayloadItem:
payload["ignoreBadDataPoints"] = self.ignore_bad_datapoints
if self.treat_uncertain_as_bad is False:
payload["treatUncertainAsBad"] = self.treat_uncertain_as_bad

if self.timezone:
payload["timeZone"] = self.timezone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? My IDE complains that self.timezone is an object, while payload["timeZone"] expects str or None

Copy link
Contributor Author

@haakonvt haakonvt May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have made an api_friendly_timezone parameter instead of overwriting the existing one, but then the fetching logic must use that one instead of the logical one (timezone), so I vouch to keep it. Pleeeenty of tests that cover all possible inputs.

cognite/client/data_classes/datapoints.py Outdated Show resolved Hide resolved
@doctrino doctrino self-requested a review May 29, 2024 09:59
@haakonvt haakonvt force-pushed the swap-dps-tz-to-api branch from a337bd9 to a5498ff Compare May 30, 2024 07:25
Copy link
Contributor

@matiascognite matiascognite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of code here, but looks great to me. That said, I'm not into python, and I haven't run any of these, so my approval is partially a rubber stamp. Some nitty comments though, may act on them, or not :)

aggregates (Aggregate | str | list[Aggregate | str] | None): Single aggregate or list of aggregates to retrieve. Default: None (raw datapoints returned)
granularity (str | None): The granularity to fetch aggregates at. e.g. '15s', '2h', '10d'. Default: None.
granularity (str | None): The granularity to fetch aggregates at. e.g. '15s', '2h', '10d'. The unit can be spelled out for clarity, e.g., second(s), minute(s), hour(s), day(s), week(s), month(s), quarter(s), or year(s). Default: None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you write the full name, minute(s) and month(s), and you imply that there is an abbreviation. The docs doesn't say what these are though (in particular m/mo as abbreviations for minute/month). Maybe it's easier if you add examples between s and h, and after d?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! I'll add a separate section on available granularities, then keep the docstring "short and simple"!

Copy link
Contributor Author

@haakonvt haakonvt May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to:

granularity (str | None): The granularity to fetch aggregates at. Can be given as an abbreviation or spelled out for clarity: s/second(s), m/minute(s), h/hour(s), d/day(s), w/week(s), mo/month(s),
q/quarter(s), or y/year(s). Examples: 30s, 5m, 1day, 2weeks. Default: None.

@@ -1015,34 +1045,35 @@ def retrieve_dataframe(

if not uniform_index:
return fetcher.fetch_all_datapoints_numpy().to_pandas(
column_names, include_aggregate_name, include_granularity_name
column_names, include_aggregate_name, include_granularity_name, include_status=True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to be on the safe side, is it intentional to set include_status=True? Not include_status=include_status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just the conversion step to pandas which only converts when status codes are available. That said, I agree we can just pass include_status=include_status instead

"sec": "s",
"second": "s",
"seconds": "s",
"t": "m",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what is the "t" granularity stand for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In "pandas world", T has always been minutes (as m was month)

return timezone.utc
elif re.match(r"^(-|\+)\d\d?$", tz) and abs(hours_offset := int(tz)) <= 18:
return timezone(timedelta(hours=hours_offset))
return cast(timezone, datetime.strptime(tz, "%z").tzinfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't run this code in my head, so just want to check, we do support UTC+05:30 and similar? (Is that what datetime.strptime does?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is only called when we couldn't parse time zones like "Europe/Oslo", so that we don't have to handle those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, "%z" parses any UTC offset, so I'm relying as much as possible on built-in datetime parsing

("", timedelta(0)),
("01:15", timedelta(seconds=4500)),
("01:15:12", timedelta(seconds=4512)),
("23:59", timedelta(seconds=86340)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have disallowed (400 error) fixed offsets that are not multiple of 15 minutes. Not sure if this matters to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We parse it and just pass it on to the API (and let it decide/raise) - this way the SDK is future-proof in case you ever decide to be more lenient with offsets 😄

@haakonvt haakonvt force-pushed the swap-dps-tz-to-api branch from a5498ff to 7f37fd3 Compare May 31, 2024 07:29
@haakonvt haakonvt enabled auto-merge (squash) May 31, 2024 13:14
@haakonvt haakonvt merged commit bb76e95 into master May 31, 2024
12 checks passed
@haakonvt haakonvt deleted the swap-dps-tz-to-api branch May 31, 2024 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants