Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove pydantic constraint <2.10.0 and update timedelta validator, serializer type hints #757

Merged
merged 11 commits into from
Dec 5, 2024

Conversation

Pijukatel
Copy link
Contributor

Seems like new 2.10.0 version made some changes that made our type hints cause errors. On the other hand, it seems our type hints were too generic. Making them more specific fixes the issue in new pydantic versions.

Seems like new 2.10.0 version made some changes that made our type hints cause errors.
On the other hand, it seems our type hints were too generic. Making them more specific fixes the issue.
@Pijukatel Pijukatel added the debt Code quality improvement or decrease of technical debt. label Nov 28, 2024
@github-actions github-actions bot added this to the 103rd sprint - Tooling team milestone Nov 28, 2024
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Nov 28, 2024
def _timedelta_from_ms(value: float | timedelta | Any | None, handler: Callable[[Any], Any]) -> Any:
def _timedelta_from_ms(
value: float | timedelta | Any | None, handler: Callable[[Any], timedelta | None]
) -> timedelta | None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as _timedelta_to_ms comment

@@ -9,7 +9,7 @@
"""Utility types for Pydantic models."""


def _timedelta_to_ms(td: timedelta | None) -> Any:
def _timedelta_to_ms(td: timedelta | None) -> float | None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could be more strict and require float, but I have a feeling, that None could be used for something connected with defaults somewhere in code???

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can consider using only floats.

However, in StatisticsState and StatisticsPersistedState, we have fields like this:

request_min_duration: Annotated[timedelta_ms | None, Field(alias='requestMinDurationMillis')] = None

Make sure it won't break them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding some tests (but not sure whether I should keep them) and I think it works as expected. I am not really so sure why the previous implementation allowed any in so many places in this file, so I am little afraid there is some non obvious edge case I could not think of...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks. Let's wait for what @janbuchar thinks about that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good reason not to support | None in the serializer? Does it break pydantic >2.10? If not, I'd keep it, just in case - it looks like pydantic 2.10 doesn't call the serializer for None values, but that might not be the case for older versions which we also claim to support.

Copy link
Contributor Author

@Pijukatel Pijukatel Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that can also break pydantic. See new commit that allows | None, but has to use Union due to known issues that Pydantic can have with from __future__ import annotations and | symbol not being available for older Python versions in type hints...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that the pipe syntax will work with Pydantic 3.9, for example?

@Pijukatel Pijukatel marked this pull request as ready for review November 28, 2024 09:26
@Pijukatel Pijukatel requested a review from janbuchar November 28, 2024 09:26
Add tests for Models using this type.
@github-actions github-actions bot added the tested Temporary label used only programatically for some analytics. label Nov 28, 2024
@Pijukatel Pijukatel requested a review from vdusek November 28, 2024 12:26
Copy link
Collaborator

@vdusek vdusek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's wait for Honza's approval as well

Copy link
Collaborator

@janbuchar janbuchar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor things, otherwise it's great!

Copy link
Collaborator

@janbuchar janbuchar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the last two nits 🙂

(3.5, timedelta(microseconds=3500), 4),
(3.99, timedelta(microseconds=3990), 4),
(None, None, None),
(float('inf'), timedelta(days=999999999, seconds=3600 * 24 - 1, microseconds=999999), float('inf')),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered 3600 * 24 - 1 to be more readable than 86399. Since it is just a test, I avoided defining SECONDS_IN_HOUR and HOURS_IN_DAYS ....

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for avoiding that 🙂 I usually do timedelta(days=3, hours=8).total_seconds() instead.

Edit pyproject toml to ensure that the broken versions are forbidden.
Use | insted of union and bring back from future import annotations
@Pijukatel Pijukatel marked this pull request as ready for review December 5, 2024 10:53
@Pijukatel Pijukatel requested a review from janbuchar December 5, 2024 11:24
Versions 2.6.0-2.8.0 are failing our unit tests
@Pijukatel Pijukatel merged commit c0050c0 into master Dec 5, 2024
23 checks passed
@Pijukatel Pijukatel deleted the relax-pydantic-constraint branch December 5, 2024 13:05
Mantisus pushed a commit to Mantisus/crawlee-python that referenced this pull request Dec 10, 2024
…r, serializer type hints (apify#757)

Pydantic version 2.10.0 made some changes that broke our code. This was fixed in 2.10.3

Also our type hints were too generic in models.py. This change makes them more specific.
Older versions of Pydantic 2.6.0-2.8.0 caused errors in our codebase and so minimal version of Pydantic was bumped to 2.8.1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
debt Code quality improvement or decrease of technical debt. t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants