-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update to pydantic 2 #625
update to pydantic 2 #625
Conversation
@thomas-maschler , moving the STAC types away from the TypeDict implementations in stac_fastapi.types to Pydantic types could have performance implications. Specifically I'm worried about the STAC types encoding Client output. These were originally moved away from Pydantic models as they tended to be slow for large payloads, e.g. returning an ItemCollection from search that contains 1000 large Sentinel 2 Items. Running Pydantic validation against known valid Items that come out of PgSTAC as part of response causes additional latency without a lot of value. I know the Pydantic 2 backend changes means validation will be faster, but are we introducing the potential for increased latency in the API routes because of the move away from TypedDicts in e.g. BaseCoreClient in this PR? If so, could the upgrade be done for but leave the TypeDict usage as is? Or have you seen that these changes don't introduce additional latency due to the magic of Rust 🪄? |
I can see your concerns regarding latency. I did not benchmark this myself. Others reported a speed improvement of x5 between pydantic 1 and 2. I was assuming that this would be enough to address latency concerns. The benefits I saw in keeping the pydandic models are peace of mind, due to basic checks and setting of default values you wouldn't have to store in the DB. I have a use case, where data are stored in a legacy system and I have to translate them to STAC at runtime. Being able to push part of this into Pydantic makes the job easier. If latency is still the main concern I could see several routes
from pydantic import SkipValidation
from stac_pydantic import api
from typing import Sequence
class ItemCollectionUnsafe(api.ItemCollection):
features: Sequence[SkipValidation[api.Item]]
class CollectionsUnsafe(api.Collections):
collections: Sequence[SkipValidation[api.Collection]] With minimal code changes, users can use the
|
Couple of notes from my pull request:
|
@rhysrevans3, regarding your first point, can you give me an example of when this fails? I cannot replicate it. I tried this, where the filter attribute is defined as @pytest.mark.parametrize(
"filter,passes",
[(None, True), ({"test": "test"}, True), ("test==test", False), ([], False)],
)
def test_create_post_request_model(filter, passes):
extensions = [FilterExtension()]
request_model = create_post_request_model(extensions, BaseSearchPostRequest)
if not passes:
with pytest.raises(ValidationError):
model = request_model(filter=filter)
else:
model = request_model(
collections=["test1", "test2"],
ids=["test1", "test2"],
bbox=[0, 0, 1, 1],
datetime="2020-01-01T00:00:00Z",
limit=10,
filter=filter,
**{"filter-crs": "epsg:4326", "filter-lang": "cql2-text"},
)
assert model.collections == ["test1", "test2"]
assert model.filter_crs == "epsg:4326"
assert model.filter == filter |
@thomas-maschler understood about the value of validation in the response classes for other use cases. I think a solution where users could skip validation based on configuration would be great. In my previous testing (a year or so ago) it was the Pydantic validation that was causing slowdowns. I'd still want to validate that there's no other performance hits besides validation for using Pydantic models over TypeDicts before integrating these changes, but I have enough confidence that this would address the performance implications, and any unforeseen issues could be addressed in follow up work. Would you be willing to add that SkipValidation configuration to this PR? |
about performances, in TiPg we also went for the |
@thomas-maschler I found it was an issue for nested List/Dicts types: for example the
But when I check the
|
@rhysrevans3 this is b/c we didn't set a default value. Right now the model requires either Regarding your point 2. Regarding your point 3. I am still working on the switch between |
@thomas-maschler ah that makes more sense thank you for the explanation. I guess this just need to be filled in the backend I'm using. Point 2 that sounds like a good idea. But don't the links in the api.landing_page used here need to be stac-pydantic Point 3 sorry yes I meant |
@lossyrob can you take a look at my recent changes, in particular 8118f10.
I stumbled over some other bugs regarding field aliases in the filter extension that I didn't manage to fix. I will file an issue for that. |
@vincentsarago I thought we had results here that seemed to make sense? |
I believe we now test for correct validation. |
@thomas-maschler Yes
@jonhealy1 absolutely, so this mean that validation does not impact the performance 🤯. I would still keep this optional but it might be great to mention this somewhere |
Ok, I'll update the readme |
thanks @thomas-maschler I was going update this branch tomorrow morning 🙏 let see if we can get this merged before the end of the week 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. The readme looks great - just a couple of spelling things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 thanks @thomas-maschler
Notes:
- we should wait for the next stac-pydantic release with Adjust ItemProperties Validation. stac-pydantic#131
- I've created a
maint-2.x
branch (https://github.com/stac-utils/stac-fastapi/tree/maint-2.x) if we need to continue updatingold
stac-fastapi
@vincentsarago when you say wait do you mean wait to merge? |
@jonhealy1 sorry I meant, wait for the 3.0 release :-) I'll merge this now 🥳 |
Related Issue(s):
Description:
This PR updates dependencies to Pydantic 2 and Stac-Pydantic 3
It removes the internal Stac, Search and Opertator Types in favor of Types from Stac Pydantic
PR Checklist:
pre-commit
hooks pass locallymake test
)make docs
)