Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable statistics endpoint for mosaics #125

Closed
wants to merge 6 commits into from

Conversation

ividito
Copy link
Collaborator

@ividito ividito commented Oct 19, 2022

What

  • Mosaics now support the /statistics endpoint - resolves Enable mosaic based stats requests #70.
  • Docker-compose now supports hot-reload for changes within their respective /runtime/src/ folders
  • Local Dockerfiles use the same Python version as our hosted lambdas
  • pygeoif constrained - the new version released in late September breaks other dependencies and prevents the STAC api from running.

Why

  • Local development was a bit cumbersome, with at least one error on start and a chunk of time needed to see changes after making them.
  • This makes it possible to generate stats for an AOI in a single step (as opposed to retrieving a list of COGs within the AOI and then requesting the stats for each).

How tested

Manually tested locally on a handful of mosaics defined in the workflows folder.

  • TODO Add a reasonable automated test for statistics endpoint and test on hosted dev stack

@ividito
Copy link
Collaborator Author

ividito commented Oct 24, 2022

Currently blocked - requests to the new endpoint are taking several minutes to resolve, which exceeds the AWS timeout limit.

Related profiling/XRay writeup - #126

@ividito ividito force-pushed the feature/statistics-endpoint-on-mosaics branch from 142b5d0 to 01c52e5 Compare November 24, 2022 17:04
@ividito
Copy link
Collaborator Author

ividito commented Nov 25, 2022

image
I profiled the endpoint call from the test in this PR - there's a massive performance loss from the geojson_statistics call in titiler-pgstac. I'll dig a bit more into it, but I'm wondering whether it makes more sense to have our endpoint return a list of statistics from the individual COGs in an AOI, rather than try to recalculate them, which seems to be the issue currently.

@ividito ividito marked this pull request as draft November 28, 2022 18:20
@anayeaye
Copy link
Collaborator

anayeaye commented Dec 7, 2022

Dropping a note here from a recent discussion:
The /stac/statistics endpoint may not be performant as is for requests that do not have enough filtering to limit the number of results.

For example, we have observed in the past that the mosaic/tiles endpoints are slower for a search of all items in a collection without a datetime filter. When a search is registered with both a collection and datetime range filter the resulting tiles render much faster.

@vincentsarago
Copy link
Contributor

👋 The /mosaic/statistics endpoint is really risky to use. By restricting the size of the AOI you may be able to have relatively good performance but it all depends on the assets size (if assets are really small, even a small AOI will try to open too many assets). We could also add a check in https://github.com/stac-utils/titiler-pgstac/blob/master/titiler/pgstac/mosaic.py#L344-L352 to make sure we are not trying to open more than 100 assets for example 🤷

@vincentsarago
Copy link
Contributor

@anayeaye

slower for a search of all items in a collection without a datetime filter

The more filters you add the lest assets you'll pass to rio-tiler to construct the tiles so it should be faster.

@anayeaye
Copy link
Collaborator

anayeaye commented Dec 7, 2022

Realizing I didn't present my comment well--I was going for I think this is expected, not a bug, and we may see that we are good to go here when we profile a filtered request.

@ividito
Copy link
Collaborator Author

ividito commented Dec 13, 2022

Got around to trying a datetime filter, as well as further pared-down test data. Still ran into significant performance issues. After discussing with the team, I'm closing this PR and issue for now, as there isn't an immediate need for this functionality.

@ividito ividito closed this Dec 13, 2022
@anayeaye anayeaye deleted the feature/statistics-endpoint-on-mosaics branch March 12, 2024 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable mosaic based stats requests
3 participants