
Deploy docs and add performance benchmark #42

Open
wants to merge 18 commits into develop from feat/test-performance
Conversation

hrodmn
Contributor

@hrodmn hrodmn commented Dec 3, 2024

This PR a) deploys the documentation to GitHub Pages and b) adds an API benchmarking process (with a report in the docs) for the time series endpoints. After talking about it some more with the team, I think we can be more surgical in the benchmark approach (i.e. predict limits based on Lambda configuration parameters), but the exercise was still helpful for understanding when things fall apart and which knobs you can turn to get time series requests to return a response before the Lambda breaks down. There are definitely some modifications or alternative approaches worth considering to improve the capabilities of this application!

A preview of the docs for this PR is available here: https://developmentseed.org/titiler-cmr/pr-previews/pr-42/

I'm not sure if I'll keep the PR preview feature alive since it is so easy to deploy the docs locally (`uv sync && uv run mkdocs serve -o`), but I'll leave it up for now so others can take a look at the rendered site.
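For reference, here is a minimal sketch of the kind of sweep the benchmarking process runs: issue progressively longer time series requests and record status and latency. The base URL, `concept_id`, endpoint path, and parameters below are illustrative placeholders, not necessarily titiler-cmr's exact API.

```python
# Minimal benchmark sketch: issue progressively longer time series requests and
# record status and latency. All URLs, ids, and parameter names are placeholders.
import time

import requests

BASE_URL = "https://example.com/titiler-cmr"  # hypothetical deployment URL
AOI = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-95, 40], [-90, 40], [-90, 45], [-95, 45], [-95, 40]]],
    },
}
CASES = [
    # (label, ISO 8601 interval, step between time points)
    ("~30 points", "2024-10-01T00:00:00Z/2024-10-31T00:00:00Z", "P1D"),
    ("~120 points", "2024-07-01T00:00:00Z/2024-10-31T00:00:00Z", "P1D"),
    ("~500 points", "2023-06-15T00:00:00Z/2024-10-31T00:00:00Z", "P1D"),
]


def run_benchmark() -> list[dict]:
    results = []
    for label, interval, step in CASES:
        params = {
            "concept_id": "C0000000000-PROVIDER",  # placeholder CMR collection id
            "datetime": interval,
            "step": step,
        }
        start = time.monotonic()
        try:
            resp = requests.post(
                f"{BASE_URL}/timeseries/statistics", params=params, json=AOI, timeout=120
            )
            status = resp.status_code
        except requests.RequestException as exc:
            status = type(exc).__name__
        results.append(
            {"case": label, "status": status, "seconds": round(time.monotonic() - start, 1)}
        )
    return results


if __name__ == "__main__":
    for row in run_benchmark():
        print(row)
```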


@hrodmn hrodmn changed the title from Deploy docs and add performance benchmark to WIP: Deploy docs and add performance benchmark on Dec 3, 2024

github-actions bot commented Dec 4, 2024

📚 Documentation preview will be available at: https://developmentseed.github.io/titiler-cmr/pr-previews/pr-42/

Status: ✅ Preview is ready!

@abarciauskas-bgse
Contributor

Thanks @hrodmn! I'm still trying to parse everything that is here, but most importantly I want to understand how to interpret the time series benchmarks:

  1. It seems like for statistics endpoints (e.g. creating a time series as a JSON response) the AOI doesn't matter since the whole granule has to be read.
    a. I'm wondering if this is always the case, e.g. if the granule is chunked in space?
    b. The limit of 1000 points is for this specific dataset's resolution (~0.25 degrees), right?
  2. For the bbox gif endpoint, it sounds like the size of the AOI does matter, but is it more generally the size of the returned image, e.g. 512x512 vs 1024x1024?
    a. Similar question for the limit of 500: is this for the specific dataset used?

@hrodmn
Contributor Author

hrodmn commented Dec 16, 2024

After talking about this with the team a few weeks ago, I think it does make sense to be more surgical in the benchmarking tests rather than throwing massive requests at the API when we know there will be an error. The Lambda configuration has a concurrency limit of 998, and since the endpoints start to get unstable beyond 500 time points (due to other resource constraints), I think it makes sense to set a maximum time series length of 500 while we consider how/if we want to make the system more efficient.
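As a rough illustration of where such a cap could live (hypothetical code, not what this PR implements), the expanded time series could be validated before any Lambda invocations fan out:

```python
# Hedged sketch: reject over-long time series requests up front. The constant and
# helper name are hypothetical, not titiler-cmr's actual settings or code.
from fastapi import HTTPException

MAX_TIMESERIES_POINTS = 500  # kept well below the Lambda concurrency limit of 998


def validate_timeseries_length(timestamps: list[str]) -> list[str]:
    """Raise a 422 if the expanded time series exceeds the configured cap."""
    if len(timestamps) > MAX_TIMESERIES_POINTS:
        raise HTTPException(
            status_code=422,
            detail=(
                f"Time series has {len(timestamps)} points; the maximum supported "
                f"length is {MAX_TIMESERIES_POINTS}."
            ),
        )
    return timestamps
```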

  1. It seems like for statistics endpoints (e.g. creating a time series as a json response) the AOI doesn't matter since the whole granule has to be read.
    a. I'm wondering if this is always the case, like if the granule is chunked in space?

I think you are correct that it depends on the chunking scheme of the dataset. All of this happens at the rio-tiler level; here is the moment where the full dataset gets cropped down to the bounding box in the `part` method:
https://github.com/cogeotiff/rio-tiler/blob/83e8fc4ed765445d1aa0267cc3bccee15664b9b3/rio_tiler/io/xarray.py#L301-L305

I will do some testing on this, but I hope that if there is some spatial chunking, the full dataset is not loaded into memory in order to do that crop operation.
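One way to sanity-check that hypothesis outside of rio-tiler: open a granule lazily with xarray and dask chunks, slice to the bounding box, and only then load. The file path, variable name, and chunk sizes below are hypothetical.

```python
# Hedged sketch: with a dask-backed dataset, only the chunks that intersect the
# bbox should be read when the cropped selection is loaded.
import xarray as xr

ds = xr.open_dataset("example-granule.nc", chunks={"lat": 512, "lon": 512})  # lazy open
minx, miny, maxx, maxy = -10.0, 40.0, 5.0, 55.0

subset = ds["analysed_sst"].sel(
    lon=slice(minx, maxx),
    lat=slice(miny, maxy),  # assumes ascending latitude; flip the slice if descending
)
# Nothing has been read from disk yet; .load() should pull only the intersecting chunks.
data = subset.load()
print(f"{data.nbytes / 1e6:.1f} MB loaded for the cropped window")
```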

b. The limit of 1000 points is for this specific dataset's resolution (~0.25 degrees), right?

I am adding some tests that use the MUR-SST dataset. I still need to drill into the details, but the limits are much lower for that dataset, so I will try to update the guidance for users based on the resolution of the input dataset.
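For intuition on why the limits drop for MUR-SST, here is a back-of-the-envelope pixel-count comparison for global grids at the two nominal resolutions (~0.25 degrees vs MUR-SST's ~0.01 degrees):

```python
# Rough arithmetic only: pixels per global time step scale with the inverse square
# of the grid resolution, so a finer dataset fits far fewer time points per request.
coarse = (360 / 0.25) * (180 / 0.25)  # ~0.25 degree grid -> ~1.0e6 pixels per time step
mur = (360 / 0.01) * (180 / 0.01)     # ~0.01 degree grid -> ~6.5e8 pixels per time step
print(mur / coarse)  # ~625x more data per time step
```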

  2. For the bbox gif endpoint, it sounds like the size of the AOI does matter but is it more generally the size of the returned image, e.g. 512x512 vs 1024x1024?
    a. Similar question for the limit of 500, this is for the specific dataset used?

Yes, the resolution of the output image has an impact on the reduce step in the map/reduce process. If we are trying to combine many high-resolution images in the original time series request, I think we are likely to hit memory and/or timeout limits. I need to go to the Lambda logs to figure out exactly what is going on, though.
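A back-of-the-envelope sketch of how the reduce step's memory footprint scales with the output image size (the frame count and bytes per pixel are illustrative assumptions, not measured values):

```python
# Rough estimate of the memory needed to hold all rendered frames before encoding a GIF.
def frames_memory_mb(width: int, height: int, frames: int, bytes_per_pixel: int = 8) -> float:
    """Approximate MB to hold `frames` single-band float64 arrays of width x height."""
    return width * height * frames * bytes_per_pixel / 1e6


print(frames_memory_mb(512, 512, 500))    # ~1049 MB
print(frames_memory_mb(1024, 1024, 500))  # ~4194 MB, beyond typical Lambda memory settings
```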
