replace benchmark report with usage limits doc
Showing 3 changed files with 314 additions and 163 deletions.
@@ -0,0 +1,312 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0644081c-161f-43fa-8a61-7dc0efb26d08",
"metadata": {},
"source": [
"# Time series API limits\n",
"\n",
"The `titiler-cmr` API can be deployed as a Lambda function in AWS. Since requests to the time series endpoints will make recursive requests to the Lambda function for the lower-level time step operations, there are some limits in place to avoid making large requests that are likely to overwhelm the API.\n",
"\n",
"## Highlights\n",
"- Maximum of 995 discrete points or intervals in a time series request (due to Lambda concurrency limits)\n",
"- You can use the length of the time series, the AOI size, and the resolution of the dataset to calculate the number of total pixels (`x_pixels * y_pixels * n_time`) which is helpful for determining if a request will succeed\n", | ||
"- The `/timeseries/bbox` endpoint for generating GIFs for a bounding box will struggle on requests for a large AOI and/or a lengthy time series for high spatial resolution datasets. Based on a coarse evaluation of the API, it is estimated that requests that read **less than 100,000,000 total pixels** from the raw data will tend to succeed. There is a limit in place that will cause requests that exceed this limit to fail fast without firing hundreds of doomed Lambda invocations.\n", | ||
"- The `/timeseries/statistics` endpoint can handle larger requests than the `/timeseries/bbox` endpoint Based on a coarse evaluation of the API, requests that read **less than 15,000,000,000 total pixels** from the raw data will tend to succeed, however requests are limited to reading fewer than 56,000,000 pixels for any individual time step.\n", | ||
"\n", | ||
"## Background\n", | ||
"The time series API provides rapid access to time series analysis and visualization of collections in the CMR catalog, but there are some limitations to the API deployment that require some care when making large requests.\n", | ||
"\n", | ||
"There are several factors that must be considered in order to make a successful time series request:\n", | ||
"- Spatial resolution of the dataset (especially for the xarray backend)\n", | ||
"- Request AOI size\n", | ||
"- Number of points/intervals in the time series\n", | ||
"\n", | ||
"These factors all influence the runtime and memory footprint of the initial `/timeseries` request and requests that are too large in any of those dimensions can result in an API failure. Here are a few guidelines to help you craft successful time series requests." | ||
]
},
{
"cell_type": "markdown",
"id": "7e436954-d115-4c6a-8aee-40e276532aa0",
"metadata": {},
"source": [
"## Details\n",
"\n",
"### Number of points/intervals in the time series\n",
"\n",
"The top factor that determines if a request will succeed or fail is the number of points in the time series. In the default deployment, there is a hard cap of 995 time points in any time series request. This cap is in place because there is a concurrency limit of 1000 on the Lambda function that executes the API requests." | ||
]
},
{
"cell_type": "markdown",
"id": "4da8f9c2-fe34-4d16-a72a-04b50decd2c5",
"metadata": {},
"source": [
"### Spatial resolution and AOI size\n",
"\n",
"For datasets that use the `rasterio` backend, there will be very few limitations on maximum array size as long as the data are COGs and you specify a reasonable output image size (or use the `max_size` parameter) in your request.\n",
"\n",
"For datasets without overviews/pyramids, `titiler-cmr` will need to read all of the bytes that overlap the request AOI even if the resulting image is going to be downsampled for a GIF. Therefore, if the area of interest for a `/timeseries/statistics` or `/timeseries/bbox` request will create a large array that is likely to exceed the capacity of the Lambda function, the request will fail fast.\n",
"\n",
"The limits for the `xarray` backend are:\n",
"- `/timeseries/bbox`\n",
"  - individual image size: `5.6e7` pixels (~7500x7500)\n",
"  - total image size (`x_pixels * y_pixels * n_time`): `1e8` pixels\n",
"- `/timeseries/statistics`\n",
"  - individual image size: `5.6e7` pixels (~7500x7500)\n",
"  - total image size: `1.5e10` pixels\n",
"\n",
"For low-resolution datasets (e.g. 28km or 0.25 degree) you will not run into any issues (unless you request too many time points!) because a request for the full dataset will be reading arrays that are ~1440x720 pixels.\n",
"\n",
"For higher-resolution datasets (e.g. 1km or 0.01 degree), you will start to run into problems as the size of the raw arrays that titiler-cmr is processing increases (and the number of discrete points or intervals increases). " | ||
]
},
{
"cell_type": "markdown",
"id": "b1ed1102-5c43-45d5-99c2-1f7591f8225f",
"metadata": {},
"source": [
"### Examples\n",
"\n",
"The MUR-SST dataset is good for demonstrating the limits of the time series endpoints with the `xarray` backend. It has high resolution (1 km, 0.01 degree) daily global sea surface temperature observations! With this resolution it is easy to craft a request that will break the `/timeseries` endpoints. Here are some examples of how to manipulate the time series parameters to achieve success with the `/timeseries/bbox` endpoint.\n", | ||
"\n", | ||
"```python\n", | ||
"from datetime import datetime, timedelta\n", | ||
"\n", | ||
"import httpx\n", | ||
"```\n", | ||
"\n", | ||
"Here is a request that will succeed (if the lambda is warmed up):\n", | ||
"- 5x5 degree bounding box (500 x 500 pixels)\n", | ||
"- 180 daily observations (`180 / P1D`)\n", | ||
"- total size: `500 * 500 * 180 = 4.5e7`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-5, -5, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=180)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P1D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"That request is about half of the maximum request size for the `/timeseries/bbox` endpoint. We can push it to the limit by doubling the length of the time series:\n", | ||
"- 5x5 degree bounding box (500 x 500 pixels)\n", | ||
"- 360 daily observations (`360 / P1D`)\n", | ||
"- total size: `500 * 500 * 360 = 9.0e7`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-5, -5, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=360)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P1D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"If we increase the length of the time series such that the request exceeds the maximum size, the API will return an error:\n", | ||
"- 5x5 degree bounding box (500 x 500 pixels)\n", | ||
"- 540 daily observations (`540 / P1D`)\n", | ||
"- total size: `500 * 500 * 540 = 1.35e8` (greater than maximum of `1.0e8`!)\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-5, -5, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=540)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P1D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"We can get get a successful response for the larger time window if we reduce the temporal resolution:\n", | ||
"- 5x5 degree bounding box (500 x 500 pixels)\n", | ||
"- 77 weekly observations (`540 / P7D`)\n", | ||
"- total size: `500 * 500 * 77 = 1.925e7`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-5, -5, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=540)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P7D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"With the weekly temporal resolution we have some room to increase the size of the bounding box!\n", | ||
"- 10x10 degree bounding box (1000 x 1000 pixels)\n", | ||
"- 77 weekly observations (`540 / P7D`)\n", | ||
"- total size: `1000 * 1000 * 77 = 7.7e7`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-10, -10, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=540)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P7D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"If we double the AOI size again, we will break exceed the request size limit:\n", | ||
"- 20x20 degree bounding box (1000 x 1000 pixels)\n", | ||
"- 77 weekly observations (`540 / P7D`)\n", | ||
"- total size: `2000 * 2000 * 77 = 3.08e8` (greater than maximum of `1e8`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-20, -20, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=540)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P7D\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"But if we reduce the temporal resolution from weekly to monthly, it will work!\n", | ||
"- 20x20 degree bounding box (1000 x 1000 pixels)\n", | ||
"- 18 monthly observations (`540 / P1M`)\n", | ||
"- total size: `2000 * 2000 * 18 = 3.08e8`\n", | ||
"\n", | ||
"```python\n", | ||
"bounds = (-20, -20, 0, 0)\n", | ||
"bbox_str = \",\".join(str(x) for x in bounds)\n", | ||
"\n", | ||
"start_datetime = datetime(year=2011, month=1, day=1, hour=0, minute=0, second=1)\n", | ||
"end_datetime = start_datetime + timedelta(days=540)\n", | ||
"\n", | ||
"response = httpx.get(\n", | ||
" f\"https://dev-titiler-cmr.delta-backend.com/timeseries/bbox/{bbox_str}.gif\",\n", | ||
" params={\n", | ||
" \"concept_id\": \"C1996881146-POCLOUD\",\n", | ||
" \"datetime\": \"/\".join(dt.isoformat() for dt in [start_datetime, end_datetime]),\n", | ||
" \"step\": \"P1M\",\n", | ||
" \"variable\": \"analysed_sst\",\n", | ||
" \"backend\": \"xarray\",\n", | ||
" \"rescale\": \"273,315\",\n", | ||
" \"colormap_name\": \"viridis\",\n", | ||
" \"temporal_mode\": \"point\",\n", | ||
" },\n", | ||
" timeout=None,\n", | ||
")\n", | ||
"```\n", | ||
"\n", | ||
"However, there is a maximum image size that we can read with the `xarray` backend, so we cannot increase the bounding box indefinitely. The limit imposed on the API at this time is `5.6e7` pixels (7500 x 7500 pixels). In the case of MUR-SST, that is a bounding box of roughly 75 x 75 degrees." | ||
]
},
{
"cell_type": "markdown",
"id": "4033af4c-6c85-45d5-9e5e-f2a15af471ab",
"metadata": {},
"source": [
"## Tips\n",
"\n",
"- If you hit an error because the total size of the request is too large, try reducing the temporal resolution of the time series, e.g. from daily (`P1D`) to weekly (`P7D`) or greater (`P10D`)\n", | ||
"- If you need higher temporal resolution but the full request is not able handle it, split the request into multiple smaller requests and merge the results yourself!" | ||
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}