Aggregation (#276)
**Related Issue(s):**

- #245 

**Description:**
Adds support for the [Aggregation
Extension](https://github.com/stac-api-extensions/aggregation), with an
added dependency on the Filter Extension. This enables geo-aggregation
of geometries and points, taking advantage of the OpenSearch and
Elasticsearch aggregation engines.

Note that some of the geo-aggregation features will have to be left
untested on the Elasticsearch backend implementation because they
require a commercial license.


TO DO:
- ~~Need to support collection aggregations using the `<collection
ID>/aggregate` route. Add support in the aggregate function in the core
aggregations extension.~~

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog
jamesfisher-geo authored Jul 14, 2024
1 parent 5d52698 commit ee365ab
Showing 16 changed files with 65,241 additions and 679 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [Unreleased]

### Added
- Added an implementation of the Aggregation Extension. Enables spatial, frequency distribution, and datetime distribution aggregations. [#276](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/276)
- Added support for route dependencies configuration through the STAC_FASTAPI_ROUTE_DEPENDENCIES environment variable, directly or via json file. Allows for fastapi's inbuilt OAuth2 flows to be used as dependencies. Custom dependencies can also be written, see Basic Auth for an example. [#251](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/251)
- Added docker-compose.route_dependencies_file.yml that gives an example of OAuth2 workflow using keycloak as the identity provider. [#251](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/251)
- Added docker-compose.route_dependencies_env.yml that gives an example using the STAC_FASTAPI_ROUTE_DEPENDENCIES environment variable. [#251](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/251)
106 changes: 106 additions & 0 deletions README.md
@@ -278,3 +278,109 @@ The modified Items with lowercase identifiers will now be visible to users acces
## Auth

Authentication is an optional feature that can be enabled through `Route Dependencies`. Examples and a more detailed explanation can be found in [examples/auth](examples/auth).


## Aggregation

Sfeos supports the STAC API [Aggregation Extension](https://github.com/stac-api-extensions/aggregation). This enables geospatial aggregation of points and geometries, as well as frequency distribution aggregation of any other property, including dates. Aggregations can be defined at the root Catalog level (`/aggregations`) and at the Collection level (`/<collection_id>/aggregations`). The `/aggregate` route also fully supports base search and the STAC API [Filter Extension](https://github.com/stac-api-extensions/filter). Any query made with `/search` may also be executed with `/aggregate`, provided that the relevant aggregation fields are available.
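
As a rough illustration of the request shape, the sketch below builds a GET URL for the `/aggregate` route. It assumes the comma-separated `aggregations` query parameter described by the Aggregation Extension and a hypothetical local deployment at `http://localhost:8080`. The helper `build_aggregate_url` and the collection ID are illustrative and are not part of sfeos.

```python
from typing import List, Optional
from urllib.parse import urlencode


def build_aggregate_url(
    base_url: str,
    aggregations: List[str],
    collections: Optional[List[str]] = None,
) -> str:
    """Build a GET URL for /aggregate (hypothetical helper, not an sfeos API)."""
    # The requested aggregation names are sent as one comma-separated value.
    params = {"aggregations": ",".join(aggregations)}
    if collections:
        # Base search parameters such as `collections` can be combined with it.
        params["collections"] = ",".join(collections)
    # safe="," keeps the commas readable instead of percent-encoding them.
    return f"{base_url.rstrip('/')}/aggregate?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Assumed deployment URL and collection ID, for illustration only.
    print(build_aggregate_url(
        "http://localhost:8080",
        ["total_count", "datetime_frequency"],
        collections=["my-collection"],
    ))
```

The resulting URL can be passed to any HTTP client. Aggregations that the target collection does not advertise may be rejected by the server.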


To advertise the aggregations available for a collection, add a field named `aggregations` to that collection's Collection object, for example:

```json
"aggregations": [
  {
    "name": "total_count",
    "data_type": "integer"
  },
  {
    "name": "datetime_max",
    "data_type": "datetime"
  },
  {
    "name": "datetime_min",
    "data_type": "datetime"
  },
  {
    "name": "datetime_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "datetime"
  },
  {
    "name": "sun_elevation_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "numeric"
  },
  {
    "name": "platform_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "sun_azimuth_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "numeric"
  },
  {
    "name": "off_nadir_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "numeric"
  },
  {
    "name": "cloud_cover_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "numeric"
  },
  {
    "name": "grid_code_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "centroid_geohash_grid_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "centroid_geohex_grid_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "centroid_geotile_grid_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "geometry_geohash_grid_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  },
  {
    "name": "geometry_geotile_grid_frequency",
    "data_type": "frequency_distribution",
    "frequency_distribution_data_type": "string"
  }
]
```
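
Before publishing a Collection, it may help to sanity-check the `aggregations` field. The sketch below follows the pattern visible in the example above: every entry carries `name` and `data_type`, and `frequency_distribution` entries also carry `frequency_distribution_data_type`. These rules are inferred from the example rather than taken from the server code, and `check_aggregation_definitions` is a hypothetical helper.

```python
from typing import Any, Dict, List


def check_aggregation_definitions(collection: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in collection["aggregations"].

    Hypothetical helper: the rules mirror the README example, not sfeos code.
    """
    problems: List[str] = []
    for index, agg in enumerate(collection.get("aggregations", [])):
        # Every definition in the example has a name and a data type.
        for key in ("name", "data_type"):
            if key not in agg:
                problems.append(f"entry {index}: missing '{key}'")
        # Frequency distributions additionally declare the type of their buckets.
        if (
            agg.get("data_type") == "frequency_distribution"
            and "frequency_distribution_data_type" not in agg
        ):
            problems.append(
                f"entry {index}: missing 'frequency_distribution_data_type'"
            )
    return problems


if __name__ == "__main__":
    collection = {
        "id": "my-collection",  # illustrative collection ID
        "aggregations": [
            {"name": "total_count", "data_type": "integer"},
            {"name": "platform_frequency", "data_type": "frequency_distribution"},
        ],
    }
    for problem in check_aggregation_definitions(collection):
        print(problem)
```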

Available aggregations are:

- total_count (count of total items)
- collection_frequency (Item `collection` field)
- platform_frequency (Item.Properties.platform)
- cloud_cover_frequency (Item.Properties.eo:cloud_cover)
- datetime_frequency (Item.Properties.datetime, monthly interval)
- datetime_min (earliest Item.Properties.datetime)
- datetime_max (latest Item.Properties.datetime)
- sun_elevation_frequency (Item.Properties.view:sun_elevation)
- sun_azimuth_frequency (Item.Properties.view:sun_azimuth)
- off_nadir_frequency (Item.Properties.view:off_nadir)
- grid_code_frequency (Item.Properties.grid:code)
- centroid_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.Properties.proj:centroid)
- centroid_geohex_grid_frequency ([geohex grid](https://opensearch.org/docs/latest/aggregations/bucket/geohex-grid/) on Item.Properties.proj:centroid)
- centroid_geotile_grid_frequency ([geotile grid](https://opensearch.org/docs/latest/aggregations/bucket/geotile-grid/) on Item.Properties.proj:centroid)
- geometry_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.geometry)
- geometry_geotile_grid_frequency ([geotile grid](https://opensearch.org/docs/latest/aggregations/bucket/geotile-grid/) on Item.geometry)

Support for additional fields and new aggregations can be added in the associated `database_logic.py` file.
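
To give a feel for what the aggregation names above could translate to on the backend, here is a minimal sketch that maps a few of them to OpenSearch/Elasticsearch aggregation clauses (`terms`, `date_histogram`, `min`, `max`, `geohash_grid`). The field paths, bucket size, and precision are assumptions made for illustration, and `AGGREGATION_SKETCHES` and `build_aggregation_body` are hypothetical names. The actual mapping in `database_logic.py` may differ.

```python
from typing import Any, Dict, List

# Illustrative mapping from aggregation name to a query DSL clause.
# Field paths, size, and precision are assumptions, not the sfeos defaults.
AGGREGATION_SKETCHES: Dict[str, Dict[str, Any]] = {
    "platform_frequency": {
        "terms": {"field": "properties.platform", "size": 10}
    },
    "datetime_frequency": {
        # Monthly buckets, matching the interval named in the list above.
        "date_histogram": {
            "field": "properties.datetime",
            "calendar_interval": "month",
        }
    },
    "datetime_min": {"min": {"field": "properties.datetime"}},
    "datetime_max": {"max": {"field": "properties.datetime"}},
    "centroid_geohash_grid_frequency": {
        "geohash_grid": {"field": "properties.proj:centroid", "precision": 1}
    },
}


def build_aggregation_body(names: List[str]) -> Dict[str, Any]:
    """Assemble a search body that returns only the requested aggregations."""
    unknown = [name for name in names if name not in AGGREGATION_SKETCHES]
    if unknown:
        raise ValueError(f"unsupported aggregations: {unknown}")
    # "size": 0 suppresses the item hits so that only buckets come back.
    return {"size": 0, "aggs": {name: AGGREGATION_SKETCHES[name] for name in names}}
```

Adding a new aggregation in this sketch amounts to adding one more entry to the mapping, which is roughly the kind of change the sentence above has in mind.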