Problem
Currently, veda-backend allows users to register searches for items in a STAC catalog. A search can target collections either through the collection field or through a filter argument that filters on collection (the dashboard uses the filter form). Each search has a usecount, but it is difficult to aggregate search usage by the collections or assets being accessed.
Proposed Solutions
To improve observability and tracking, we can make the following enhancements:
Associating requests with targeted collections: Inject a metrics dependency into tile requests to associate requests with the targeted asset or collections. By exporting usage metrics to CloudWatch, we would be able to use a dashboard to track requests against different collections or assets. If we choose to collect these metrics at the asset level, we could also use the data to prioritize caching of frequently accessed files.
Tracking searches using collection and filter: For searches using the collection field, we can already generate simple statistics based on the number of pgstac queries made against each search (using the usecount column associated with each search). However, tracking searches that use the filter argument instead is challenging. Ideally, we would be able to aggregate comparable searches. In order to do this, I think it could be helpful to refactor the register endpoint to convert filter statements to equivalent collection field values. A side-effect of this change would be the consolidation of searches, which could increase the effectiveness of caching mechanisms.
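A minimal sketch of the filter-to-collection conversion described above, assuming the registered search carries a CQL2-JSON filter of the form `{"op": "=", "args": [{"property": "collection"}, "<id>"]}` (or the `in` equivalent) and that the target field is `collections`, as in the SQL further down. The function names are hypothetical and this is not the existing register code; anything more complex than a collection-only filter is left untouched.

```python
from copy import deepcopy


def collections_from_filter(cql):
    """Return collection ids for a collection-only filter, else None.

    Assumes CQL2-JSON: {"op": "=" | "in", "args": [{"property": "collection"}, value]}.
    """
    if not isinstance(cql, dict):
        return None
    args = cql.get("args")
    if not isinstance(args, list) or len(args) != 2:
        return None
    prop, value = args
    if not (isinstance(prop, dict) and prop.get("property") == "collection"):
        return None
    op = cql.get("op")
    if op == "=" and isinstance(value, str):
        return [value]
    if op == "in" and isinstance(value, list) and all(isinstance(v, str) for v in value):
        return list(value)
    return None


def normalize_search(search):
    """Rewrite a collection-only filter into the `collections` field.

    Searches that already set `collections`, or whose filter does more than
    select collections, are returned unchanged.
    """
    ids = collections_from_filter(search.get("filter"))
    if ids is None or search.get("collections"):
        return search
    normalized = {
        key: deepcopy(value)
        for key, value in search.items()
        if key not in ("filter", "filter-lang")
    }
    # Sort and de-duplicate so equivalent searches produce identical bodies.
    normalized["collections"] = sorted(set(ids))
    return normalized
```

Sorting the ids is a design choice made here so that two searches naming the same collections in a different order would register as the same search; whether that is desirable for pgstac's search hashing is something to confirm.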
Acceptance Criteria
Select one of the following solutions:
Either:
Extend one of the existing dependencies in titiler to export metrics to CloudWatch, associated with collection-ids or assets
Create a SQL query that can aggregate the usecount of searches over the same collection (failed attempt included below)
The register endpoint is restructured to convert searches filtering on collection to use the collection field.
Finally:
Ensure that the chosen solution doesn't result in performance issues (particularly the /register endpoint changes, where that endpoint is called every time a layer is rendered in the dashboard)
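For the metrics option, one low-overhead way to get per-collection counts into CloudWatch is to print a structured log line in the CloudWatch embedded metric format (EMF) and let CloudWatch extract the metric from the logs. The sketch below only builds and prints such a record. The namespace, metric name, dimension name, and function names are assumptions for illustration, and wiring this into a titiler dependency is not shown.

```python
import json
import time


def collection_metric_record(collection_id, namespace="veda-backend", now=None):
    """Build an EMF record counting one tile request against a collection.

    `namespace`, the "TileRequests" metric name, and the "collection"
    dimension are hypothetical names chosen for this sketch.
    """
    timestamp_ms = int((time.time() if now is None else now) * 1000)
    return {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["collection"]],
                    "Metrics": [{"Name": "TileRequests", "Unit": "Count"}],
                }
            ],
        },
        # Dimension value and metric value live at the top level of the record.
        "collection": collection_id,
        "TileRequests": 1,
    }


def emit(record):
    """Write the record as a single JSON log line on stdout."""
    print(json.dumps(record))
```

Because this is a single `print` per request, it should add little latency, but that still needs to be measured against the performance criterion above. It also assumes the service's stdout ends up in CloudWatch Logs, as it does for Lambda.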
Additional Information
Simple statistics for searches using collection:
SELECT usecount, search->'collections', jsonb_array_length(search->'collections')
FROM pgstac.searches
WHERE jsonb_array_length(search->'collections') > 0
ORDER BY jsonb_array_length(search->'collections') DESC
Attempt at including searches using filter:
SELECT *
FROM (
    SELECT usecount,
           search,
           search->'collections' AS collections,
           jsonb_array_length(search->'collections') AS collections_length
    FROM pgstac.searches
    WHERE jsonb_path_exists(search, '$.collections')
    UNION ALL
    SELECT usecount,
           search,
           jsonb_agg(jsonb_extract_path(search, 'filter', 'args', 'args')) FILTER (WHERE jsonb_typeof(jsonb_extract_path(search, 'filter', 'args', 'args')) = 'array') AS collections,
           jsonb_array_length(jsonb_extract_path(search, 'filter', 'args', 'args')) AS collections_length
    FROM pgstac.searches
    WHERE NOT jsonb_path_exists(search, '$.collections')
    GROUP BY search, usecount
) AS subquery
GROUP BY search, collections, collections_length, usecount
ORDER BY collections_length
LIMIT 1000;
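What the query above is trying to compute can also be sketched outside SQL, which may help pin down the expected result before fixing the query. The example below assumes the rows have been fetched as `(usecount, search)` pairs with `search` already decoded from JSON, and that filter-based searches use the CQL2-JSON shape `{"op": "=" | "in", "args": [{"property": "collection"}, value]}`. Both function names are hypothetical.

```python
from collections import Counter


def targeted_collections(search):
    """Return the collection ids a search targets, from either form."""
    ids = search.get("collections")
    if ids:
        return list(ids)
    cql = search.get("filter")
    if not isinstance(cql, dict) or cql.get("op") not in ("=", "in"):
        return []
    args = cql.get("args")
    if not isinstance(args, list) or len(args) != 2:
        return []
    prop, value = args
    if not (isinstance(prop, dict) and prop.get("property") == "collection"):
        return []
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        return [v for v in value if isinstance(v, str)]
    return []


def usecount_by_collection(rows):
    """Sum usecount per collection over (usecount, search) rows.

    A search that targets several collections contributes its full
    usecount to each of them.
    """
    totals = Counter()
    for usecount, search in rows:
        for collection_id in targeted_collections(search):
            totals[collection_id] += usecount
    return totals
```

Searches whose filter is more complex than a single collection predicate are ignored here rather than guessed at, so the totals are a lower bound.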
Related: https://github.com/NASA-IMPACT/veda-architecture/issues/267