Collection search #735

Merged
merged 25 commits into from
Oct 15, 2024

Conversation

hrodmn
Collaborator

@hrodmn hrodmn commented Sep 26, 2024

Related Issue(s):

Description:
Collection discovery is a challenge in the current environment. A user might know which catalog or API they want data from, but they do not have the actual collection_id that they will need to perform an item-level search. The STAC API Collection Search Extension makes it possible for a user to apply filters to collection-level metadata. This is most useful when the STAC API Free Text Extension is enabled, because a user can then search an API with a term like q=DEM to find all collections that have the term DEM in the title, description, or keywords.

Since most APIs do not currently have the collection search extension enabled, I added some client-side filtering logic so that the CollectionSearch class requests the full list of collections from the /collections endpoint and then applies a limited set of filters (datetime, bbox, q) to the list.

  • Refactor ItemSearch class to inherit from a new BaseSearch class so methods can be shared between ItemSearch and CollectionSearch classes
  • Add CollectionSearch class
  • Add Client.collection_search method
  • Add collection search functionality to cli.py
  • Add new tests
    • CollectionSearch
    • Client.collection_search
    • test_cli.py
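
For readers who want a feel for the client-side fallback described above, here is a minimal sketch of filtering a list of collection dicts by bbox, datetime, and q. It is not the code from this PR: the helper names (`parse_dt`, `bbox_intersects`, `filter_collections`) and the example collections are hypothetical, and the free-text match is assumed to be a simple case-insensitive substring check over title, description, and keywords.

```python
from datetime import datetime, timezone
from typing import Any, Optional, Sequence

# Hypothetical helpers for illustration only; not pystac-client's implementation.
MIN_DT = datetime.min.replace(tzinfo=timezone.utc)
MAX_DT = datetime.max.replace(tzinfo=timezone.utc)


def parse_dt(value: Optional[str], default: datetime) -> datetime:
    """Parse an RFC 3339 timestamp; None (an open-ended bound) maps to default."""
    if value is None:
        return default
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def bbox_intersects(a: Sequence[float], b: Sequence[float]) -> bool:
    """True when two 2D [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_collections(
    collections: list[dict[str, Any]],
    bbox: Optional[Sequence[float]] = None,
    interval: Optional[Sequence[Optional[str]]] = None,
    q: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Keep collections that satisfy every filter that was supplied."""
    results = []
    for c in collections:
        extent = c.get("extent", {})
        if bbox is not None:
            boxes = extent.get("spatial", {}).get("bbox", [])
            if not any(bbox_intersects(bbox, b) for b in boxes):
                continue
        if interval is not None:
            start = parse_dt(interval[0], MIN_DT)
            end = parse_dt(interval[1], MAX_DT)
            spans = extent.get("temporal", {}).get("interval", [])
            if not any(
                parse_dt(lo, MIN_DT) <= end and start <= parse_dt(hi, MAX_DT)
                for lo, hi in spans
            ):
                continue
        if q is not None:
            # Assumed semantics: case-insensitive substring match.
            text = " ".join(
                [c.get("title") or "", c.get("description") or "", *c.get("keywords", [])]
            )
            if q.lower() not in text.lower():
                continue
        results.append(c)
    return results


# Made-up collections, shaped like STAC Collection JSON.
EXAMPLE_COLLECTIONS = [
    {
        "id": "example-dem",
        "title": "Example global DEM",
        "description": "Elevation model",
        "keywords": ["elevation"],
        "extent": {
            "spatial": {"bbox": [[-180, -90, 180, 90]]},
            "temporal": {"interval": [["2021-04-22T00:00:00Z", "2021-04-22T00:00:00Z"]]},
        },
    },
    {
        "id": "example-optical",
        "title": "Example optical imagery",
        "description": "Multispectral scenes",
        "keywords": ["optical"],
        "extent": {
            "spatial": {"bbox": [[-180, -90, 180, 90]]},
            "temporal": {"interval": [["2015-06-23T00:00:00Z", None]]},
        },
    },
    {
        "id": "example-lidar",
        "title": "Example county lidar",
        "description": "Point clouds and derived DEM tiles",
        "keywords": ["lidar"],
        "extent": {
            "spatial": {"bbox": [[-94, 44, -92, 46]]},
            "temporal": {"interval": [["2019-01-01T00:00:00Z", "2019-12-31T00:00:00Z"]]},
        },
    },
]
```

Calling `filter_collections(EXAMPLE_COLLECTIONS, q="DEM")` keeps the two collections that mention DEM in their title or description. Fetching the full list first and filtering afterwards is only practical because collection lists are usually small compared with item lists.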

PR Checklist:

  • Code is formatted
  • Tests pass
  • Changes are added to the CHANGELOG

@hrodmn
Collaborator Author

hrodmn commented Sep 26, 2024

I added the Client.collection_search method but now I wonder if it would make more sense to add the optional filter args to Client.collections instead since that would follow the pattern from the STAC API a bit more closely. When the collection search extension is enabled, you perform a collection search by adding query parameters like bbox and q to GET requests on the /collections endpoint.
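
To make the request shape concrete, here is a small sketch of how such a GET URL could be assembled with the standard library. The function name `collection_search_url` and the base URL are hypothetical; only the /collections path and the bbox, datetime, and q parameters come from the discussion above.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def collection_search_url(
    api_root: str,
    bbox: Optional[Sequence[float]] = None,
    datetime: Optional[str] = None,
    q: Optional[str] = None,
) -> str:
    """Build a GET URL for /collections with optional search parameters.

    Hypothetical helper for illustration; not part of pystac-client.
    """
    params = {}
    if bbox is not None:
        # bbox is sent as a comma-separated list of numbers
        params["bbox"] = ",".join(str(v) for v in bbox)
    if datetime is not None:
        params["datetime"] = datetime
    if q is not None:
        params["q"] = q
    url = api_root.rstrip("/") + "/collections"
    if params:
        # keep commas readable instead of percent-encoding them
        url += "?" + urlencode(params, safe=",")
    return url


# The base URL below is a placeholder, not a real STAC API.
print(collection_search_url("https://example.com/stac", q="sentinel"))
```

With no filters supplied, the helper returns the plain /collections URL, which matches the observation that a collection search is the same endpoint with extra query parameters.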

@codecov-commenter

codecov-commenter commented Sep 26, 2024

Codecov Report

Attention: Patch coverage is 91.21951% with 18 lines in your changes missing coverage. Please review.

Project coverage is 93.68%. Comparing base (21435b0) to head (07e5187).
Report is 81 commits behind head on main.

Files with missing lines              Patch %   Lines
pystac_client/collection_search.py    91.60%    11 Missing ⚠️
pystac_client/item_search.py          77.77%    4 Missing ⚠️
pystac_client/cli.py                  86.95%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #735      +/-   ##
==========================================
+ Coverage   93.43%   93.68%   +0.25%     
==========================================
  Files          13       15       +2     
  Lines         990     1188     +198     
==========================================
+ Hits          925     1113     +188     
- Misses         65       75      +10     


@gadomski gadomski self-requested a review September 26, 2024 21:12
Member

@gadomski gadomski left a comment

Thanks for this! I did an initial pass and left a few comments. I'll do a more thoughtful review later.

pystac_client/client.py
pystac_client/collection_search.py Outdated
pystac_client/search.py Outdated
@hrodmn hrodmn force-pushed the collection-search branch from fa25492 to fee6c97 Compare October 8, 2024 18:36
@hrodmn
Collaborator Author

hrodmn commented Oct 8, 2024

I think the last big thing to add here is a cli method for collection search. @gadomski what do you think about adding the search args to the collections method in the CLI? I could also add a collection-search method, but filter parameters to the existing collections method would fit naturally with the STAC API experience (e.g. /collections?q=sentinel).

@gadomski
Member

gadomski commented Oct 8, 2024

what do you think about adding the search args to the collections method in the CLI?

Yup, makes sense to me!

@hrodmn hrodmn marked this pull request as ready for review October 9, 2024 15:50
@gadomski gadomski self-requested a review October 9, 2024 23:19
Member

@gadomski gadomski left a comment

Taking a look at the CI errors, looks like you'll need to use pytest.warns to catch-and-assert the client-side filtering warnings.
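
As an illustration of the catch-and-assert pattern, here is a minimal sketch. It assumes the test run treats unexpected warnings as errors, which would explain why plain warnings fail CI. The function `search_collections`, the warning text, and the use of `UserWarning` are hypothetical stand-ins, not the actual names in this PR.

```python
import warnings

import pytest


def search_collections(server_supports_search: bool) -> str:
    """Hypothetical stand-in for a search that may fall back to client-side filtering."""
    if not server_supports_search:
        warnings.warn(
            "Server lacks collection search; filtering client-side",
            UserWarning,
            stacklevel=2,
        )
        return "client"
    return "server"


def test_client_side_fallback_warns() -> None:
    # pytest.warns captures the expected warning and fails if it is never raised
    with pytest.warns(UserWarning, match="client-side"):
        assert search_collections(False) == "client"


def test_server_side_is_silent() -> None:
    # escalate every warning to an error to show this path emits none
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        assert search_collections(True) == "server"
```

Asserting on the warning documents the fallback as intended behavior, so a strict warnings policy no longer turns it into a test failure.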

docs/quickstart.rst Outdated
docs/usage.rst Outdated
docs/usage.rst Outdated
docs/usage.rst Outdated
docs/usage.rst Outdated
pystac_client/search.py Outdated
pystac_client/collection_search.py Outdated
pystac_client/collection_search.py Outdated
tests/test_collection_search.py Outdated
@hrodmn
Collaborator Author

hrodmn commented Oct 10, 2024

Taking a look at the CI errors, looks like you'll need to use pytest.warns to catch-and-assert the client-side filtering warnings.

Argh, yeah. I need to start running scripts/test instead of pytest.

Thanks for the review, I'll get those changes in today!

@gadomski
Member

I need to start running scripts/test instead of pytest.

or develop an allergic reaction to all warnings, like I have (don't recommend, it leads to lots of yak shaving) :-)

@hrodmn hrodmn force-pushed the collection-search branch from ee88cef to faa8d8c Compare October 10, 2024 11:19
@hrodmn hrodmn force-pushed the collection-search branch from 1457fbb to fce4bd0 Compare October 10, 2024 14:27
pystac_client/collection_search.py Outdated
@hrodmn
Collaborator Author

hrodmn commented Oct 10, 2024

@gadomski thanks for your reviews, sorry for not catching those little CI issues and for half-accepting your suggestion on matched!

Member

@gadomski gadomski left a comment

LGTM! Only thing missing is a CHANGELOG entry. Thanks for the iterations @hrodmn!

@gadomski gadomski enabled auto-merge (squash) October 15, 2024 13:21
@gadomski gadomski merged commit 3fe2670 into stac-utils:main Oct 15, 2024
14 of 15 checks passed
Development

Successfully merging this pull request may close these issues.

Support for STAC API - Collection Search
3 participants