
Add collection page list/search endpoint #2354

Open
tw4l wants to merge 17 commits into main from issue-2352-collection-page-list-search


tw4l (Member) commented Jan 30, 2025

Fixes #2353

Adds a new endpoint to list pages in a collection, along with accompanying tests. Pages can be filtered by url (exact match), ts, urlPrefix, isSeed, and depth. Additional sort options have been added as well.

These same filters and sort options have also been added to the crawl pages endpoint.
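
For illustration, here is a minimal client-side sketch of how the filters above might be combined into a query string. The filter names (url, ts, urlPrefix, isSeed, depth) come from this description; the helper `build_page_query` and the path shown are hypothetical and are not taken from the PR.

```python
from urllib.parse import urlencode


def build_page_query(url=None, ts=None, url_prefix=None, is_seed=None, depth=None):
    """Return a query string containing only the filters that were supplied.

    Hypothetical helper: the parameter names mirror the filters listed in the
    PR description, but this function is not part of the project.
    """
    candidates = {
        "url": url,
        "ts": ts,
        "urlPrefix": url_prefix,
        "isSeed": is_seed,
        "depth": depth,
    }
    params = {}
    for name, value in candidates.items():
        # Compare against None rather than truthiness so that
        # isSeed=False and depth=0 are still sent as filters.
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name] = value
    return urlencode(params)


# Assumed path, for illustration only.
path = "/collections/COLLECTION_ID/pages"
query = build_page_query(url_prefix="https://example.com/", is_seed=False, depth=1)
print(f"{path}?{query}")
```

Checking for `None` instead of falsiness matters here, since `isSeed=false` and `depth=0` are both meaningful filter values.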

Also fixes an issue where isSeed was not written to the database when false and was only filled in on serialization, which prevented filtering on isSeed from working as expected.
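
To show why a value that only appears on serialization breaks filtering, here is a simplified sketch. The `matches` function is a stand-in for a database equality filter, and the sample pages are invented for illustration; none of this is the project's actual code.

```python
def matches(doc, query):
    """Equality match: each queried field must be present and equal."""
    return all(key in doc and doc[key] == value for key, value in query.items())


# Before the fix (as described): non-seed pages have no stored false value.
pages_before = [
    {"url": "https://example.com/", "isSeed": True},
    {"url": "https://example.com/about"},                    # field never stored
    {"url": "https://example.com/contact", "isSeed": None},  # stored as null
]

# After the fix: false is written explicitly for non-seed pages.
pages_after = [
    {"url": "https://example.com/", "isSeed": True},
    {"url": "https://example.com/about", "isSeed": False},
    {"url": "https://example.com/contact", "isSeed": False},
]

non_seeds_before = [p["url"] for p in pages_before if matches(p, {"isSeed": False})]
non_seeds_after = [p["url"] for p in pages_after if matches(p, {"isSeed": False})]

print(non_seeds_before)  # no pages match
print(non_seeds_after)   # both non-seed pages match
```

Because the filter runs against stored values, a default applied only when a page is serialized never reaches the query.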

tw4l marked this pull request as ready for review January 30, 2025 20:02
tw4l requested a review from ikreymer January 30, 2025 20:02
tw4l changed the title from "WIP: Add collection page list/search endpoint" to "Add collection page list/search endpoint" Jan 30, 2025
ikreymer (Member) commented Feb 4, 2025

We should probably do this for crawls as well, so larger crawls can be replayed easily without having to be added to collections.

tw4l force-pushed the issue-2352-collection-page-list-search branch from 7a75fb2 to 29028e7 February 4, 2025 22:51
tw4l force-pushed the issue-2352-collection-page-list-search branch from 29028e7 to 79972a4 February 5, 2025 20:51
tw4l marked this pull request as draft February 5, 2025 21:01
tw4l marked this pull request as ready for review February 6, 2025 18:04
tw4l (Member, Author) commented Feb 6, 2025

@ikreymer This is ready for re-review. Changes made:

  • isSeed and depth filters and sort options added to both crawl and collection pages endpoints
  • Fixed issue where isSeed was being set to false for pages on serialization but kept null in the database, which was preventing filtering from working as expected

The only remaining test failure is the unrelated QA failure.

Development

Successfully merging this pull request may close these issues.

[Feature]: Add backend endpoint to list/search pages
2 participants