Once web archive crawl data is retrieved from Archive-It, a manifest must be created by hand and accessioning run manually. An inventory of what has been accessioned must also be maintained manually, to ensure that the same content isn't retrieved and accessioned more than once.
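To illustrate the dedup-tracking part of this workflow, here is a minimal sketch of a checksum-based inventory that skips already-accessioned crawl files. All names here (the `accessioned_inventory.json` file, the helper functions) are hypothetical illustrations, not part of the wasapi-downloader codebase:

```python
import hashlib
import json
from pathlib import Path

# Hypothetical on-disk inventory of accessioned crawl files, keyed by checksum.
INVENTORY = Path("accessioned_inventory.json")


def sha256(path: Path) -> str:
    """Checksum a file so re-downloaded content can be recognized."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def load_inventory() -> dict:
    """Read the inventory, or start empty if none exists yet."""
    return json.loads(INVENTORY.read_text()) if INVENTORY.exists() else {}


def needs_accessioning(warc: Path, inventory: dict) -> bool:
    """True if this crawl file has not been accessioned before."""
    return sha256(warc) not in inventory


def record_accessioned(warc: Path, inventory: dict) -> None:
    """Remember a file so later runs skip it."""
    inventory[sha256(warc)] = warc.name
    INVENTORY.write_text(json.dumps(inventory, indent=2))
```

An automated pipeline could run `needs_accessioning` on each retrieved file and call `record_accessioned` after a successful accession, replacing the manually maintained inventory.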
This work would extend the WASAPI local import utility (DEVQUEUE-14) to automate the rest of the crawl data accessioning. This would reduce the amount of effort needed to preserve our web archives and make them available for access through SWAP.
Level of effort estimate from @ndushay and @jmartin-sul (who both originally worked on the wasapi-downloader) is 3-4 developers for 4-6 weeks. Once DEVQUEUE-209 is completed, the services will largely be in place and will just need to be connected, but it is unclear whether all of the manual steps can currently be fully automated without capturing additional information that isn't stored today.
From: https://jirasul.stanford.edu/jira/browse/DEVQUEUE-189