Commit

Add index deletions to transformAndIndex.sh script (RPB-230)
fsteeg committed Dec 17, 2024
1 parent 81cbee3 commit 83905cd
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions transformAndIndex.sh
@@ -15,8 +15,8 @@ sbt "runMain rpb.ETL conf/rpb-sw.flux" # creates TSV lookup file for to-lobid tr
# Strapi title data export is incomplete, see https://jira.hbz-nrw.de/browse/RPB-202, so we don't use the approach above (rpb-authority, same for RPPD / person):
## zgrep -a -E '"type":"api::article.article"|"type":"api::independent-work.independent-work"' conf/strapi-export.tar.gz > conf/output/output-strapi.ndjson
# Instead, we use the backup exports created in Strapi lifecycle afterCreate and afterUpdate hooks (copy from backup/ in Strapi instance):
-cat conf/articles.ndjson | jq -c .data > conf/output/output-strapi.ndjson
-cat conf/independent_works.ndjson | jq -c .data >> conf/output/output-strapi.ndjson
+cat conf/articles.ndjson | grep '"data"' | jq -c .data > conf/output/output-strapi.ndjson
+cat conf/independent_works.ndjson | grep '"data"' | jq -c .data >> conf/output/output-strapi.ndjson
# Remove old index data:
rm conf/output/bulk/bulk-*.ndjson
sbt "runMain rpb.ETL conf/rpb-titel-to-lobid.flux index=$INDEX"
@@ -30,6 +30,15 @@ do
echo "$filename"
curl -XPOST --silent --show-error --fail --header 'Content-Type: application/x-ndjson' --data-binary @"$filename" 'weywot3:9200/_bulk' >> conf/output/es-curl-post.log
done

+# Delete in Elasticsearch:
+cat conf/articles.ndjson | grep '"delete"' | jq --raw-output .delete.rpbId > conf/delete.ndjson
+cat conf/independent_works.ndjson | grep '"delete"' | jq --raw-output .delete.rpbId >> conf/delete.ndjson
+while read rpbId; do
+curl -X DELETE "weywot3:9200/$INDEX/resource/https%3A%2F%2Flobid.org%2Fresources%2F$rpbId"
+done < conf/delete.ndjson

# Move alias to new index:
curl -X POST "weywot3:9200/_aliases?pretty" -H 'Content-Type: application/json' -d'
{
"actions" : [
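
The change above splits the Strapi backup files into "data" records (to index) and "delete" records (to remove from the index). The following is a minimal sketch of that split and of the delete loop, not the script's actual code. It assumes each line of a backup file is a JSON object with either a top-level `data` key or a top-level `delete` key, as the `jq` paths in the diff suggest. The function names (`extract_data`, `extract_delete_ids`, `delete_url`, `print_delete_requests`) are hypothetical, and the loop only prints the requests instead of sending them.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers, not part of transformAndIndex.sh.

# Print the payload of every "data" record, one compact JSON object per line.
# Uses jq's select() instead of the grep pre-filter from the script.
extract_data() {
  jq -c 'select(.data != null) | .data' "$1"
}

# Print the rpbId of every "delete" record, one ID per line.
extract_delete_ids() {
  jq -r 'select(.delete != null) | .delete.rpbId' "$1"
}

# Build the document URL for a deletion: host, index, then the URL-encoded
# https://lobid.org/resources/ prefix followed by the ID, as in the script.
delete_url() {
  printf '%s/%s/resource/https%%3A%%2F%%2Flobid.org%%2Fresources%%2F%s\n' "$1" "$2" "$3"
}

# Dry run: read IDs from stdin and print one DELETE request per non-empty line.
# 'read -r' keeps backslashes in IDs literal.
print_delete_requests() {
  while IFS= read -r rpbId; do
    [ -n "$rpbId" ] || continue
    echo "curl -X DELETE \"$(delete_url "$1" "$2" "$rpbId")\""
  done
}
```

A dry run against one of the backup files could then look like `extract_delete_ids conf/articles.ndjson | print_delete_requests weywot3:9200 "$INDEX"`. Printing the requests first makes it possible to check the generated URLs before anything is removed from Elasticsearch.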