Bulk Metadata Modifications
Sometimes we need to modify a bunch of metadata in one fell swoop. A rake task was built to facilitate this process, which requires:
- An issue to track the change request
- A spreadsheet with the proper format and data
- Running the rake task on the server
Bulk updates are scheduled to run weekly on Thursdays. Members of the development team volunteer to run each week's updates.
- The value in the `from` column must be filled in every row
- A `+` value in the `from` column indicates that a value will be added to the metadata field
- A `-` value in the `from` column indicates that a value, or values separated with `|`, will be removed from the metadata in a multi-value field
- The `-` operator is only supported for multi-value unordered fields with expected type set to `text`, like `subject` and `alt_title`
- A `*` value in the `from` column indicates that all current values in the metadata field will be replaced
- The `to` column can have a pipe-delimited (`|`) value to add multiple values to the metadata field
- The `to` column can have an empty cell to indicate that a value or values can be removed from the metadata
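To make the operator rules above concrete, here is a minimal Python sketch of how a single spreadsheet row could change the list of values in one field. It is an illustration of the rules as described on this page, not the rake task's own code. The function names `split_values` and `apply_row` are hypothetical, the sample value `Forestry` is made up, and two behaviors are assumptions: added values are simply appended (this page does not say how duplicates are handled), and a literal `from` value paired with an empty `to` cell removes the matched value.

```python
def split_values(cell):
    """Split a pipe-delimited cell into a list of non-empty values."""
    return [v.strip() for v in cell.split("|") if v.strip()]


def apply_row(current, from_value, to_value):
    """Return the new list of values for one field after one spreadsheet row."""
    if not from_value:
        raise ValueError("the from column must be filled in every row")
    new_values = split_values(to_value)
    if from_value == "+":
        # Add: append the new value(s) (assumption: no duplicate check).
        return current + new_values
    if from_value == "-":
        # Remove: drop only the listed value(s), keep everything else.
        return [v for v in current if v not in new_values]
    if from_value == "*":
        # Replace all current values; an empty to cell clears the field.
        return new_values
    # Otherwise from names an existing value: swap it for the to value(s).
    # Assumption: an empty to cell removes the matched value.
    result = []
    for v in current:
        if v == from_value:
            result.extend(new_values)
        else:
            result.append(v)
    return result


# Remove one URI from a field, then add a text label in its place.
subjects = ["http://id.loc.gov/authorities/names/no2011160692", "Forestry"]
subjects = apply_row(subjects, "-", "http://id.loc.gov/authorities/names/no2011160692")
subjects = apply_row(subjects, "+", "Technical note (Forest Products Laboratory (U.S.))")
print(subjects)
# ['Forestry', 'Technical note (Forest Products Laboratory (U.S.))']
```

Treating each row as a pure function from the old list to the new list makes it easy to dry-run a spreadsheet against sample values before handing it to the real task.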
Example 1
Consider the following data as an example of what the rake task expects for performing bulk metadata updates:
id,from,to,property
fb494b17q,http://opaquenamespace.org/ns/osuAcademicUnits/qGjPkk5M,http://id.loc.gov/authorities/names/n80017721,degree_grantors
8623j0508,http://opaquenamespace.org/ns/osuAcademicUnits/qGjPkk5M,http://id.loc.gov/authorities/names/n80017721,degree_grantors
- `id`: The ID of the work to be modified
- `from`: The value as it exists in the system; this value can be a `*` to indicate any value
- `to`: The desired change to the existing value
- `property`: The property name, as tracked in the Metadata Application Profile, of the metadata value to change
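Before transferring a spreadsheet, it can help to check locally that it has the four columns described above and that every row fills in `from`. The following is a minimal Python sketch of such a check, not part of the rake task; the function name `read_rows` and the `REQUIRED_COLUMNS` constant are hypothetical. Some spreadsheets end each row with a trailing comma, which Python's `csv` module parses as an extra empty column, so the sketch keeps only the four documented columns.

```python
import csv
import io

REQUIRED_COLUMNS = ["id", "from", "to", "property"]


def read_rows(csv_file):
    """Yield one dict per data row, keeping only the four documented columns."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        # Short rows leave missing cells as None, so normalize to "".
        if not (row["from"] or "").strip():
            raise ValueError(f"line {reader.line_num}: the from column is empty")
        yield {c: (row[c] or "").strip() for c in REQUIRED_COLUMNS}


# In-memory sample using the first data row from Example 1.
sample = io.StringIO(
    "id,from,to,property\n"
    "fb494b17q,http://opaquenamespace.org/ns/osuAcademicUnits/qGjPkk5M,"
    "http://id.loc.gov/authorities/names/n80017721,degree_grantors\n"
)
rows = list(read_rows(sample))
print(rows[0]["id"], rows[0]["property"])
# fb494b17q degree_grantors
```

When reading a real file, open it with `open(path, newline="")` as the `csv` module documentation recommends.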
Example 2
The following example uses the `-` operator to remove only the item with value `http://id.loc.gov/authorities/names/no2011160692` from the `subject` property in work `gm80hv32k`.
Input `subject_remove_uri.csv`:
id,from,to,property,
gm80hv32k,-,http://id.loc.gov/authorities/names/no2011160692,subject,
Example 3
The following example uses the `-` and `+` operators to replace only the item with value `http://id.loc.gov/authorities/names/no2011160692` with `Technical note (Forest Products Laboratory (U.S.))` in the `subject` property in work `gm80hv32k`.
Input `subject_replace_uri_with_text.csv`:
id,from,to,property,
gm80hv32k,-,http://id.loc.gov/authorities/names/no2011160692,subject,
gm80hv32k,+,Technical note (Forest Products Laboratory (U.S.)),subject,
Example 4
The following example uses the `*` under `from` and an empty cell under `to` to remove all items from the `keyword` property in work `b2773v82b`.
Sample input `SA_Bulk_Edits_25_July_2019_keyword.csv`:
id,from,to,property,
b2773v82b,*,,keyword,
- Transfer the spreadsheet, in comma-separated values format, to the server. A great directory might be `~/tmp/`.
$ scp /path/to/file USER@SERVERNAME:/path/to/remote/server/file
- Execute the rake task
$ bundle exec rails scholars_archive:bulk_update_csv csv=/path/to/remote/server/file
- Watch the logs generated by this task
$ ls -ltr log/bulk-update-csv*
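The `ls -ltr` listing sorts the log files by modification time, so the newest one appears last. As a small convenience, the same lookup can be sketched in Python. The helper name `newest_bulk_update_log` is hypothetical, and the sketch assumes only what the command above shows: that the task's logs live under `log/` and their names start with `bulk-update-csv`.

```python
import glob
import os


def newest_bulk_update_log(log_dir="log"):
    """Return the most recently modified bulk-update-csv log, or None if there is none."""
    candidates = glob.glob(os.path.join(log_dir, "bulk-update-csv*"))
    return max(candidates, key=os.path.getmtime) if candidates else None


# Run from the application root so that the relative "log" directory resolves.
print(newest_bulk_update_log())
```

The returned path can then be followed with `tail -f` while the task runs.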