
Migrate data from Sufia 6 to Sufia 7.2 [Work In Progress]

hackmastera edited this page Nov 17, 2016 · 6 revisions

Please note these scripts are still under development and are not ready for production use. But please do start testing and check out the outstanding issues.

This is a two-step process. First, from a Sufia 6 application, export the metadata of all GenericFiles and Collections to a set of JSON files. Then, from a Sufia 7 application, read the JSON files and create GenericWorks/FileSet/Files, followed by Collections.

Step 1 (from a Sufia 6 application)

  • Upgrade your app to the latest Sufia 6 release, and then pin to sufia-6.x to get the most recent export code.
  • You'll want to put your application in read-only or maintenance mode.
  • First run the survey script; this writes the ID of every collection and generic file to the database:
      $ RAILS_ENV=production bundle exec sufia_survey -v
  • Then run the export; see options via --help as there are ways to override the exported fields if you have customized your data model:
      $ bundle exec sufia_export --help
      $ RAILS_ENV=production bundle exec sufia_export
    • The export writes the metadata for each file to a JSON file, but not the file's actual binary. The binary will be read from Fedora at the time of the import.
  • Make sure your Fedora port is open to whatever server will be running the import.
  • Check your JSON files. If fedora.yml was set up with '127.0.0.1' and you will be migrating on a different server, you will need to replace all instances of 127.0.0.1 with a real IP address or domain name.
  • Move all of the JSON files to the import location.
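
The host replacement described above can be scripted. The following is a minimal sketch, not part of the Sufia export code: the method name rewrite_fedora_host, the script name, the export directory, and the replacement host are all hypothetical, and it assumes the exported files end in .json and sit directly in one directory. It does a plain string replacement, then checks that each file still parses as JSON before writing it back.

```ruby
require "json"

# Hypothetical helper (not part of Sufia): replace every occurrence of
# old_host with new_host in the *.json files directly under dir.
# Returns the paths of the files that were changed.
def rewrite_fedora_host(dir, old_host, new_host)
  changed = []
  Dir.glob(File.join(dir, "*.json")).sort.each do |path|
    original = File.read(path)
    # A String pattern is matched literally, so the dots are not wildcards.
    updated = original.gsub(old_host) { new_host }
    next if updated == original

    JSON.parse(updated) # raises JSON::ParserError if the result is not valid JSON
    File.write(path, updated)
    changed << path
  end
  changed
end

# Example invocation (directory and host are placeholders):
#   ruby rewrite_host.rb ./export 127.0.0.1 fedora.example.edu
if ARGV.size == 3
  rewrite_fedora_host(*ARGV).each { |path| puts "updated #{path}" }
end
```

Working on a copy of the exported files keeps the originals available if the replacement needs to be redone.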

Step 2 (from a Sufia 7.2 application)

  • Pin to the 7.2-migration branch of Sufia.
  • Make sure you've configured fedora_sufia6_user and fedora_sufia6_password in config/application (or a new file in config/initializers) so you can reach your Fedora instance to retrieve the binaries.
  • In a Sufia 7 application, import the JSON files exported from the Sufia 6 application. Again, use --help to see options; you may want to create overrides if you have customized your data model:
      $ RAILS_ENV=production bundle exec sufia_import
    • In particular take a look at your rights data values; the default for "All rights reserved" has changed from a string to a URI and you may want to take this opportunity to migrate your data.
  • Derivatives will not be migrated; they will be re-generated by Sufia 7.
  • [Future step -- in development] Run the validation script, which checks that all the IDs recorded during the survey step have been migrated.
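
Because the import reads binaries from the Sufia 6 Fedora instance, it can help to confirm from the import server that the Fedora port is reachable before starting. The sketch below is one way to check this with Ruby's standard library; it is not part of Sufia, the method name fedora_port_open? is hypothetical, and the host and port in the example invocation are placeholders. It only tests that a TCP connection can be opened; it does not verify the fedora_sufia6_user credentials.

```ruby
require "socket"

# Hypothetical helper (not part of Sufia): returns true if a TCP connection
# to host:port can be opened within timeout seconds, false otherwise.
def fedora_port_open?(host, port, timeout = 5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, unreachable host, failed name lookup, or timeout.
  false
end

# Example invocation (host and port are placeholders):
#   ruby check_fedora.rb fedora.example.edu 8983
if ARGV.size == 2
  host, port = ARGV[0], Integer(ARGV[1])
  puts(fedora_port_open?(host, port) ? "reachable" : "NOT reachable")
end
```

A false result usually points to a firewall rule or to Fedora listening only on 127.0.0.1.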

Outstanding issues

This is not an exact list, because some of these issues relate to code migration rather than data migration: https://github.com/projecthydra/sufia/issues?q=is%3Aopen+is%3Aissue+label%3Amigration