reconcile Cocina object versions with actual preserved object versions #5074

Open
andrewjbtw opened this issue Jun 5, 2024 · 2 comments

@andrewjbtw

andrewjbtw commented Jun 5, 2024

The SDR has supported versioning via Moab for many years, but for much of that time it also had problems keeping version information in sync across Fedora, the workflow service (which records version numbers), and preservation. This resulted in numerous errors where version mismatches blocked accessioning: Fedora said an object was on version N, but preservation said version N ± 1.

The most common way to handle version mismatch errors was to edit the versionMetadata datastream (see the now-deprecated instructions). These edits were sufficient to allow objects to finish accessioning, but they apparently left inconsistent version metadata behind. This class of problem remains hidden until someone tries to update an affected druid, because no repository process monitors for mismatched version metadata across the various systems.

Two examples:

rt675ky1058
dj813zv3194

According to Cocina/DSA, both of these items are on version 3. However, the history shows only two versions, and they are numbered 1 and 3.

Screenshot 2024-06-05 at 4 14 03 PM

The workflows also suggest that only two versions, 1 and 3, have been accessioned. The accessionWF should be recorded in the workflow DB for every accessioned version.

Screenshot 2024-06-05 at 4 14 42 PM

Preservation also has only 2 versions, but these are numbered the expected way, 1 and 2:

./rt675ky1058/v0001
./rt675ky1058/v0002
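
A minimal sketch of how the numbering comparison above could be checked mechanically, assuming Moab version directories are named `v` followed by a zero-padded number, as in the listing. `preserved_versions` and `version_gaps` are hypothetical helper names for illustration and are not taken from any SDR codebase.

```ruby
# Hypothetical helpers for illustration; not taken from any SDR codebase.

# Extract version numbers from Moab-style directory paths such as
# "./rt675ky1058/v0001". Paths that do not end in vNNNN are ignored.
def preserved_versions(paths)
  paths.filter_map { |path| File.basename(path)[/\Av(\d+)\z/, 1]&.to_i }.sort
end

# Return the version numbers missing from a list that should run 1..max.
def version_gaps(versions)
  return [] if versions.empty?

  (1..versions.max).to_a - versions
end

moab_dirs = %w[./rt675ky1058/v0001 ./rt675ky1058/v0002] # listing above
dsa_versions = [1, 3]                                   # versions in the DSA history

preserved = preserved_versions(moab_dirs)
puts "preservation: #{preserved.inspect}, gaps: #{version_gaps(preserved).inspect}"
puts "DSA: #{dsa_versions.inspect}, gaps: #{version_gaps(dsa_versions).inspect}"
```

With the example data, preservation has no gap in its numbering, while the DSA history is missing version 2.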

Impact

It's not possible to open items in this state and create new versions. Attempting to open rt675ky1058 leads to this error:

Dor::Services::Client::UnexpectedResponse: Unable to open version (Version from Preservation is out of sync. Preservation expects 2 but current version is 3)

I think that means preservation expects to be told the current version is 2 and the next version is 3, but it's being told the current version is 3.
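
The interpretation above can be sketched as a simple guard. This is an assumption about the check's logic, inferred only from the error message; it is not the actual dor-services-app or client implementation, and `VersionMismatch` and `assert_openable!` are hypothetical names.

```ruby
# Hypothetical sketch inferred from the error message; not the real SDR code.
class VersionMismatch < StandardError; end

# Assumed rule: a new version may be opened only when the current
# (Cocina/DSA) version equals the highest version held by preservation.
# Returns the number the newly opened version would get.
def assert_openable!(current_version:, preservation_version:)
  unless current_version == preservation_version
    raise VersionMismatch,
          "Version from Preservation is out of sync. " \
          "Preservation expects #{preservation_version} " \
          "but current version is #{current_version}"
  end

  current_version + 1
end

begin
  # Values reported for rt675ky1058 in the error above.
  assert_openable!(current_version: 3, preservation_version: 2)
rescue VersionMismatch => e
  puts e.message
end
```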

Reconciliation options

There are at least two possible ways to handle this:

  • Create the "missing" version in preservation to sync with version in DSA
    • in the examples above, we'd deposit version 3 for each of these items
  • Change the DSA version numbers to match the numbers in preservation
    • in the examples above, we'd make DSA correspond to the 2 versions in preservation
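
To make the difference between the two options concrete, here is a rough sketch applied to the example numbers. Both helpers are hypothetical illustrations of the bookkeeping only; they do not deposit anything or modify DSA, and the names are not from any SDR codebase.

```ruby
# Hypothetical illustration of the two options; names are not from SDR code.

# Option 1: versions that would have to be deposited in preservation so its
# highest version catches up with the version DSA reports.
def versions_to_deposit(dsa_current:, preserved_max:)
  ((preserved_max + 1)..dsa_current).to_a
end

# Option 2: renumber the DSA versions consecutively from 1 so they line up
# with preservation. Returns a hash of old number => new number.
def renumbering(dsa_versions)
  dsa_versions.sort.each_with_index.map { |old, i| [old, i + 1] }.to_h
end

# Example data from this issue: DSA history has versions 1 and 3,
# preservation has versions 1 and 2.
p versions_to_deposit(dsa_current: 3, preserved_max: 2)
p renumbering([1, 3])
```

For the example items, option 1 yields a single version to deposit (3), and option 2 maps DSA version 3 to version 2 while leaving version 1 unchanged.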

Next steps

Either way, we'll need to generate a report for this issue. There were many version mismatch problems in the history of SDR, possibly tens of thousands.
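
A minimal sketch of what such a report could look like, assuming the Cocina version and the preservation version for each druid have already been collected. The `Row` struct, the column names, and the second sample row are made up for illustration; how the versions are actually fetched from DSA and preservation is left out.

```ruby
# Hypothetical report sketch; the column names and the in-sync sample row are
# placeholders, and fetching versions from DSA/preservation is not shown.
Row = Struct.new(:druid, :cocina_version, :preservation_version)

# Build CSV text listing only the objects whose versions disagree.
# Fields here never contain commas, so a plain join is enough.
def mismatch_report(rows)
  header = "druid,cocina_version,preservation_version"
  lines = rows.reject { |r| r.cocina_version == r.preservation_version }
              .map { |r| [r.druid, r.cocina_version, r.preservation_version].join(",") }
  ([header] + lines).join("\n")
end

rows = [
  Row.new("rt675ky1058", 3, 2),   # mismatch described in this issue
  Row.new("example-druid", 4, 4)  # placeholder for an in-sync object
]
puts mismatch_report(rows)
```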

@andrewjbtw andrewjbtw added the bug label Jun 5, 2024
@andrewjbtw andrewjbtw moved this to New Issues (Needs Triage) in Infrastructure Portfolio Production Priorities Jun 5, 2024
@justinlittman justinlittman moved this from New Issues (Needs Triage) to In Progress (Not Ordered) in Infrastructure Portfolio Production Priorities Jun 7, 2024
@justinlittman justinlittman self-assigned this Jun 7, 2024
justinlittman added a commit that referenced this issue Jun 7, 2024
justinlittman added a commit that referenced this issue Jun 7, 2024
justinlittman added a commit that referenced this issue Jun 7, 2024
@justinlittman
Contributor

Here's the report @andrewjbtw

preservation_versions_report.csv

@andrewjbtw
Author

Only 442 items, which is a relief.

@justinlittman justinlittman removed their assignment Oct 25, 2024