WIP
jacksonj04 committed Nov 11, 2024
1 parent 2fb68b2 commit d8e9977
Showing 4 changed files with 67 additions and 3 deletions.
@@ -4,7 +4,7 @@ Date: 2022-02-21

## Status

-Accepted
+Superseded by [ADR 21](0021-update-data-structure.md)

## Context

@@ -4,7 +4,7 @@ Date: 2022-02-23

## Status

-Accepted
+Superseded by [ADR 21](0021-update-data-structure.md)

## Context

2 changes: 1 addition & 1 deletion doc/adr/0018-pseudo-ncns.md
@@ -1,4 +1,4 @@
-# 17. pseudo NCNs
+# 18. Pseudo-NCNs

Date: 2024-04-24

64 changes: 64 additions & 0 deletions doc/adr/0021-update-data-structure.md
@@ -0,0 +1,64 @@
# 21. Update data structure to better represent concepts of documents, revisions, representations and versions

Date: 2024-11-07

## Status

Draft

## Context

The original data structure of the Find Case Law service was designed to provide a rapid path to deployment for the initial corpus of documents, based on an initial set of requirements. This ultimately resulted in all information regarding a "document", including its [metadata][ADR0007] and [versions][ADR0005], being stored in a single entity within the MarkLogic database.

Since then, the service has grown to include a number of requirements not fully covered by this original design. These are testing the limits of the current data structure and require workarounds and special cases to handle, among other things, multiple document types, varying source documents, absent identifiers, changing identifiers, document relationships, multiple language representations, additional metadata, richer provenance and history, and new reporting requirements.

We have reached the point where the prudent course is to replace the fundamental data structure with one that provides a better framework for these requirements.

### Existing ADRs

This ADR has been written with particular attention to the context of the following ADRs:

- [5. Use the Marklogic Library Services API for document versioning][ADR0005]
- [7. Use Document Properties To Store Non-LegalDocML Metadata][ADR0007]
- [14. Versioning of Documents][ADR0014]
- [18. Pseudo-NCNs][ADR0018]
- [19. Other Formats][ADR0019]

## Decision

We will implement a new underlying data structure to embody the following distinct concepts:

### Concepts

#### Document

A `Document` in this framework is the abstract concept of a document, usually representing a judgment or other form of decision. It is broadly analogous to a "Work" under the FRBR[^frbr] model.

#### Revision

#### Representation

#### Version

### Migration path

Existing work has already started to distinguish some of these concepts, which allows us to provide an incremental migration path. This reduces risk to the service: each change can be kept small, and a rollback path is maintained at all times.
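
As a rough illustration of what an incremental migration with a rollback path could look like, the sketch below copies records into a new store in batches while leaving the legacy records untouched. It is not the service's implementation: `MigrationState`, `migrate_batch`, `rollback_batch` and the record fields (`metadata`, `body`, `revisions`) are hypothetical names chosen for this example, and in-memory dictionaries stand in for the database.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationState:
    """In-memory stand-in for the legacy and new stores (hypothetical)."""

    legacy: dict[str, dict] = field(default_factory=dict)
    migrated: dict[str, dict] = field(default_factory=dict)


def migrate_batch(state: MigrationState, uris: list[str]) -> list[str]:
    """Copy one batch of legacy records into the new structure.

    Legacy records are never modified, so the batch can be rolled back.
    Returns the URIs that were actually migrated by this call.
    """
    done = []
    for uri in uris:
        if uri in state.migrated or uri not in state.legacy:
            continue  # already migrated, or nothing to migrate
        record = state.legacy[uri]
        state.migrated[uri] = {
            # Assumed shape of a record in the new structure.
            "metadata": dict(record.get("metadata", {})),
            "revisions": [record.get("body", "")],
        }
        done.append(uri)
    return done


def rollback_batch(state: MigrationState, uris: list[str]) -> None:
    """Undo a batch by discarding the new records; legacy data is intact."""
    for uri in uris:
        state.migrated.pop(uri, None)
```

Because each batch is small and idempotent, a failed or unsatisfactory batch can be discarded and retried without taking the service offline.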

## Consequences

- The API Client will need incremental redesign to understand content within this new framework.
- We will need to migrate all existing documents into the new framework. It should be possible to do this incrementally, rather than requiring a maintenance window.
- We will need to reconsider how searches are performed across the corpus.

### Supersedes

- This supersedes [ADR 5][ADR0005], in that revisions of a submitted document (and versions of representations of that document) will now be stored as distinct entities without relying on MarkLogic's DLS.
- This supersedes [ADR 7][ADR0007], since metadata pertaining to the abstract concept of the document will now be stored in the document record.
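
The separation described in the two points above might be modelled along the following lines. This is only a sketch: the ADR does not yet define `Revision`, `Representation` or `Version`, so every field name and the way the entities hang together here are assumptions for illustration, not the agreed design.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """One stored state of a representation (fields are assumptions)."""

    number: int
    content: str


@dataclass
class Representation:
    """A rendering of a document, e.g. a format or language (assumption)."""

    kind: str
    versions: list[Version] = field(default_factory=list)

    def add_version(self, content: str) -> Version:
        # Versions are separate records, numbered from 1.
        version = Version(number=len(self.versions) + 1, content=content)
        self.versions.append(version)
        return version


@dataclass
class Revision:
    """One submission of the source document (assumption)."""

    number: int
    source_filename: str


@dataclass
class Document:
    """The abstract document; its metadata lives on this record."""

    identifier: str
    metadata: dict[str, str] = field(default_factory=dict)
    revisions: list[Revision] = field(default_factory=list)
    representations: list[Representation] = field(default_factory=list)

    def add_revision(self, source_filename: str) -> Revision:
        # Revisions are distinct entities rather than DLS-managed copies.
        revision = Revision(
            number=len(self.revisions) + 1, source_filename=source_filename
        )
        self.revisions.append(revision)
        return revision
```

In this sketch a new submission adds a `Revision` to the `Document` without touching its metadata, and a re-rendered representation gains a new `Version` without creating a new `Revision`.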

[^frbr]: https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records "FRBR"

[ADR0005]: 0005-use-the-marklogic-library-services-api-for-document-versioning.md "Use the Marklogic Library Services API for document versioning"
[ADR0007]: 0007-use-document-properties-to-store-non-legaldocml-metadata.md "Use Document Properties To Store Non-LegalDocML Metadata"
[ADR0014]: 0014-versioning-of-documents.md "Versioning of documents"
[ADR0018]: 0018-pseudo-ncns.md "Pseudo-NCNs"
[ADR0019]: 0019-other-formats.md "Other Formats"
