2.8 Modelling approaches
In many ways, the terms used to characterise the collection objects within an ObjectGroup are similar or identical to those used to describe an individual collection object. However, the two cases differ fundamentally in how they relate to those properties. For an ObjectGroup, which represents multiple objects, there is always the potential for more than one value for any of those terms.
For example, a single object will only have one preservation method (e.g. ‘dried and pinned’), whereas an ObjectGroup can represent objects with a variety of preservation methods. The former represents a one-to-many relationship between the term and the object (an object may not have more than one preservation method, but one preservation method may relate to many objects). The latter has a many-to-many relationship between the term and the ObjectGroup (an ObjectGroup may reflect more than one preservation method, and one preservation method may relate to many ObjectGroups).
The main impact is on the metrics (represented by the MeasurementOrFact class) that can be attached to an ObjectGroup, and how they can be used. If an ObjectGroup has more than one value for the same term, then it’s not possible to tell how metrics attached to the ObjectGroup are distributed across those values.
For example, if an ObjectGroup has a single preservation method of ‘dried and pinned’, and an ‘object quantity’ (represented by a MeasurementOrFact record) of 10,000, we know that there are 10,000 dried and pinned objects. If, however, that ObjectGroup has two preservation methods, ‘dried and pinned’ and ‘alcohol’, we know that there are 10,000 objects that are either dried and pinned OR preserved in alcohol, but we cannot calculate how many there are of each.
The only way to get an accurate assessment of the overall object quantity AND the object quantity for each preservation method is to split the ObjectGroup into two ObjectGroups: one containing just the ‘dried and pinned’ objects, and one for the ‘alcohol’ objects. This means that the preservation method maintains a one-to-many relationship with the ObjectGroups, rather than a many-to-many relationship. This is the key to being able to accurately aggregate and report metrics against the preservation method property, as well as the ObjectGroups.
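As a minimal sketch of the split described above, the Python below models the two resulting ObjectGroups as plain dictionaries and totals ‘object quantity’ by preservation method. The dictionary keys, the `quantity_by` helper and the 6,000 / 4,000 split are assumptions made purely for illustration; the text only gives the overall figure of 10,000.

```python
from collections import defaultdict

# Hypothetical records: one ObjectGroup per preservation method.
# The 6,000 / 4,000 split is invented; only the 10,000 total comes from the text.
object_groups = [
    {"id": "OG-1", "preservation_method": "dried and pinned", "object_quantity": 6000},
    {"id": "OG-2", "preservation_method": "alcohol", "object_quantity": 4000},
]


def quantity_by(groups, term):
    """Sum 'object_quantity' across groups, keyed by each group's single value for `term`."""
    totals = defaultdict(int)
    for group in groups:
        totals[group[term]] += group["object_quantity"]
    return dict(totals)


by_method = quantity_by(object_groups, "preservation_method")
overall = sum(group["object_quantity"] for group in object_groups)

print(by_method)
print(overall)
```

Because each group carries exactly one preservation method, the per-method totals and the overall total can both be derived from the same records.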
We’ve established that:
- Terms can either have a one-to-many or a many-to-many relationship with the ObjectGroup
- One-to-many relationships between the ObjectGroup and a term are required to be able to report accurate metrics against that property
Within the Latimer Core model concept, a term where a one-to-many relationship with the ObjectGroup has been enforced is referred to as a dimension. These are the terms that are effectively used to determine how a collection needs to be broken down into multiple ObjectGroups, in order to satisfy the requirements for numeric reporting. It’s important to note that the term or terms to be designated as dimensions will vary between implementations, depending on the use case and requirements, and so are not prescribed as part of the Latimer Core model. They can, however, be defined for an implementation using the CollectionDescriptionScheme structure, as described earlier in the document.
Terms that can have a many-to-many relationship with the ObjectGroup are referred to, for the purposes of this guidance documentation, as associations. They provide information about what is in the ObjectGroup, but cannot generally be quantified using the metrics attached to the ObjectGroup (but see the metric options discussed in section X).
There are both benefits and limitations to applying terms as either dimensions or associations.
Associations essentially allow you to use properties like tags on an ObjectGroup, attaching a number of values for the same property to a single ObjectGroup (Figure 1).
Figure 1: An ObjectGroup with two terms (Taxon and GeographicOrigin) attached as associations.
Associations enable you to:
- reflect the scope of your collections using a range of properties and their associated values
- keep the data structure relatively simple and maintainable, with a small number of ObjectGroups
- link between ObjectGroups using common properties and vocabularies
However, they will only give you basic figures for your collections, restricted to the absolute numbers in the ObjectGroups, with no breakdown by the associated properties.
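The tag-like behaviour of associations can be sketched as follows. This is an illustrative data layout only, not the Latimer Core serialisation: the `associations` key, the `groups_tagged` helper, the group identifiers and the quantities are all hypothetical.

```python
# Hypothetical ObjectGroups: each association term holds a list of values (tags).
object_groups = [
    {
        "id": "OG-1",
        "object_quantity": 500,  # invented figure, known only for the group as a whole
        "associations": {
            "Taxon": ["mammals", "reptiles"],
            "GeographicOrigin": ["Ethiopia", "Tanzania"],
        },
    },
    {
        "id": "OG-2",
        "object_quantity": 200,  # invented figure
        "associations": {
            "Taxon": ["reptiles"],
            "GeographicOrigin": ["Tanzania"],
        },
    },
]


def groups_tagged(groups, term, value):
    """Return the ids of the groups that list `value` among their values for `term`."""
    return [g["id"] for g in groups if value in g["associations"].get(term, [])]


reptile_groups = groups_tagged(object_groups, "Taxon", "reptiles")
mammal_groups = groups_tagged(object_groups, "Taxon", "mammals")

print(reptile_groups)
print(mammal_groups)
```

The shared value links the two groups, but nothing in the records says how many of the 500 objects in the first group are reptiles, which is the limitation described above.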
Using dimensions creates a consistent structure to the breakdown of the collection according to one or more common properties. Metrics can be used to accurately describe, analyse and visualise the collections in numeric terms.
As a dimension can only have one value for any given ObjectGroup, this effectively means the collections being described must be split into as many ObjectGroups as there are values for that dimension. If more than one property is designated as a dimension, then there must be an ObjectGroup for every valid combination of values in those dimensions.
Figure 2 shows an example where two terms, Taxon and GeographicOrigin, have been designated as dimensions. Each dimension has two possible values, so the collection must be split into four ObjectGroups, one for each combination of those two dimensions.
Figure 2: Breaking down a collection using two dimensions.
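A minimal sketch of how the required ObjectGroups follow from the chosen dimensions: the Cartesian product of the dimension values yields one ObjectGroup per combination. The value labels are taken from the examples used in this section, and the `enumerate_object_groups` helper is a hypothetical name. The sketch treats every combination as valid, which may not hold for a real collection.

```python
from itertools import product

# Two dimensions, each with two values, as in the example above.
dimensions = {
    "Taxon": ["mammals", "reptiles"],
    "GeographicOrigin": ["Ethiopia", "Tanzania"],
}


def enumerate_object_groups(dims):
    """Return one dict per combination of dimension values (all combinations assumed valid)."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*(dims[n] for n in names))]


groups = enumerate_object_groups(dimensions)

for group in groups:
    print(group)
print(len(groups))
```

Adding a further dimension to the dictionary multiplies the number of generated groups by the number of values in that dimension.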
Metrics (represented by the MeasurementOrFact class) are attached to each ObjectGroup, in this example ‘object quantity’. This means that it’s possible to calculate the number of objects within a dimension (e.g. 175 objects from Tanzania or 375 reptiles), across the two dimensions (e.g. 200 Ethiopian mammals, 100 Tanzanian reptiles), and for the collection as a whole by aggregating numbers from all ObjectGroups.
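The aggregation described above can be sketched in a few lines of Python. The 200 and 100 figures are quoted in the text; the other two cells (75 Tanzanian mammals, 275 Ethiopian reptiles) are not stated and are inferred here from the quoted totals of 175 and 375, so treat them as assumptions. The `total` helper and the dictionary layout are likewise hypothetical.

```python
# One record per ObjectGroup, each with a single value for every dimension.
object_groups = [
    {"Taxon": "mammals", "GeographicOrigin": "Ethiopia", "object_quantity": 200},
    {"Taxon": "mammals", "GeographicOrigin": "Tanzania", "object_quantity": 75},    # inferred
    {"Taxon": "reptiles", "GeographicOrigin": "Ethiopia", "object_quantity": 275},  # inferred
    {"Taxon": "reptiles", "GeographicOrigin": "Tanzania", "object_quantity": 100},
]


def total(groups, **filters):
    """Sum object quantities over the groups matching every dimension=value filter."""
    return sum(
        g["object_quantity"]
        for g in groups
        if all(g[dim] == value for dim, value in filters.items())
    )


from_tanzania = total(object_groups, GeographicOrigin="Tanzania")
reptiles = total(object_groups, Taxon="reptiles")
ethiopian_mammals = total(object_groups, Taxon="mammals", GeographicOrigin="Ethiopia")
whole_collection = total(object_groups)

print(from_tanzania, reptiles, ethiopian_mammals, whole_collection)
```

Passing no filters sums every ObjectGroup, giving the figure for the collection as a whole.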
In the real world, however, a collection is likely to have many more taxa and many more geographic origins, so the number of ObjectGroups in the grid would probably be considerably larger. If a third dimension is introduced, the number of ObjectGroups is multiplied again by the number of values in that new dimension. There are therefore practical limits on the number of dimensions that can be used (and the number of values in each dimension). These relate partly to the manageability of the data, but primarily to the time and effort required to estimate collection metrics at such a detailed and granular level.
Within these constraints, using dimensions is most effective for presenting collection demographics or inventories, where:
- a structured breakdown of the collection is needed using a small number of properties
- there is a need for dynamic, quantitative reporting across those properties
- there is sufficient resource available to estimate or calculate metrics for a larger number of ObjectGroups, and maintain a more complex dataset
In practice, it is likely that most use cases of the Latimer Core model would be best served by a combination of terms as dimensions and terms as associations. For example, an institution might describe its collection by applying an organisational hierarchy as a dimension, so that there is one ObjectGroup for each OrganisationalUnit defined by the institution, with associated metrics. A number of other terms might then be attached to each of those ObjectGroups as associations, to further describe and reflect the scope (e.g. taxonomic or geographic) of the objects within the group.
There are two main options for how metrics can be used within the Latimer Core data model, each with strengths and weaknesses.
Option 1: Metrics attached to the ObjectGroup
The first option (Figure 3) is to attach the metric directly to the ObjectGroup. This means that you can get an accurate figure for the total number of objects within the ObjectGroup, but not for the number of objects according to each of the terms (unless the terms are designated as dimensions, and the ObjectGroup split into multiple ObjectGroups, as described in the previous section).
This option tends to be better suited to the dimensional model, for providing precise statistics.
Figure 3: Attaching metrics directly to the ObjectGroup class.
Option 2: Metrics attached to the relationship between the ObjectGroup and each property
The second option (Figure 4) is to attach metrics to the relationships between the ObjectGroup and the terms (designated as associations). For example, an ObjectGroup has 100 objects from Tanzania, 150 objects from Ethiopia, 75 reptiles and 125 mammals. This means that you have accurate figures relating to each property, but do not know how many objects there are overall: there is no denominator. In practice, this can be achieved by embedding instances of the MeasurementOrFact class in the ResourceRelationship used to create the relationship between the ObjectGroup and the term.
Figure 4: Attaching metrics to links between classes, in order to quantify or qualify the relationship.
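One way to picture Option 2 is the sketch below, which embeds a metric in each relationship between an ObjectGroup and a term value. The class names echo the Latimer Core classes, but the fields (`measurement_type`, `measurement_value`, `term`, `value`, `metrics`) and the `quantities` method are simplified assumptions made for illustration, not the properties defined by the standard. The figures are those quoted in the example above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MeasurementOrFact:
    measurement_type: str   # hypothetical field name
    measurement_value: int  # hypothetical field name


@dataclass
class ResourceRelationship:
    term: str    # e.g. "GeographicOrigin"
    value: str   # e.g. "Tanzania"
    metrics: List[MeasurementOrFact] = field(default_factory=list)


@dataclass
class ObjectGroup:
    relationships: List[ResourceRelationship] = field(default_factory=list)

    def quantities(self) -> Dict[str, Dict[str, int]]:
        """Collect the 'object quantity' metric attached to each term/value relationship."""
        result: Dict[str, Dict[str, int]] = defaultdict(dict)
        for rel in self.relationships:
            for metric in rel.metrics:
                if metric.measurement_type == "object quantity":
                    result[rel.term][rel.value] = metric.measurement_value
        return dict(result)


def link(term: str, value: str, quantity: int) -> ResourceRelationship:
    """Build a relationship carrying a single 'object quantity' metric."""
    return ResourceRelationship(term, value, [MeasurementOrFact("object quantity", quantity)])


group = ObjectGroup([
    link("GeographicOrigin", "Tanzania", 100),
    link("GeographicOrigin", "Ethiopia", 150),
    link("Taxon", "reptiles", 75),
    link("Taxon", "mammals", 125),
])

per_value = group.quantities()
per_term_totals = {term: sum(values.values()) for term, values in per_value.items()}

print(per_value)
print(per_term_totals)
```

With these figures the per-term sums differ (250 by geographic origin, 200 by taxon), so neither can be read as the size of the ObjectGroup. This illustrates the missing denominator noted above.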
This option is better suited to a less structured, more graph-like approach to modelling the collections. It avoids the need to break the collection down into a greater number of more granular ObjectGroups, as required by the dimensional model used in Option 1, but has limitations in providing accurate quantitative data.
As with the two options of using properties as associations or dimensions, there is potential to combine these two approaches to suit the use case.
- Most terms related to an ObjectGroup can be used as either an association or a dimension.
- Associations require less effort and are good for descriptive purposes, but bad for quantitative reporting.
- Dimensions are good for quantitative reporting, but require more effort, and have limitations in how many can be applied.
- Metrics can potentially be attached either directly to the ObjectGroup, or to the relationships between the ObjectGroup and its terms.
Version | Date | Contributors | Status |
---|---|---|---|
1.x | TBD | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | Expert Review - In progress |
1.0 | 2022-06-10 | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | v1 - Archived |
0.1 | 2022-02-10 | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | Draft |