[traits.build workflow] Add field for specimen identifiers #167
One of our immediate aims is to add column(s) to the traits.build database structure that allow trait observations to be linked to herbarium records, or to each other when a dataset collector has a unique record number that links trait observations across multiple datasets. We want to be fully compliant with the DwC standard, but also minimise the number of additional fields we add to traits.build, especially as these fields will be blank for the majority of datasets. Looking through DwC, it seems there are two distinct types of "identifiers" that probably need to be added:
I don't think the two identifier categories can be merged, or we'd be diverging from the DwC meaning of each. As examples, see this record in ALA and GBIF:
https://biocache.ala.org.au/occurrences/60455440-c777-43d9-9cc0-19354cbc8403
https://www.gbif.org/occurrence/2430993462

The AusTraits team set out as a goal to change traits.build as little as possible, but I think before we do this we should contemplate whether there are any other "occurrence" metadata fields we should be adding as part of this. At the moment we don't explicitly include the concept of an "occurrence" in the traits.build structure. It is implicit via observationID and an observation's geographic location (latitude/longitude): to observe an organism in a location, on a date, it must have occurred there.

A few relevant references:
Nelson G, Sweeney P, Gilbert E (2018) Use of globally unique identifiers (GUIDs) to link herbarium specimen records to physical specimens. Applications in Plant Sciences 6, e1027. doi:10.1002/aps3.1027.
Folk RA, Siniscalchi CM (2021) Biodiversity at the global scale: the synthesis continues. American Journal of Botany 108, 912–924. doi:10.1002/ajb2.1694.
Further research suggests dwc:institutionCode will also be required to uniquely link to observations/collections in the ALA, GBIF, and other collections. For instance, for the Australian Museum, the catalog number does not include the institution code.
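To illustrate why the catalog number alone may not be enough, here is a minimal Python sketch of a composite key built from dwc:institutionCode and dwc:catalogNumber. The `SpecimenKey` type, the `specimen_key` helper, and the example values are hypothetical and are not part of traits.build or any existing API.

```python
from typing import NamedTuple, Optional


class SpecimenKey(NamedTuple):
    """Hypothetical composite key; field names mirror the DwC terms."""
    institution_code: str  # dwc:institutionCode
    catalog_number: str    # dwc:catalogNumber


def specimen_key(institution_code: Optional[str],
                 catalog_number: Optional[str]) -> Optional[SpecimenKey]:
    """Return a composite key, or None if either part is missing or blank."""
    inst = (institution_code or "").strip()
    cat = (catalog_number or "").strip()
    if not inst or not cat:
        return None
    return SpecimenKey(inst, cat)


# Made-up values: two institutions that happen to reuse a catalog number.
key_a = specimen_key("INST_A", "12345")
key_b = specimen_key("INST_B", "12345")
same_specimen = key_a == key_b  # False: the institution code disambiguates
```

Under this assumption, a record with a catalog number but no institution code yields no key at all, which makes the ambiguity explicit rather than silently matching the wrong specimen.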
DarwinCore also has a field dwc:associatedSequences which allows one to link to one or more identifiers for genetic sequence information. This is a new DarwinCore addition as part of their MaterialEntity class.
Further thoughts with @dfalster
This will be easy to implement and has the advantages that:
Comments copied from issue #169:

One of our immediate aims is to add column(s) to the traits.build database structure that allow trait observations to be linked to herbarium records, or to each other when a dataset collector has a unique record number that links trait observations across multiple datasets. We want to be fully compliant with the DwC standard, but also minimise the number of additional fields we add to traits.build, especially as these fields will be blank for the majority of datasets. Looking through DwC, it seems there are two distinct types of "identifiers" that probably need to be added:

I don't think the two identifier categories can be merged, or we'd be diverging from the DwC meaning of each. As examples, see this record in ALA and GBIF:
https://biocache.ala.org.au/occurrences/60455440-c777-43d9-9cc0-19354cbc8403
https://www.gbif.org/occurrence/2430993462

The AusTraits team set out as a goal to change traits.build as little as possible, but I think before we do this we should contemplate whether there are any other "occurrence" metadata fields we should be adding as part of this. At the moment we don't explicitly include the concept of an "occurrence" in the traits.build structure. It is implicit via observationID and an observation's geographic location (latitude/longitude): to observe an organism in a location, on a date, it must have occurred there.

A few relevant references:
Nelson G, Sweeney P, Gilbert E (2018) Use of globally unique identifiers (GUIDs) to link herbarium specimen records to physical specimens. Applications in Plant Sciences 6, e1027. doi:10.1002/aps3.1027.
Folk RA, Siniscalchi CM (2021) Biodiversity at the global scale: the synthesis continues. American Journal of Botany 108, 912–924. doi:10.1002/ajb2.1694.
Add field(s) to map in specimen identifiers, such as for trait data linked to herbarium vouchers or trait data where the same specimen/individual is measured in multiple datasets.
We need to consult with ALA / GBIF to ensure we include the field(s) that are most used across global biodiversity databases. Possibly we'll need two fields: one for the more generic case of "same individual measured in different datasets" and a second, more formal one for herbarium vouchers.
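As a rough sketch of the two-field idea, the Python below shows how a generic identifier could link observations of the same individual across datasets while a separate voucher field stays empty for most rows. The column names `individual_identifier` and `voucher_identifier`, the dataset names, and all values are placeholders invented for illustration; the actual field names would come out of the ALA / GBIF consultation.

```python
from collections import defaultdict

# Made-up observation rows; both identifier fields are usually blank (None).
observations = [
    {"dataset_id": "dataset_A", "observation_id": "01",
     "individual_identifier": "ind_007", "voucher_identifier": None},
    {"dataset_id": "dataset_B", "observation_id": "14",
     "individual_identifier": "ind_007", "voucher_identifier": "VOUCHER 123"},
    {"dataset_id": "dataset_B", "observation_id": "15",
     "individual_identifier": None, "voucher_identifier": None},
]


def link_across_datasets(rows, field):
    """Map each identifier value to the datasets it appears in,
    keeping only identifiers shared by more than one dataset."""
    groups = defaultdict(set)
    for row in rows:
        value = row.get(field)
        if value:  # skip blank identifiers
            groups[value].add(row["dataset_id"])
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}


linked = link_across_datasets(observations, "individual_identifier")
vouchered = [r for r in observations if r["voucher_identifier"]]
```

Keeping the two fields separate means a row can carry a cross-dataset link without implying that a physical voucher exists, and vice versa.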