Define minimal metadata templates for different levels #79

Open
JolandaS opened this issue Mar 19, 2020 · 4 comments

@JolandaS
Collaborator

Define minimal data, based on the transcriptomics metadata/ontologies found (see #75), for different levels:

  • shallow (e.g. disease) - not necessarily transcriptomics specific
  • data type specific (e.g. cell type)
  • experiment (e.g. experiment type)
@JolandaS JolandaS added the UC9 label Mar 19, 2020
@JolandaS JolandaS added this to the Virtual meeting End of April 2020 milestone Mar 19, 2020
@JolandaS
Collaborator Author

JolandaS commented Mar 19, 2020

@PeterWoollard @daniwelter @karsten-quast I opened a new issue based on our discussions yesterday, to help you move forward with the next step of defining transcriptomics templates. Please feel free to change or correct the description of this issue.

@mkoatwork
Collaborator

Template_SummarizedDataForGRIT42_V2.xlsx

This is the template for uploading summarized AMR data into a shared repository based on the GRIT42 software. As explained on the worksheet called 'Info', some columns should be filled out using the drop-down feature. All lists of values are collected in a worksheet called 'Dictionary'. Besides the lists of values, we have additional columns for the ontology terms. A simple example is shown in columns U and V: the gender 'Male' is resolved by the NCIT term http://purl.obolibrary.org/obo/NCIT_C16576. In column Q it is more complicated, because not all bacterial strains have a corresponding term in the NCBI Taxonomy. Only for the first row, 'ACIBA 19606', which is Acinetobacter baumannii 19606, is there a corresponding URI: 'http://purl.obolibrary.org/obo/NCBITaxon_575584'. For the next row we only have the species name in the NCBI Taxonomy. Therefore I proposed using a JSON structure similar to JSON-LD to describe the two terms 'species' and 'strain' as separate key-value pairs. That way we can use free text instead of a URI for all strains for which no term exists in NCBI (see cell U3).
The same approach can then be used to describe the Experiment (column A). Here we have multiple terms, like 'Accumulation' and 'bacteria', which can be described in a similar JSON-LD structure (see column B for examples). Unfortunately, here we don't have terms like 'strain' or 'species' to describe the experiment.
My questions:
Is JSON-LD useful for describing combinations of terms, as in the species-strain case, or is there another option?
If yes, what term should we use to build the key for the key:value pair (see proposals in column B)?
How should we handle multiple ontologies, as in cells B3 and B5?
Thanks
Manfred
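
For readers without access to the spreadsheet, here is a minimal sketch of the species/strain idea described in the comment above. It is not the content of the actual template cells: the key names `species` and `strain` follow the comment, while the helper functions, the use of the JSON-LD keywords `@id` (URI reference) and `@value` (plain literal), and the free-text strain name are assumptions made for illustration. The species URI uses NCBI Taxonomy ID 470 (Acinetobacter baumannii); the strain URI is the one quoted in the comment.

```python
import json

OBO = "http://purl.obolibrary.org/obo/"


def term(value):
    """Wrap a value as a URI reference if it looks like a URI, else as free text."""
    if value.startswith(("http://", "https://")):
        return {"@id": value}      # resolvable ontology term
    return {"@value": value}       # free-text fallback when no term exists


def describe_organism(species, strain):
    """Return species and strain as two separate key-value pairs (hypothetical helper)."""
    return {"species": term(species), "strain": term(strain)}


# Strain with its own NCBI Taxonomy entry (URI taken from the comment above).
row_with_strain_uri = describe_organism(
    species=OBO + "NCBITaxon_470",
    strain=OBO + "NCBITaxon_575584",
)

# Strain without an NCBI Taxonomy entry: only the species gets a URI.
# The strain name here is a made-up placeholder, not a value from the template.
row_without_strain_uri = describe_organism(
    species=OBO + "NCBITaxon_470",
    strain="hypothetical strain name",
)

print(json.dumps(row_with_strain_uri, indent=2))
print(json.dumps(row_without_strain_uri, indent=2))
```

Keeping the two keys separate means a consumer can always resolve the species, even when the strain is only available as free text. A full JSON-LD document would also need an `@context` that maps `species` and `strain` to agreed property URIs, which is essentially the second question above.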

@PeterWoollard
Collaborator

Hi Manfred,
I agree, handling species and strain as two separate entities is good practice.
(And yes, I would like to see the mammalian organisms with scientific name + NCBI Taxonomy IDs.)
Column B? There I am seeing batch-id Noso001-1.
Yes, URIs or URLs make sense for FAIRness.

Obviously, for human non-pathogen experiments, which before COVID were the majority, this template works. For other experiments, a slightly different template will be needed. It is a good exemplar, though.

Thanks,
Peter

@mkoatwork
Collaborator

mkoatwork commented Apr 1, 2020 via email

@JolandaS JolandaS removed this from the Virtual meeting End of April 2020 milestone May 13, 2020