Commit

create page for data type and reference it in both sample and experiment page
wizardfan committed Sep 12, 2017
1 parent c4df61b commit ea22691
Showing 3 changed files with 55 additions and 97 deletions.
48 changes: 48 additions & 0 deletions docs/faang_data_type.md
@@ -0,0 +1,48 @@
# Data types for FAANG attributes

[BioSamples](http://www.ebi.ac.uk/biosamples) takes sample records with a set of attributes. Each attribute has a name and a value. It can also have 'Units', or a 'Term Source' and a 'Term Source ID'. The Term Source and ID allow us to refer to entries in other databases or ontologies. This is fully described on the [BioSamples help pages](http://www.ebi.ac.uk/biosamples/help/st_scd.html). The following section describes the expectations for each data type within FAANG.

### date

Dates should be reported in an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format, YYYY-MM-DD for dates or YYYY-MM for months. To ensure clarity, the format must be reported as the 'units'.
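
As an illustration of the rule above, here is a minimal sketch of how a submitter might check a date value against the format reported in its 'units'. The function name `is_valid_date` and the `FORMATS` mapping are hypothetical and are not part of any FAANG or BioSamples tooling.

```python
from datetime import datetime

# Assumed mapping from the 'units' string to a strptime pattern (hypothetical).
FORMATS = {"YYYY-MM-DD": "%Y-%m-%d", "YYYY-MM": "%Y-%m"}


def is_valid_date(value: str, units: str) -> bool:
    """Return True if value parses under the format named by units."""
    pattern = FORMATS.get(units)
    if pattern is None:
        return False
    try:
        parsed = datetime.strptime(value, pattern)
    except ValueError:
        return False
    # strptime accepts unpadded fields such as '2017-9-12';
    # the round trip rejects anything that is not zero-padded.
    return parsed.strftime(pattern) == value
```

For instance, `is_valid_date("2017-09", "YYYY-MM")` passes, while the same value reported with units 'YYYY-MM-DD' does not.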

### NCBI taxon ID

A species name and identifier from the [NCBI Taxonomy database](http://www.ncbi.nlm.nih.gov/taxonomy). For example, [human](http://www.ncbi.nlm.nih.gov/taxonomy/9606) would be described with a value of 'Homo sapiens', a term source of 'NCBI Taxonomy' and a term source ID of 9606.

### number

A number, with units specified. BioSamples recommends that units are given without abbreviations. Use of terms defined in the [UO](http://www.ebi.ac.uk/ols/ontologies/uo) is encouraged. For example, a birth weight could have a value of 1.3 and the units specified as '[kilogram](http://www.ebi.ac.uk/ols/ontologies/uo/terms?short_form=UO_0000009)'.

### protocol

A URL link to a protocol document on the FAANG FTP site. Please contact the [FAANG data coordination centre](mailto:[email protected]) to have your protocol documents added to the FTP site.

### text

Text, using US English spellings.

### URL

A URL, such as 'http://faang.org/'. Depending on the context, http, ftp or mailto links may be appropriate. Examples:

* ftp, ftp://ftp.faang.ebi.ac.uk/ftp/README
* http, http://faang.org/
* mailto, mailto:[email protected]
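
The three schemes listed above could be checked with a sketch like the following. The helper name `url_scheme_ok` and the `ALLOWED_SCHEMES` set are assumptions made for illustration; the set contains only the schemes named in this section.

```python
from urllib.parse import urlparse

# Only the schemes named in this section; extend if other schemes are accepted.
ALLOWED_SCHEMES = {"http", "ftp", "mailto"}


def url_scheme_ok(value: str) -> bool:
    """Return True if the URL uses one of the schemes listed above."""
    return urlparse(value).scheme in ALLOWED_SCHEMES
```

This only inspects the scheme; it does not check that the link resolves.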


### ontology term

A reference to an ontology term. The attribute value should be the term label. The term source should be the ontology used, and the term source ID should be an ID from that ontology. For example, cerebral cortex could be described with a term source of 'UBERON', a term source ID of 'UBERON:0000956' and a value of 'cerebral cortex'. In experiment submissions, however, direct links to ontologies cannot be submitted as attributes. The use of ontology terms is still encouraged there: set the attribute value to exactly match the term name in the ontology.
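
The difference between sample and experiment submissions can be sketched as follows. The dictionary keys (`term_source`, `term_source_id`), the function name and the attribute name `tissue` are hypothetical, chosen only for illustration; they do not reflect the actual submission format.

```python
def ontology_attribute(name, label, source, source_id, for_experiment=False):
    """Build an attribute record for an ontology term (hypothetical layout)."""
    # The value is always the term label, exactly as it appears in the ontology.
    attribute = {"name": name, "value": label}
    if not for_experiment:
        # Only sample submissions can carry the direct ontology reference.
        attribute["term_source"] = source
        attribute["term_source_id"] = source_id
    return attribute


sample_attr = ontology_attribute(
    "tissue", "cerebral cortex", "UBERON", "UBERON:0000956"
)
experiment_attr = ontology_attribute(
    "tissue", "cerebral cortex", "UBERON", "UBERON:0000956", for_experiment=True
)
```

In both cases the value 'cerebral cortex' matches the term name, so the term can still be recovered from an experiment record.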

### location

A location should be reported using three attributes:

* `location` (*text*) name of the location
* `location latitude` (*number*) latitude in decimal degrees. Units should be reported as 'decimal degrees'
* `location longitude` (*number*) longitude in decimal degrees. Units should be reported as 'decimal degrees'
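
A minimal sketch of the three attributes as records. The function name, the dictionary layout and the example place are hypothetical; the range checks are an added sanity check and are not stated in this document.

```python
def location_attributes(name, latitude, longitude):
    """Return the three location attributes as a list of records."""
    # Added sanity check (an assumption): valid ranges for decimal degrees.
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return [
        {"name": "location", "value": name},
        {"name": "location latitude", "value": latitude,
         "units": "decimal degrees"},
        {"name": "location longitude", "value": longitude,
         "units": "decimal degrees"},
    ]


attrs = location_attributes("Example Farm", 52.08, 0.18)
```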

### sample

Samples can be referred to in two ways. If the sample you need to reference is in the submission, use the sample name. If the sample was already submitted, use the BioSample ID (e.g. SAMEA2821491).
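
The two ways of referring to a sample could be told apart with a sketch like this. The regular expression is an assumed approximation of BioSample accessions based on the example above, and `resolve_sample_reference` is a hypothetical helper.

```python
import re

# Assumed accession shape, e.g. SAMEA2821491; not an official specification.
BIOSAMPLE_ID = re.compile(r"SAM[NED][A-Z]?\d+")


def resolve_sample_reference(ref, names_in_submission):
    """Classify a reference as a sample name in this submission or a BioSample ID."""
    if ref in names_in_submission:
        return ("sample name", ref)
    if BIOSAMPLE_ID.fullmatch(ref):
        return ("BioSample ID", ref)
    raise ValueError(f"unknown sample reference: {ref!r}")
```

Names in the current submission are checked first, so a locally defined sample always takes precedence.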
51 changes: 4 additions & 47 deletions docs/faang_experiment_metadata.md
@@ -14,7 +14,10 @@ Requirements are laid out like this:

* `attribute name` (*data type*) a brief description

The data types will be described later in this document. The metadata & data sharing (M&DS) group will seek guidance from the animals, samples and assays (ASA) group on what needs to be recorded here for each assay type.
The details of data types can be found [here](faang_data_type.md).

The SRA databases (ENA, NCBI, DDBJ) take experiment records with a set of attributes. Each attribute has a name and a value, and can also have units. In contrast with the [BioSamples](http://www.ebi.ac.uk/biosamples) database, they do not have direct support for ontology terms.
The metadata & data sharing (M&DS) group will seek guidance from the animals, samples and assays (ASA) group on what needs to be recorded here for each assay type.

Each assay type will require metadata in addition to the core set of common attributes. The initial set proposed is based upon the [IHEC metadata standards](http://ihec-epigenomes.org/research/reference-epigenome-standards/)

@@ -204,52 +207,6 @@ Optional:
* none


##Data types for experiment attributes

The SRA databases (ENA, NCBI, DDBJ) take experiment records with a set of attributes. Each attribute has a name and a value, and can also have units. In contrast with the [BioSamples](http://www.ebi.ac.uk/biosamples) database, they do not have direct support for ontology terms.
The following section describes the expectations for each data type within FAANG.

###date

Dates should be reported in the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format, YYYY-MM-DD. To ensure clarity, the format should be reported as the 'units'.

###number

A number, with units specified. BioSamples recommends that units are given without abbreviations. For example, a birth weight could have a value of 1.3 and the units specified as 'kilograms'.

###protocol

A URL link to a protocol document on the FAANG FTP site. Please contact the [FAANG data coordination centre](mailto:[email protected]) to have your protocol documents added to the FTP site.

###text

Text, using US English spellings.

###URL

A URL, such as 'http://faang.org/'. Depending on the context, http, ftp, mailto links may be appropriate. Examples:

* ftp, ftp://ftp.faang.ebi.ac.uk/ftp/README
* http, http://faang.org/
* mailto, mailto:[email protected]

###location

A location should be reported as using three attributes:

* `location` (*text*) name of the location
* `location latitude` (*number*) latitude in decimal degrees. Units should be reported as 'decimal degrees'
* `location longitude`(*number*) longitude in decimal degrees. Units should be reported as 'decimal degrees'


###ontology term

The text label of a term from an ontology. The attribute value should be the term label. Unlike for sample submissions, direct links to ontologies cannot be submitted as attributes. The attribute value should exactly match the term name in the ontology.

###BioSample ID

BioSample IDs are in the form SAMEA2821491. They must be used when linking the experiment to the sample record.

##Missing data

Where data cannot be included in a submission, submit one of these text values instead
53 changes: 3 additions & 50 deletions docs/faang_sample_metadata.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@ Most requirements are laid out like this:

* `attribute name` (*data type*) a brief description

The data types are described later in this document.
The details of data types can be found [here](faang_data_type.md).

[BioSamples](http://www.ebi.ac.uk/biosamples) takes sample records with a set of attributes. Each attribute has a name and a value. It can also have 'Units', or a 'Term Source' and a 'Term Source ID'. The Term Source and ID allow us to refer to entries in other databases or ontologies. This is fully described on the [BioSamples help pages](http://www.ebi.ac.uk/biosamples/help/st_scd.html).

### Common

@@ -196,55 +198,6 @@ Links to other records:

* `Derived from` (*sample*) sample name or BioSample ID for the sample or animal the cell line was derived from, where this is known and can be described within the FAANG standards (optional).

## Data types for sample attributes

[BioSamples](http://www.ebi.ac.uk/biosamples) takes sample records with a set of attributes. Each attribute has a name and a value. It can also have 'Units', or a 'Term Source' and a 'Term Source ID'. The Term Source and ID allow us to refer to entries in other databases or ontologies. This is fully described on the [BioSamples help pages](http://www.ebi.ac.uk/biosamples/help/st_scd.html). The following section describes the expectations for each data type within FAANG.

### date

Dates should be reported in an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format, YYYY-MM-DD for dates or YYYY-MM for months. To ensure clarity, the format must be reported as the 'units'.

### NCBI taxon ID

A species name and identifier from the [NCBI Taxonomy database](http://www.ncbi.nlm.nih.gov/taxonomy). For example, [human](http://www.ncbi.nlm.nih.gov/taxonomy/9606) would be described in the term with value of 'Homo sapiens', term source as 'NCBI Taxonomy' and term source ID as 9606.

### number

A number, with units specified. BioSamples recommends that units are given without abbreviations. For example, a birth weight could have a value of 1.3 and the units specified as 'kilograms'.

### protocol

A URL link to a protocol document on the FAANG FTP site. Please contact the [FAANG data coordination centre](mailto:[email protected]) to have your protocol documents added to the FTP site.

### text

Text, using US English spellings.

### URL

A URL, such as 'http://faang.org/'. Depending on the context, http, ftp, mailto links may be appropriate. Examples:

* ftp, ftp://ftp.faang.ebi.ac.uk/ftp/README
* http, http://faang.org/
* mailto, mailto:[email protected]


### ontology term

A reference to an ontology term. The attribute value should be the term label. The term source should be the ontology used, and the term source ID should be an ID from that ontology. For example, cerebral cortex could be described with a term source of 'UBERON', a term source ID of 'UBERON:0000956' and a value of 'cerebral cortex'.

### location

A location should be reported as using three attributes:

* `location` (*text*) name of the location
* `location latitude` (*number*) latitude in decimal degrees. Units should be reported as 'decimal degrees'
* `location longitude`(*number*) longitude in decimal degrees. Units should be reported as 'decimal degrees'

### sample

Samples can be referred to in two ways. If the sample you need to reference is in the submission, use the sample name. If the sample was already submitted, use the BioSample ID (e.g. SAMEA2821491).

## Missing data

Where data cannot be included in a submission, submit one of these text values instead