Skip to content

Data dictionary for Study Schema

darla-leafnode edited this page May 6, 2019 · 43 revisions

Introduction

Study module (also known as DMS - Data Management System) of BMS is based on Chado natural diversity module. Here we will focus on key table structures and how they are used in BMS.

It is highly recommended that following conceptual and background information documents be read first:

project Table

Column Name Description Type Size (bytes)
project_id DB Record ID of a dataset. One study (Nursery/Trial) comprises of three datasets (STUDY, PLOT and ENVIRONMENT). Integer 4
name Name of the dataset. Text 255
description Description of the dataset Text 255
program_uuid Unique ID of the program record in workbench_project table in workbench database. Text 36
deleted Indicates if the project is deleted. tinyint 11
start_date Start date of the project. Varchar 8
end_date End date of the project. Varchar 8
study_update Date of the update of the project. Varchar 8
objective Objective of the project. Varchar 255
created_by Id of the creator of the project. Varchar 255
study_type Type of the project. int 11

Following example shows the three (dataset) rows in project table that are inserted when a trial (named T1) is created in BMS:

+------------+----------------+----------------+--------------------------------------+-------+----------+--------+--------+-----------+-----------+-----------+
| project_id | name           | description    | program_uuid                         |deleted|start_date|end_date|study_update|objective|created_by|study_type|
+------------+----------------+----------------+--------------------------------------+-------+----------+--------+--------+-----------+-----------+-----------+
|      25007 | T1             | T1             | 36ad2439-9e16-47ae-bf72-355be4c1a764 |0      |20160201|20180901|20180901|objective_test | 1      |   6     |
|      25008 | T1-ENVIRONMENT | T1-ENVIRONMENT | 36ad2439-9e16-47ae-bf72-355be4c1a764 |0      |NULL    |NULL    |NULL    |NULL           |NULL    | NULL    |
|      25009 | T1-PLOTDATA    | T1-PLOTDATA    | 36ad2439-9e16-47ae-bf72-355be4c1a764 |0      |NULL    |NULL    |NULL    |NULL           |NULL    | NULL    |
+------------+----------------+----------------+--------------------------------------+-------+----------+--------+--------+-----------+-----------+-----------+
***

projectprop Table

Stores name/value pair style properties for each of the dataset in project table.

Column Name Description Type Size (bytes)
projectprop_id Primary key. DB record id. Integer 4
project_id Foreign key to the project table record of the dataset. Integer 4
type_id Foreign key to the cvterm table. Represents the "name" part of the name/value property. Integer 4
value Value of the property Text 255
rank One property is described in this table with multiple rows. Rank is used to group related properties. Integer 4
variable_id Id of the variable in cvterm Integer 11
alias Alias of the variable Varchar 255

listdata_project Table

Stores "snapshot" of the Germplasm list entries that are used in the study. Similar to listdata table of the List Management module. This is a copy made to preserve the list entries which were part of the germplasm list when it was selected in Nursery/Trial. This is to ensure that when original Germplasm list is modified in anyway through list manager, the list entries used in Nursery/Trial are not affected.

Column Name Description Type Size (bytes)
listdata_project_id Primary key. DB record id. Integer 4
list_id Foreign key to the listnms record with metadata about the list. Integer 4
germplasm_id Foreign key to the germplsm table, the GID of the list entry. Integer 4
check_type Foreign key to the cvterm record describing the type of entry (test entry/check entry). Integer 4
entry_id Entry id. Unique number within a list. Integer 4
entry_code Entry code. Textual id of the list entry. Text 47
seed_source Seed source Text 255
designation Designation/name of the Germplasm Text 255
group_name Cross name of the germplasm Text 255
duplicate_notes Used to list all the duplicates and reciprocals (plot or pedigree level). It is populated when you create crosses. This is useful in determining which of the stocks can be bulked together. Text 200
crossing_date Date of the Cross Integer 11
notes Notes Varchar 200

project_relationship Table

Defines relationships for the three dataset entries in project table.

Column Name Description Type Size (bytes)
project_relationship_id Primary key. DB record id. Integer 4
subject_project_id Subject part of the relationship. Integer 4
object_project_id Object part of the relationship. Integer 4
type_id Foreign key to the cvterm table which defines the type of relationship Integer 4

nd_experiment Table

From Chado wiki:

This is the core table for the natural diversity module, representing each individual assay that is undertaken (nb this is usually not an entire experiment). Each nd_experiment should give rise to a single genotype or phenotype and be described via 1 (or more) protocols. Collections of assays that relate to each other should be linked to the same record in the project table. Experiment.type is a cvterm that will define which records are expected for other tables. Any CV may be used but it was designed with terms such as: [phenotype_assay, genotype_assay, field_collection, cross_experiment] in mind.

Each row in the table represents one observation unit in a study against which many measurements/observations are recorded. Equates to a plot.

Column Name Description Type Size (bytes)
nd_experiment_id Primary key. DB record id. Integer 4
nd_geolocation_id Foreign key joining to the nd_geolocation table. One study can have many geo locations e.g. in multi-location international trials. Integer 4
type_id Foriegn key to cvterm that describes the type of observation unit e.g. Plot, environment etc. Integer 4
obs_unit_id Observation Unit id. Varchar 36
project_id Foreign key to project table. Integer 11
stock_id Foreign key to stock table. Integer 11
parent_id Parent dataset Id. Integer 11
observation_unit_no Observation Unit Number. Integer

nd_experimentprop Table

Stores name/value pair style properties for each of experiment units in nd_experiment table.

Column Name Description Type Size (bytes)
nd_experimentprop_id Primary key. DB record id. Integer 4
nd_experiment_id Forieng key to nd_experiment table record for which this record defines a property. Integer 4
type_id Foreign key to the cvterm table. Represents the "name" part of the name/value property. Integer 4
value Value of the property Text 255
rank One property could be described in this table with multiple rows. Rank is used to group related properties. Integer 4

phenotype Table

This is where measurements, trait observations are stored for each observation unit in study.

Column Name Description Type Size (bytes)
phenotype_id Primary key. DB record id. Integer 4
uniquename Probably not used. Text 255
name Always same as observable_id in but text form. TODO insert explanation from Graham's email here. Text 255
observable_id Foreign key to cvterm table which represents the variable (trait) being measured. Integer 4
attr_id Probably not used Integer 4
value Value of the trait or characteristic observed. Text 255
cvalue_id Probably not used. Integer 4
assay_id Probably not used Integer 4
nd_experiment_id Foreign key to nd_experiment table Integer 11
created_date Creation Date TIMESTAMP
updated_date Modification Date TIMESTAMP

phenotype_outlier Table

Column Name Description Type Size (bytes)
phenotype_outlier_id Primary key. DB record id. Integer 4
phenotype_id Foreign key to the phenotype table Integer 4
value Value of the outlying data for the phenotype. Text 45
date_modified Date last modified. Integer 4

stock table

Stores information about the Germplasm material used in the study.

Column Name Description Type Size (bytes)
stock_id Primary key. DB record id. Integer 4
dbxref_id Foreign key to the germplsm table Integer 4
organism_id Probably unused. Integer 4
name Designation/name of the Germplasm stock used in study Text 255
uniquename GID of the germplasm. TODO Confirm. Text 255
value Probably unused.TODO Confirm. Text 255
description Probably unused.TODO Confirm. Text 255
type_id Foreign key to the cvterm table. TODO confirm usage. Integer 4
is_obsolete Probably unused. TODO Confirm. Tiny Int 1

nd_geolocation Table

Column Name Description Type Size (bytes)
nd_geolocation_id Primary key. DB record id. Integer 4
description Description of the geographic location Text 255
latitude Latitude. Float 4
longitude Longitude. Float 4
geodetic_datum Geodetic datum. Text 32
altitude Altitude. Float 4

nd_geolocationprop Table

Name/Value pair style propertied associated with a study location/instance.

Column Name Description Type Size (bytes)
nd_geolocationprop_id Primary key. DB record id. Integer 4
nd_geolocation_id Foreign key to the nd_geolocation table. Integer 4
type_id Foreign key to the cvterm table, representing the "name" part of the property Integer 4
value Value of the geolocation property Text 255
rank Groups related geo location properties Integer 4
Clone this wiki locally