This repository contains general notes and scripts for loading data into T3 breeDBase instances.
- Clone SGN and instance repositories for new instance
cd /opt/breedbase/repos
# Clone the repos
git clone [email protected]:TriticeaeToolbox/sgn ./sgn_avena
git clone [email protected]:TriticeaeToolbox/avena ./avena
# Switch to the SGN t3/master branch
cd sgn_avena
git checkout t3/master
- Create instance config file and modify contents
cp /opt/breedbase/config/triticum.conf /opt/breedbase/config/avena.conf
- Set
dbname
=cxgn_avena
- Update path names
s/triticum/avena/g
- Set
main_production_site_url
=https://oat.triticeaetoolbox.org
- Update ontology variables
onto_root_namespaces
,trait_cv_name
,trait_variable_onto_root_namespace
- Create docker mount points
cd /opt/breedbase/mnt
mkdir avena
mkdir avena/archive avena/prod avena/public
- Add instance to
/opt/breedbase/bin/breedbase
script
- Set instance-specific paths and variables
- Setup NGINX configuration
cp /etc/nginx/sites-available/wheat /etc/nginx/sites-available/avena
ln -s /etc/nginx/sites-available/avena /etc/nginx/sites-enabled/avena
- Set
server_name
=https://oat.triticeaetoolbox.org
- Modify http redirect
return
URL - Modify port in
proxy_pass
- Remove
ssl_certificate
andssl_certificate_key
directives (added bycertbot
)
# Get SSL certificates via certbot and Let's Encrypt
certbot --nginx
- Create database based off of the
breedbase
template
CREATE DATABASE cxgn_avena WITH TEMPLATE breedbase;
- Set permissions for
web_usr
- See
/sql/web_usr_grants.sql
for permissions
- Run Database Patches
# From avena web docker container
cd /home/production/cxgn/sgn/db
export PERL_USE_UNSAFE_INC=1 # Allow the current directory in INC
./run_all_patches.pl -u postgres -p "postgres_pass" -e admin -h breedbase_db -d cxgn_avena
If the node JS modules don't install properly:
Example: if there is a permission error during the build process when performing a git clone
- Install NodeJS on the docker host machine, if not already The web docker container currently has node version 10 installed
# From the docker host
curl -sL https://deb.nodesource.com/setup_10.x | bash -
apt-get install nodejs
- Install the node packages from the docker host machine These will get mounted to the SGN repo via Docker
# From the docker host, DO NOT RUN AS ROOT
cd {BB_HOME}/repos/{SGN_REPO}/js
rm -rf node_modules
npm cache clean -f
npm install
- Restart the web container
Follow the workflow described in ontology-scripts/WORKFLOW.md
- If the CV
composable_cvtypes
is missing, make sure the DB patch00073
has been run.
There are two perl scripts to aid the bulk loading of breeding programs.
The first ./bin/breeding_programs/add_breeding_program.pl
script can be used to add a single Breeding Program
to the specified database with a specified name and description.
# Add a single breeding program to the database
./bin/breeding_programs/add_breeding_program.pl -H localhost -D cxgn_avena -U postgres -P postgrespass -n "Test Breeding Program" -d "This is a Breeding Program used for testing"
The second ./bin/breeding_programs/add_breeding_programs_bulk.pl
script can be used to parse a CSV file containing
Breeding Program information (designed to use the T3 contributing data programs table). Each row of a breeding
program's information will added to the database if the name does not yet exist.
Example CSV file:
# From ./data/breeding_programs/oat_breeding_programs.csv
Breeding Program,Code,Collaborator,Description,Institution
"AAES, Auburn University",AUB,Kathryn Glass,"Alabama Agricultural Experiment Station (AAES), Auburn University, AL-USA.",Auburn University
AAFC Agassiz,ABC,,"Agriculture and Agri-Food Canada (AAFC) Pacific Agri-Food Research Centre (PARC) in Agassiz, BC-CAN.",Agriculture and Agri-Food Canada
AAFC Brandon,MTB,Jennifer W. Mitchell-Fetch,"Agriculture and Agri-Food Canada (AAFC) Brandon Research Centre, MB-CAN.",Agriculture and Agri-Food Canada
# Parse a CSV file of breeding program information
./bin/breeding_programs/add_breeding_programs_bulk.pl -H localhost -D cxgn_avena -U postgres -P postgrespass -d ./data/breeding_programs/oat_breeding_programs.csv
Accessions can be bulk loaded using an Excel template through the web site (Manage > Accessions).
The breedbase R package has helper functions to help create
a breedbase accession template. The R script ./bin/accessions/add_accessions.R
contains the createAccessions()
function which will read the T3 line information file and create a list of Accessions. These Accessions can be
passed to the breedbase::writeAccessionTemplate()
function to create the breedbase accession template file(s).
# Create Accessions from the T3 line information file
a <- createAccessions(lines="./data/accessions/oat_line_records.csv", programs="./data/breeding programs/oat_breeding_programs.csv", genus="Avena")
# Create a breedbase acccession template table
t <- buildAccessionTemplate(a)
# Write the breedbase accession template table to files
writeAccessionTemplate(t, output="./templates/accessions/oat_accessions.xls", chunk=6000)
Once created, the accession template file(s) can be uploaded to the breedbase website.
Pedigrees can be stored in two different ways:
-
The official Breedbase way is to assign two parents to each Accession entry. Each of the parents, in turn, need to be Accession entries themselves. This method allows the interactive pedigree viewer to work.
-
The simplified T3 way is to add a Purdy pedigree string as the pedigree property of the Accession. The pedigree property is not parsed by the database and is displayed on the Stock detail page. This method allows a user to search for Accessions on the text of the pedigree string.
Add Pedigrees the Breedbase way:
A pedigree file (a spreadsheet containing a table of Accessions and their parent names) can be uploaded through the website from the Manage > Accessions page. The breedbase R package has helper functions to help create the spreadsheet template.
Add Pedigrees the T3 way:
Purdy Pedigree strings no longer have to be added to the database separately. The T3 breedbase instances support
the Accession properties of purdy_pedigree
and filial_generation
and can be added as an Accession property
directly from the Accession upload template. The breedbase R package has helper functions to help create the Accession
upload template.
First, the pedigree stock property needs to be added to the database. The add_stockprop_term.pl
perl script can be used:
perl ./bin/accessions/add_stockprop_term.pl -H localhost -D cxgn_avena -n pedigree -d "Purdy pedigree string of an Accession"
Then, a file (following the T3 bulk line information format):
# pedigrees.csv
Name Species GRIN Synonym Breeding Program Parent1 Parent2 Pedigree Description
1 02-18228 Pio25R26/ 9634-24437//95-4162
2 02-194638-1 Patton / Cardinal // 96-2550
can be used to define the pedigrees for each of the Accessions. The add_stock_pedigrees.pl
perl script can be used to add the pedigrees as stock properties of the Accessions:
perl ./bin/accessions/add_stock_pedigrees.pl -H localhost -D cxgn_avena -d pedigrees.csv
Finally, in order for the pedigree property to be displayed on the Stock detail page, the editable_stock_props
configuration variable (sgn_local.conf
) needs to have pedigree
appended to it.
To generate the sql statements for all of the permissions for the web_usr
:
pg_dump -h localhost -U postgres -d cxgn_triticum_test --schema-only | grep -e '^\(ALTER\|GRANT\) .* TO web_usr;$' > grants.sql