GitHub - Parks-Laboratory/rqtl_pipeline

Synopsis

The R/QTL Mapping Pipeline is a collection of scripts that streamline the process of building input files for Karl Broman's quantitative trait loci analysis package for R. The scripts connect to a database containing genotype data, filter markers using PLINK, and finally use this subset of markers along with some phenotype data to build csvsr-formatted input files for R/QTL.

Outline

make R/QTL phenotype input --> make PLINK inputs --> run PLINK --> make R/QTL genotype input --> perform R/QTL mapping
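The outline above is a strictly ordered chain in which each stage consumes the output of the previous one. As a rough illustration only (the real driver is run_pipeline.cmd; `run_stages` and the stage callables below are hypothetical stand-ins), the control flow could be sketched like this:

```python
def run_stages(stages):
    """Run (name, callable) pairs in order.

    Stops at the first callable that returns False and returns the
    names of the stages that completed. Hypothetical helper, not part
    of the repository.
    """
    completed = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


# Stage names taken from the outline; the callables are placeholders.
OUTLINE = [
    ("make R/QTL phenotype input", lambda: True),
    ("make PLINK inputs", lambda: True),
    ("run PLINK", lambda: True),
    ("make R/QTL genotype input", lambda: True),
    ("perform R/QTL mapping", lambda: True),
]

if __name__ == "__main__":
    print(run_stages(OUTLINE))
```

Stopping at the first failed stage matters here because a later stage would otherwise run on missing or stale files.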

Usage

1. Place a copy of run_pipeline.cmd in the directory containing the phenotype data file.
2. Set the parameters in run_pipeline.cmd.
3. (Optional) For batch mapping on UW-Madison's Condor HTC cluster, place a copy of make_rdata.r in the same directory as run_pipeline.cmd and specify the mapping jobs to run in make_rdata.r (see the comments in make_rdata.r for details on specifying mapping jobs, and map.r for general mapping information).
4. Execute run_pipeline.cmd in Windows Command Prompt* by typing:
```
 run_pipeline.cmd
```
* Note: Shift+Right-click inside a directory in File Explorer and select "Open command window here" to start Windows Command Prompt in that directory.

For interactive mapping:

1. Open interactive_mapping/[rqtl_mapping.r](interactive_mapping/README.md) with RStudio or RGui.
2. After loading the data into a cross object, choose blocks of code to run.

For batch mapping on UW-Madison Cluster:

See the [batch_mapping documentation](batch_mapping/README.md).

Summary of primary scripts

run_pipeline.cmd
- the backbone of the pipeline; calls the scripts in the sub-directories and make_rdata.r
- produces the directory in which all generated files (including the files for R/QTL mapping) are placed
make_rdata.r
- outputs the file used to perform mapping on the Condor HTC cluster
- called by run_pipeline.cmd
filter_markers/make_plink_inputs.py
- outputs the input files for PLINK, which then filters the markers down to the subset that meets the specified conditions (e.g. allele frequency, maximum missing rate)
- called by run_pipeline.cmd
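
To make the filtering step concrete, here is a minimal sketch of how a PLINK 1.9 filtering command could be assembled. It is not taken from the repository: the function name, the file prefixes, and the threshold defaults are hypothetical, and it assumes text-format (.ped/.map) inputs read with `--file`. The flags themselves (`--maf`, `--geno`, `--write-snplist`, `--out`) are standard PLINK 1.9 options.

```python
def build_plink_filter_command(input_prefix, output_prefix,
                               min_maf=0.05, max_missing=0.1):
    """Return an argument list for a PLINK 1.9 marker-filtering run.

    Hypothetical helper; the thresholds are illustrative defaults,
    not the pipeline's actual settings.
    """
    return [
        "plink",
        "--file", input_prefix,      # reads input_prefix.ped / input_prefix.map
        "--maf", str(min_maf),       # drop markers below this minor allele frequency
        "--geno", str(max_missing),  # drop markers above this missing-call rate
        "--write-snplist",           # write IDs of surviving markers to .snplist
        "--out", output_prefix,
    ]


if __name__ == "__main__":
    print(" ".join(build_plink_filter_command("plink_input", "filtered")))
```

The list can be handed to `subprocess.run(..., check=True)` once PLINK is on the PATH; building it as a list rather than a single string avoids quoting problems with paths that contain spaces.
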
make_rqtl_inputs/src/[make_rqtl_inputs.py](make_rqtl_inputs/README.md)
- outputs files with genotype and phenotype information in the csvsr format specified by R/QTL
- called by run_pipeline.cmd
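
For readers unfamiliar with csvsr, the sketch below writes a tiny genotype/phenotype pair in that layout. It reflects one reading of R/QTL's `read.cross` documentation (rotated, two-file CSV: one row per marker or phenotype, one column per individual, with a shared ID row) and should be checked against that documentation and make_rqtl_inputs/README.md. The function names and sample values are hypothetical and are not the script's actual code.

```python
import csv
import io


def csvsr_genotype_text(id_label, individuals, markers):
    """Build csvsr-style genotype text.

    markers: iterable of (marker_name, chromosome, position, genotypes).
    The header row holds the ID label, two blank cells (above the
    chromosome and position columns), then the individual IDs.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([id_label, "", ""] + list(individuals))
    for name, chrom, pos, genotypes in markers:
        writer.writerow([name, chrom, pos] + list(genotypes))
    return buf.getvalue()


def csvsr_phenotype_text(id_label, individuals, phenotypes):
    """Build csvsr-style phenotype text: one row per phenotype plus an ID row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name, values in phenotypes.items():
        writer.writerow([name] + list(values))
    writer.writerow([id_label] + list(individuals))
    return buf.getvalue()


if __name__ == "__main__":
    mice = ["m1", "m2"]
    print(csvsr_genotype_text("id", mice, [("rs1", "1", "3.0", ["A", "H"])]))
    print(csvsr_phenotype_text("id", mice, {"weight": [21.5, 19.0]}))
```

The ID label must be the same in both files so that R/QTL can match individuals between the genotype and phenotype data.
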
[batch_mapping/*](batch_mapping/README.md)
- collection of scripts that perform R/QTL mapping on the UW-Madison CHTC cluster
interactive_mapping/[rqtl_mapping.r](interactive_mapping/README.md)
- R script with commonly used mapping commands, for use in R interactive session

Requirements

The following programs should be installed and available on the Windows PATH:

- [PLINK 1.9](https://www.cog-genomics.org/plink2)
- [Python 3.X](https://www.python.org/) (tested on Python 3.5)
- [R](https://cran.r-project.org/) (tested on R 3.2.4)

Required Python modules:

- [pyodbc](https://mkleehammer.github.io/pyodbc/)

Install Python modules from Windows Command Prompt with:

python -m pip install SomeModule
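
Before running the pipeline, it can help to confirm that the requirements above are actually visible to Python. The following is a small optional sketch, not part of the repository; the executable names `plink` and `Rscript` are assumptions about how the programs are installed on a given machine.

```python
import importlib.util
import shutil


def missing_programs(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the top-level Python modules from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to the local installation.
    print("Programs not on PATH:", missing_programs(["plink", "python", "Rscript"]))
    print("Python modules not installed:", missing_modules(["pyodbc"]))
```

Both functions return an empty list when every requirement is found.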
