Merge pull request #75 from databio/dev
Change default deduplication tool and improve messaging
Showing 188 changed files with 44,002 additions and 266 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
--- | ||
title: "PEPATAC BiocProject" | ||
author: "Michal Stolarczyk" | ||
date: "`r Sys.Date()`" | ||
output: rmarkdown::html_vignette | ||
vignette: > | ||
%\VignetteIndexEntry{PEPATAC BiocProject} | ||
%\VignetteEngine{knitr::rmarkdown} | ||
%\VignetteEncoding{UTF-8} | ||
--- | ||
|
||
|
||
```{r setup, include = FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
# Introduction | ||
|
||
Before you start, see the [Getting started with `BiocProject` vignette](http://code.databio.org/pepr/articles/gettingStarted.html) for basic `BiocProject` information and installation instructions, and the [`PEPATAC` website](http://code.databio.org/PEPATAC/) for information about this ATAC-seq pipeline. | ||
|
||
# Read the results of `PEPATAC` | ||
|
||
The function shown below reads in the [`BED` files](https://genome.ucsc.edu/FAQ/FAQformat.html) from the `output_dir` specified in the [PEP](https://pepkit.github.io/docs/simple_example/) (more precisely, in its YAML config file). | ||
|
||
```{r include=FALSE, eval=TRUE} | ||
processFunction = "readPepatacPeakBeds.R" | ||
source(processFunction) | ||
``` | ||
```{r echo=FALSE, comment=""} | ||
readPepatacPeakBeds | ||
``` | ||
|
||
## Get the project config | ||
```{r echo=T,message=FALSE} | ||
library(BiocProject) | ||
ProjectConfig = "gold_hg19.yaml" | ||
``` | ||
## Create the `BiocProject` object | ||
|
||
```{r} | ||
bp = BiocProject(file=ProjectConfig) | ||
``` | ||
|
||
## Get the read data | ||
|
||
```{r} | ||
data = getData(bp) | ||
``` | ||
The data is packed into a nested list, so to access a specific element, run, e.g.: | ||
```{r} | ||
data[[1]]$gold1 | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
sample_name,sample_description,treatment_description,organism,protocol,data_source,SRR,SRX,Sample_geo_accession,Sample_series_id,read_type,Sample_instrument_model,read1,read2 | ||
gold1,ATAC-seq from dendritic cell (ENCLB065VMV),Homo sapiens dendritic in vitro differentiated cells treated with 0 ng/mL Lipopolysaccharide for 0 hours,human,ATAC-seq,SRA,SRR5210416,SRX2523872,GSM2471255,GSE94182,PAIRED,Illumina HiSeq 2000,SRA_1,SRA_2 | ||
gold2,ATAC-seq from dendritic cell (ENCLB811FLK),Homo sapiens dendritic in vitro differentiated cells treated with 0 ng/mL Lipopolysaccharide for 0 hours,human,ATAC-seq,SRA,SRR5210450,SRX2523906,GSM2471300,GSE94222,PAIRED,Illumina HiSeq 2000,SRA_1,SRA_2 | ||
gold3,ATAC-seq from dendritic cell (ENCLB887PKE),Homo sapiens dendritic in vitro differentiated cells treated with 0 ng/mL Lipopolysaccharide for 0 hours,human,ATAC-seq,SRA,SRR5210398,SRX2523862,GSM2471249,GSE94177,PAIRED,Illumina NextSeq 500,SRA_1,SRA_2 | ||
gold4,ATAC-seq from dendritic cell (ENCLB586KIS),Homo sapiens dendritic in vitro differentiated cells treated with 0 ng/mL Lipopolysaccharide for 0 hours,human,ATAC-seq,SRA,SRR5210428,SRX2523884,GSM2471269,GSE94196,PAIRED,Illumina HiSeq 2000,SRA_1,SRA_2 | ||
gold5,ATAC-seq from dendritic cell (ENCLB384NOX),Homo sapiens dendritic in vitro differentiated cells treated with 0 ng/mL Lipopolysaccharide for 0 hours,human,ATAC-seq,SRA,SRR5210390,SRX2523854,GSM2471245,GSE94173,PAIRED,Illumina HiSeq 2000,SRA_1,SRA_2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Run gold standard samples through ATACseq pipeline. | ||
name: gold_hg19 | ||
|
||
metadata: | ||
sample_annotation: "$PROCESSED/gold/pepatac/hg19/gold_atac_annotation.csv" | ||
output_dir: "$PROCESSED/gold/pepatac/hg19/10_08_18_wo" | ||
pipeline_interfaces: "$CODE/pepatac/pipeline_interface.yaml" | ||
|
||
derived_columns: [read1, read2] | ||
|
||
data_sources: | ||
SRA_1: "${SRAFQ}{SRR}_1.fastq.gz" | ||
SRA_2: "${SRAFQ}{SRR}_2.fastq.gz" | ||
|
||
implied_columns: | ||
organism: | ||
human: | ||
genome: hg19 | ||
macs_genome_size: hs | ||
|
||
bioconductor: | ||
read_fun_name: readPepatacPeakBeds | ||
read_fun_path: readPepatacPeakBeds.R |
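As a rough illustration of how the `implied_columns`, `derived_columns`, and `data_sources` sections above are meant to be resolved for one sample, here is a minimal Python sketch. It is not peppy's implementation: the helper names `imply` and `derive` are hypothetical, and the `/data/sra/` value for `SRAFQ` is an assumption made only so the example runs.

```python
import os

# Stand-ins for the config sections shown above.
DATA_SOURCES = {
    "SRA_1": "${SRAFQ}{SRR}_1.fastq.gz",
    "SRA_2": "${SRAFQ}{SRR}_2.fastq.gz",
}
DERIVED_COLUMNS = ["read1", "read2"]
IMPLIED_COLUMNS = {
    "organism": {"human": {"genome": "hg19", "macs_genome_size": "hs"}},
}


def imply(sample):
    """Add attributes implied by the value of another attribute (hypothetical helper)."""
    out = dict(sample)
    for column, by_value in IMPLIED_COLUMNS.items():
        out.update(by_value.get(out.get(column), {}))
    return out


def derive(sample):
    """Replace data-source keys in derived columns with concrete paths (hypothetical helper)."""
    out = dict(sample)
    for column in DERIVED_COLUMNS:
        template = DATA_SOURCES[out[column]]
        # Expand environment variables first, then fill in sample attributes.
        out[column] = os.path.expandvars(template).format(**out)
    return out


os.environ["SRAFQ"] = "/data/sra/"  # assumed location, for illustration only
gold1 = {"sample_name": "gold1", "organism": "human",
         "SRR": "SRR5210416", "read1": "SRA_1", "read2": "SRA_2"}
resolved = derive(imply(gold1))
print(resolved["genome"], resolved["read1"])
```

Environment variables are expanded before the `{SRR}` placeholder is filled because `str.format` would otherwise try to treat `{SRAFQ}` as a sample attribute.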
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
readPepatacPeakBeds = function(project) { | ||
# define default column names in GenomicRanges::GRanges objects | ||
DEFAULT_GRANGES_COLS = c('chr', 'start', 'end') | ||
# inferring the suffix, which is "peak_calling_" + genome_assembly, | ||
# see: pepatac.py | ||
if (length(unique(samples(project)$genome)) != 1) | ||
stop(paste0("Need one genome assembly, got ", | ||
length(unique(samples(project)$genome)), | ||
".\nCouldn't infer the path to the files.")) | ||
genome_assembly = unique(samples(project)$genome) | ||
suffix = paste0("peak_calling_", genome_assembly) | ||
# inferring prefix, which is "results_pipeline", | ||
# if not provided in PEP config, see: python peppy package | ||
prefix = ifelse(is.null(config(project)$metadata$results_subdir), | ||
"results_pipeline", config(project)$metadata$results_subdir) | ||
# get output directory from PEP | ||
outputDir = config(project)$metadata$output_dir | ||
# get sample names from PEP | ||
samples_names = samples(project)$sample_name | ||
# read the data for each sample | ||
result = lapply(samples_names, function(sample) { | ||
# use the provided arguments to construct the path | ||
dir = file.path(outputDir, prefix, sample, suffix) | ||
# find BED files in the path | ||
bedFiles = list.files(path=dir, pattern="\\.bed$") | ||
# get absolute paths to the BED files | ||
bedFilesAbs = file.path(dir,bedFiles) | ||
gr = list() | ||
# for each BED file for each sample | ||
message("reading ",length(bedFiles)," files for sample: ", sample) | ||
for (i in seq_along(bedFilesAbs)) { | ||
# read BED file | ||
df = read.table(bedFilesAbs[i]) | ||
# since the number of columns varies, name the first 3 as default and | ||
# the rest metadataX | ||
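nMeta = NCOL(df) - length(DEFAULT_GRANGES_COLS) | ||
colNames = c( | ||
DEFAULT_GRANGES_COLS, | ||
# add metadataX names only when extra columns exist (3-column BED files have none) | ||
if (nMeta > 0) paste0("metadata", seq_len(nMeta))) | ||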
colnames(df) = colNames | ||
# convert the data.frame to GenomicRanges::GRanges object | ||
gr[[i]] = GenomicRanges::GRanges(df) | ||
} | ||
names(gr) = bedFiles | ||
return(gr) | ||
}) | ||
names(result) = samples_names | ||
return(result) | ||
} |
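The path and column-naming logic of `readPepatacPeakBeds` can be sketched in Python as follows. This is only an illustrative restatement of the R code above, not part of PEPATAC or peppy; the function names `peak_dir` and `bed_column_names` are hypothetical, and POSIX-style paths are assumed.

```python
import posixpath

DEFAULT_GRANGES_COLS = ["chr", "start", "end"]


def peak_dir(output_dir, sample, genome, results_subdir=None):
    """Build <output_dir>/<results_subdir>/<sample>/peak_calling_<genome>."""
    # Fall back to "results_pipeline" when the config sets no results_subdir.
    prefix = results_subdir or "results_pipeline"
    return posixpath.join(output_dir, prefix, sample, "peak_calling_" + genome)


def bed_column_names(n_cols):
    """Name the first three columns, then metadata1..metadataN for the rest."""
    n_meta = max(n_cols - len(DEFAULT_GRANGES_COLS), 0)
    return DEFAULT_GRANGES_COLS + ["metadata%d" % i for i in range(1, n_meta + 1)]


print(peak_dir("/out", "gold1", "hg19"))
print(bed_column_names(5))
```

Clamping the metadata count at zero keeps a plain three-column BED file from receiving spurious extra column names.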
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,5 +14,5 @@ Pull requests welcome. Active development should occur in a development or featu | |
* Nathan Sheffield, [email protected] | ||
* Jason Smith, [email protected] | ||
* Ryan Corces, [email protected] | ||
* Vince Reuter, vince.reuter@gmail.com | ||
* Vince Reuter, vreuter@protonmail.com | ||
* Others... (add your name) |
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,11 @@ | ||
# Pull base image | ||
FROM phusion/baseimage:0.10.1 | ||
FROM phusion/baseimage:0.10.2 | ||
|
||
# Who maintains this image | ||
LABEL maintainer Jason Smith "[email protected]" | ||
|
||
# Version info | ||
LABEL version 0.8.5 | ||
LABEL version 0.9.1 | ||
|
||
# Use baseimage-docker's init system. | ||
CMD ["/sbin/my_init"] | ||
|
@@ -48,6 +48,7 @@ RUN pip install virtualenv && \ | |
RUN DEBIAN_FRONTEND=noninteractive apt-get --assume-yes install r-base r-base-dev && \ | ||
echo "r <- getOption('repos'); r['CRAN'] <- 'http://cran.us.r-project.org'; options(repos = r);" > ~/.Rprofile && \ | ||
Rscript -e "install.packages('argparser')" && \ | ||
Rscript -e "install.packages('data.table')" && \ | ||
Rscript -e "install.packages('devtools')" && \ | ||
Rscript -e "devtools::install_github('pepkit/pepr')" && \ | ||
Rscript -e "install.packages('data.table')" && \ | ||
|
@@ -62,7 +63,6 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get --assume-yes install r-base r-base-de | |
Rscript -e "install.packages('scales')" && \ | ||
Rscript -e "install.packages('stringr')" | ||
|
||
|
||
# Install bedtools | ||
RUN DEBIAN_FRONTEND=noninteractive apt-get install --assume-yes \ | ||
ant \ | ||
|
@@ -104,10 +104,12 @@ RUN wget https://downloads.sourceforge.net/project/bowtie-bio/bowtie2/2.3.4.1/bo | |
make install && \ | ||
ln -s /home/src/bowtie2-2.3.4.1/bowtie2 /usr/bin/ | ||
|
||
# Install picard | ||
WORKDIR /home/tools/bin | ||
RUN wget https://github.com/broadinstitute/picard/releases/download/2.18.0/picard.jar && \ | ||
chmod +x picard.jar | ||
# Install samblaster | ||
WORKDIR /home/tools/ | ||
RUN git clone https://github.com/GregoryFaust/samblaster.git && \ | ||
cd /home/tools/samblaster && \ | ||
make && \ | ||
ln -s /home/tools/samblaster/samblaster /usr/bin/ | ||
|
||
# Install UCSC tools | ||
WORKDIR /home/tools/ | ||
|
@@ -135,6 +137,11 @@ RUN git clone git://github.com/relipmoc/skewer.git && \ | |
make install | ||
|
||
# OPTIONAL REQUIREMENTS | ||
# Install picard | ||
WORKDIR /home/tools/bin | ||
RUN wget https://github.com/broadinstitute/picard/releases/download/2.18.0/picard.jar && \ | ||
chmod +x picard.jar | ||
|
||
# Install F-seq | ||
WORKDIR /home/src/ | ||
RUN wget https://github.com/aboyle/F-seq/archive/master.zip && \ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# [PEPATAC documentation](http://code.databio.org/PEPATAC) | ||
|
||
This repository is viewable at [code.databio.org/PEPATAC](http://code.databio.org/PEPATAC). It holds HTML documentation for the PEPATAC pipeline. | ||
|
||
## Building PEPATAC documentation with jekyll: | ||
|
||
`jekyll build pepatac` | ||
|
||
## Do it with `docker` or `singularity`! | ||
|
||
1. Grab the container | ||
|
||
`docker pull nsheff/jim` | ||
*or* | ||
`singularity build jim docker://nsheff/jim` | ||
|
||
2. Build the website | ||
|
||
`docker run nsheff/jim jekyll build pepatac` | ||
*or* | ||
`singularity exec jim jekyll build pepatac` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
name: PEPATAC | ||
title: PEPATAC | ||
url: "http://code.databio.org/PEPATAC" | ||
baseurl: "" | ||
include: ['pages', "howto", "assets"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
<hr> | ||
<footer> | ||
<div class="container"> | ||
<ul id="contact"> | ||
<li><a href="{{ "/contact/" | prepend: site.baseurl }}"><span class="far fa-envelope"></span> Contact Us</a></li> | ||
<li><a href="http://databio.org">Learn more about the Databio team!</a></li> | ||
</ul> | ||
</div> | ||
</footer> | ||
<!-- JavaScript --> | ||
<!-- jQuery first, then Popper.js, then Bootstrap JS --> | ||
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script> | ||
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.3/umd/popper.min.js" integrity="sha384-ZMP7rVo3mIykV+2+9J3UJ46jBk0WLaUAdn689aCwoqbBJiSnjAK/l8WvCWPIPm49" crossorigin="anonymous"></script> | ||
<script src="{{ "/assets/js/bootstrap.min.js" | prepend: site.baseurl }}" ></script> | ||
<script src="{{ "/assets/js/bootstrap-toc.js" | prepend: site.baseurl }}" ></script> | ||
<script src="{{ "/assets/js/clipboard.js" | prepend: site.baseurl }}" ></script> | ||
<script src="{{ "/assets/js/prism.js" | prepend: site.baseurl }}" ></script> | ||
<script> | ||
$(function () { | ||
$('.tree li:has(ul)').addClass('parent_li').find(' > span').attr('title', 'Collapse this branch'); | ||
$('.tree li.parent_li > span').on('click', function (e) { | ||
var children = $(this).parent('li.parent_li').find(' > ul > li'); | ||
if (children.is(":visible")) { | ||
children.hide('fast'); | ||
$(this).attr('title', 'Expand this branch').find(' > i').addClass('icon-plus-sign').removeClass('icon-minus-sign'); | ||
} else { | ||
children.show('fast'); | ||
$(this).attr('title', 'Collapse this branch').find(' > i').addClass('icon-minus-sign').removeClass('icon-plus-sign'); | ||
} | ||
e.stopPropagation(); | ||
}); | ||
}); | ||
</script> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
<!-- Bootstrap stylesheet --> | ||
<link rel="stylesheet" type="text/css" href="{{ "/assets/css/bootstrap.min.css" | prepend: site.baseurl }}" > | ||
<!-- Bootstrap ToC --> | ||
<link rel="stylesheet" type="text/css" href="{{ "/assets/css/bootstrap-toc.css" | prepend: site.baseurl }}" > | ||
<!-- Tree --> | ||
<link rel="stylesheet" type="text/css" href="{{ "/assets/css/tree.css" | prepend: site.baseurl }}" > | ||
<!-- Prism syntax highlighting --> | ||
<link rel="stylesheet" type="text/css" href="{{ "/assets/css/prism.css" | prepend: site.baseurl }}" > | ||
<!-- FontAwesome --> | ||
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.3.1/css/all.css" integrity="sha384-mzrmE5qonljUremFsqc01SB46JvROS7bZs3IO2EmfFsd15uHvIt+Y8vEf7N7fWAU" crossorigin="anonymous"> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
<nav class="navbar sticky-top navbar-expand-lg navbar-dark bg-dark" style="z-index: 100000"> | ||
<a class="navbar-left" href="#top"><img src="{{ "/assets/images/logo_pepatac_white.png" | prepend: site.baseurl }}" class="d-inline-block align-middle img-responsive" alt="PEPATAC" style="max-height:20px; margin-top:-10px; margin-bottom:-10px"></a> | ||
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarPrimary" aria-controls="navbarPrimary" aria-expanded="false" aria-label="Toggle navigation"> | ||
<span class="navbar-toggler-icon"></span> | ||
</button> | ||
<div class="collapse navbar-collapse" id="navbarPrimary"> | ||
<ul class="navbar-nav mr-auto"> | ||
<li class="nav-item active"> | ||
<a class="nav-link" href="{{ "/" | prepend: site.baseurl }}">Home<span class="sr-only">(current)</span></a> | ||
</li> | ||
<li class="nav-item dropdown"> | ||
<a class="nav-link dropdown-toggle" href="#" id="get-started-Dropdown" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="fas fa-play-circle"></span> Getting started</a> | ||
<div class="dropdown-menu" aria-labelledby="get-started-Dropdown"> | ||
<a class="dropdown-item" href="{{ "/intro/" | prepend: site.baseurl }}">Introduction</a> | ||
<a class="dropdown-item" href="{{ "/features/" | prepend: site.baseurl }}">Features and benefits</a> | ||
<a class="dropdown-item" href="{{ "/install/" | prepend: site.baseurl }}">Install and run test example</a> | ||
<a class="dropdown-item" href="{{ "/tutorial/" | prepend: site.baseurl }}">Extended tutorial</a> | ||
<a class="dropdown-item" href="{{ "/glossary/" | prepend: site.baseurl }}">Glossary</a> | ||
</div> | ||
</li> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="{{ "/howto/" | prepend: site.baseurl }}"><span class="fas fa-chalkboard-teacher"></span> How-to guides</a> | ||
</li> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="{{ "/assets/files/examples/gold/summary.html" | prepend: site.baseurl }}" rel="noopener noreferrer" target="_blank"><span class="fas fa-desktop"></span> Example output</a> | ||
</li> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="https://github.com/databio/pepatac"><span class="fab fa-github fa-lg"></span> GitHub</a> | ||
</li> | ||
</ul> | ||
<ul class="navbar-nav navbar-right"> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="http://databio.org/">Databio.org</a> | ||
</li> | ||
<li class="nav-item"> | ||
<a class="nav-link" href="http://databio.org/software/">Software &amp; Data</a> | ||
</li> | ||
</ul> | ||
</div> | ||
</nav> |