km bump pipeline and tool version numbers in docs (#1142)
* fix typo in CEMBA README.md

* renamed Arrays overview and added slug

The Arrays Overview on the website could only be found by clicking into the drop-down, unlike the rest of our pipelines, so I fixed it by renaming the file and adding a slug.

* update doc folder locations in READMEs

* update ATAC and CEMBA README files

* update CEMBA overview

* update cemba methods

* update exome overview

* Update README.md

---------

Co-authored-by: ekiernan <[email protected]>
kayleemathews and ekiernan authored Dec 6, 2023
1 parent 3ecd64c commit 2ee5a81
Showing 7 changed files with 25 additions and 24 deletions.
4 changes: 2 additions & 2 deletions website/docs/Pipelines/ATAC/README.md
@@ -8,7 +8,7 @@ slug: /Pipelines/ATAC/README

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| :----: | :---: | :----: | :--------------: |
| [1.1.1](https://github.com/broadinstitute/warp/releases) | October, 2023 | Kaylee Mathews | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |
| [1.1.2](https://github.com/broadinstitute/warp/releases) | December, 2023 | Kaylee Mathews | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |

## Introduction to the ATAC workflow
ATAC is an open-source, cloud-optimized pipeline developed in collaboration with members of the [BRAIN Initiative](https://braininitiative.nih.gov/) (BICCN and [BICAN](https://brainblog.nih.gov/brain-blog/brain-issues-suite-funding-opportunities-advance-brain-cell-atlases-through-centers) Sequencing Working Group) and [SCORCH](https://nida.nih.gov/about-nida/organization/divisions/division-neuroscience-behavior-dnb/basic-research-hiv-substance-use-disorder/scorch-program) (see [Acknowledgements](#acknowledgements) below). It supports the processing of 10x single-nucleus data generated with 10x Multiome [ATAC-seq (Assay for Transposase-Accessible Chromatin)](https://www.10xgenomics.com/products/single-cell-multiome-atac-plus-gene-expression), a technique used in molecular biology to assess genome-wide chromatin accessibility.
@@ -71,7 +71,7 @@ To see specific tool parameters, select the task WDL link in the table; then vie

| Task name and WDL link | Tool | Software | Description |
| --- | --- | --- | ------------------------------------ |
| [FastqProcessing as SplitFastq](https://github.com/broadinstitute/warp/blob/master/tasks/skylab/FastqProcessing.wdl) | fastqprocess | custom | Dynamically selects the correct barcode orientation, corrects cell barcodes and splits FASTQs into smaller FASTQs. The number of files output depends on either the bam_size parameter, which determines the size of the output FASTQs produced, or the num_output_files parameter, which determines the number of FASTQS that should be output. The smaller FASTQs are grouped by cell barcode with each read having the corrected (CB) and raw barcode (CR) in the read name. |
| [FastqProcessing as SplitFastq](https://github.com/broadinstitute/warp/blob/master/tasks/skylab/FastqProcessing.wdl) | fastqprocess | custom | Dynamically selects the correct barcode orientation, corrects cell barcodes, and splits FASTQ files. The number of files output depends on either the `bam_size` parameter, which determines the size of the output FASTQ files produced, or the `num_output_files` parameter, which determines the number of FASTQ files that should be output. The smaller FASTQ files are grouped by cell barcode with each read having the corrected (CB) and raw barcode (CR) in the read name. |
| [TrimAdapters](https://github.com/broadinstitute/warp/blob/master/pipelines/skylab/multiome/atac.wdl) | Cutadapt v4.4 | cutadapt | Trims adaptor sequences. |
| [BWAPairedEndAlignment](https://github.com/broadinstitute/warp/blob/master/pipelines/skylab/multiome/atac.wdl) | bwa-mem2 | mem | Aligns reads from each set of partitioned FASTQ files to the genome and outputs a BAM with ATAC barcodes in the CB:Z tag. |
| [Merge.MergeSortBamFiles as MergeBam](https://github.com/broadinstitute/warp/blob/master/tasks/skylab/MergeSortBam.wdl) | MergeSamFiles | Picard | Merges each BAM into a final aligned BAM with corrected cell barcodes in the CB tag. |
@@ -2,15 +2,15 @@
sidebar_position: 2
---

# CEMBA_v1.1.0 Publication Methods
# CEMBA_v1.1.5 Publication Methods

Below we provide a sample methods section for a publication. For the complete pipeline documentation, see the [CEMBA README](./README.md).

## Methods

Data processing was performed with the CEMBA v1.1.0 Pipeline (RRID:SCR_021219). Sequencing reads were first trimmed to remove adaptors using Cutadapt 1.18 with the following parameters in paired-end mode: `-f fastq -quality-cutoff 20 -minimum-length 62 -a AGATCGGAAGAGCACACGTCTGAAC -A AGATCGGAAGAGCGTCGTGTAGGGA`.
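
As a rough illustration only, those Cutadapt options might be assembled on the command line as in the sketch below; the FASTQ and output file names are placeholders rather than values from the pipeline, and the long options are written here with conventional double dashes.

```bash
# Sketch only: paired-end adaptor trimming with the parameters quoted above.
# Input and output file names are hypothetical placeholders.
cutadapt -f fastq \
  --quality-cutoff 20 \
  --minimum-length 62 \
  -a AGATCGGAAGAGCACACGTCTGAAC \
  -A AGATCGGAAGAGCGTCGTGTAGGGA \
  -o r1.trimmed.fastq.gz -p r2.trimmed.fastq.gz \
  r1.fastq.gz r2.fastq.gz
```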

After trimming the adapters, an unaligned BAM (uBAM) for the trimmed R1 FASTQ was created using Picard v2.18.23.
After trimming the adapters, an unaligned BAM (uBAM) for the trimmed R1 FASTQ was created using Picard v2.26.10.
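
Picard's FastqToSam is the usual tool for building a uBAM from a FASTQ; whether the pipeline calls it with exactly these arguments is an assumption here, and the file and sample names are placeholders.

```bash
# Sketch only: create a uBAM from the trimmed R1 FASTQ so barcodes can be tagged to it later.
# FASTQ, OUTPUT, and SAMPLE_NAME values are hypothetical placeholders.
java -jar picard.jar FastqToSam \
  FASTQ=r1.trimmed.fastq.gz \
  OUTPUT=r1.unaligned.bam \
  SAMPLE_NAME=sample1
```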

Cell barcodes were then extracted from the trimmed R1 FASTQ and tagged to the R1 uBAM with Single Cell Tools (sctools) v0.3.4a using a barcode whitelist as well as configurable barcode start positions and lengths.

@@ -20,8 +20,8 @@ The trimmed R1 and R2 reads were then aligned to mouse (mm10) or human (hg19) ge
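
A hedged sketch of the single-end Bismark alignment described here is shown below; the genome directory, FASTQ name, and output directory are placeholders, and the exact flags used by the pipeline live in the Align task of the CEMBA WDL.

```bash
# Sketch only: single-end Bismark alignment using the Bowtie 2 backend.
# Genome directory, FASTQ name, and output directory are hypothetical placeholders.
bismark --bowtie2 --genome /path/to/bismark_genome_dir \
  -o bismark_out/ \
  r1.trimmed.fastq.gz
```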

After alignment, the output R1 and R2 BAMs were sorted in coordinate order and duplicates were removed using the Picard MarkDuplicates REMOVE_DUPLICATES option. Samtools 1.9 was used to further filter the BAMs to a minimum map quality of 30 using the parameter `-bhq 30`.
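
The duplicate-removal and map-quality filtering steps might look roughly like the following; the BAM file names are placeholders, and the actual commands are defined in the FilterDuplicates and FilterMapQuality tasks of the WDL.

```bash
# Sketch only: remove duplicates with Picard, then keep reads with MAPQ >= 30.
# BAM file names are hypothetical placeholders.
java -jar picard.jar MarkDuplicates \
  INPUT=r1.sorted.bam \
  OUTPUT=r1.dedup.bam \
  METRICS_FILE=r1.dedup.metrics.txt \
  REMOVE_DUPLICATES=true

samtools view -bhq 30 r1.dedup.bam > r1.dedup.mapq30.bam
```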

Methylation reports were produced for the filtered BAMs using Bismark. The barcodes from the R1 uBAM were then attached to the aligned, filtered R1 BAM with Picard. The R1 and R2 BAMs were merged with Samtools. Read names were added to the merged BAM and a methylation VCF was created using MethylationTypeCaller in GATK 4.1.2.0. The VCF was then converted to an additional ALLC file using a custom Python script.
Methylation reports were produced for the filtered BAMs using Bismark. The barcodes from the R1 uBAM were then attached to the aligned, filtered R1 BAM with Picard. The R1 and R2 BAMs were merged with Samtools. Read names were added to the merged BAM and a methylation VCF was created using MethylationTypeCaller in GATK 4.3.0.0. The VCF was then converted to an additional ALLC file using a custom Python script.
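
The methylation-aware variant call might be sketched as below; the BAM, reference, and output names are placeholders, and the pipeline's exact arguments are in the MethylationTypeCaller task of the WDL.

```bash
# Sketch only: emit a VCF with locus-specific methylation information from the merged BAM.
# Input, reference, and output names are hypothetical placeholders.
gatk MethylationTypeCaller \
  -I merged.sorted.bam \
  -R reference.fasta \
  -O methylation.vcf
```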

Samtools was then used to calculate coverage depth for sites with coverage greater than 1 and to create BAM index files. The final outputs included the barcoded aligned BAM, BAM index, a VCF with locus-specific methylation information, VCF index, ALLC file, and methylation reports.
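
Counting sites with coverage greater than 1 and indexing the final BAM could be sketched as follows; the BAM name is a placeholder, and the pipeline's exact invocations are in the ComputeCoverageDepth and IndexBam tasks.

```bash
# Sketch only: count positions whose depth exceeds 1, then index the final BAM.
# BAM name is a hypothetical placeholder.
samtools depth merged.sorted.bam | awk '$3 > 1' | wc -l
samtools index merged.sorted.bam
```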

An example of the pipeline and its outputs is available on [Terra](https://app.terra.bio/#workspaces/brain-initiative-bcdc/Methyl-c-seq_Pipeline). Examples of genomic reference files and other inputs can be found in the pipeline’s [example JSON](https://github.com/broadinstitute/warp/blob/develop/pipelines/cemba/cemba_methylcseq/example_inputs/CEMBA.inputs.json).
An example of the pipeline and its outputs is available on [Terra](https://app.terra.bio/#workspaces/brain-initiative-bcdc/Methyl-c-seq_Pipeline). Examples of genomic reference files and other inputs can be found in the pipeline’s [example JSON](https://github.com/broadinstitute/warp/blob/master/pipelines/cemba/cemba_methylcseq/example_inputs/CEMBA.inputs.json).
22 changes: 11 additions & 11 deletions website/docs/Pipelines/CEMBA_MethylC_Seq_Pipeline/README.md
@@ -7,7 +7,7 @@ slug: /Pipelines/CEMBA_MethylC_Seq_Pipeline/README

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| :----: | :---: | :----: | :--------------: |
| [CEMBA_v1.1.0](https://github.com/broadinstitute/warp/releases) | February, 2021 | [Elizabeth Kiernan](mailto:[email protected]) | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |
| [CEMBA_v1.1.5](https://github.com/broadinstitute/warp/releases) | December, 2023 | [Elizabeth Kiernan](mailto:[email protected]) | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |

![CEMBA](./CEMBA.png)

@@ -28,7 +28,7 @@ Interested in using the pipeline for your publication? See the [“CEMBA publica
| Workflow Language | WDL 1.0 | [openWDL](https://github.com/openwdl/wdl) |
| Genomic Reference Sequence | GRCh38 and GRCm38 | [GENCODE](https://www.gencodegenes.org/) |
| Aligner | BISMARK v0.21.0 with --bowtie2 | [Bismark](https://www.bioinformatics.babraham.ac.uk/projects/bismark/) |
| Variant Caller | GATK 4.1.2.0 | [GATK 4.1.2.0](https://gatk.broadinstitute.org/hc/en-us)
| Variant Caller | GATK 4.3.0.0 | [GATK 4.3.0.0](https://gatk.broadinstitute.org/hc/en-us)
| Data Input File Format | File format in which sequencing data is provided | [Zipped FASTQs (.fastq.gz)](https://support.illumina.com/bulletins/2016/04/fastq-files-explained.html) |
| Data Output File Format | File formats in which CEMBA output is provided | [BAM](http://samtools.github.io/hts-specs/), [VCF](https://samtools.github.io/hts-specs/VCFv4.2.pdf), [ALLC](https://github.com/yupenghe/methylpy#output-format) |

@@ -88,27 +88,27 @@ The [CEMBA.wdl](https://github.com/broadinstitute/warp/blob/develop/pipelines/ce

## CEMBA Task Summary

The table and summary sections below detail the tasks and tools of the CEMBA pipeline; [the code](https://github.com/broadinstitute/warp/blob/develop/pipelines/cemba/cemba_methylcseq/CEMBA.wdl) is available through GitHub. Each task can be found in the [CEMBA WDL](https://github.com/broadinstitute/warp/blob/develop/pipelines/cemba/cemba_methylcseq/CEMBA.wdl) If you are looking for the specific parameters of each task/tool, please see the `command {}` section of the WDL script.
The table and summary sections below detail the tasks and tools of the CEMBA pipeline; [the code](https://github.com/broadinstitute/warp/blob/develop/pipelines/cemba/cemba_methylcseq/CEMBA.wdl) is available through GitHub. Each task can be found in the [CEMBA WDL](https://github.com/broadinstitute/warp/blob/develop/pipelines/cemba/cemba_methylcseq/CEMBA.wdl). If you are looking for the specific parameters of each task/tool, please see the `command {}` section of the WDL script.

| Task | Tool(s) | Purpose | Docker |
| :-- | :-- | :-- | :-- |
| Trim | [Cutadapt v1.18](https://cutadapt.readthedocs.io/en/stable/) | Trim adaptors | quay.io/broadinstitute/cutadapt:1.18 |
| CreateUnmappedBam | [Picard v2.18.23](https://broadinstitute.github.io/picard/) | Create uBAM for attaching barcodes | quay.io/broadinstitute/picard:2.18.23 |
| CreateUnmappedBam | [Picard v2.26.10](https://broadinstitute.github.io/picard/) | Create uBAM for attaching barcodes | us.gcr.io/broad-gotc-prod/picard-cloud:2.26.10 |
| ExtractCellBarcodes | [sctools v0.3.4](https://sctools.readthedocs.io/en/latest/sctools.html) | Use whitelist to extract barcodes and tag to uBAM | quay.io/humancellatlas/secondary-analysis-sctools:v0.3.4 |
| Trim | [Cutadapt v1.18](https://cutadapt.readthedocs.io/en/stable/) | Trim degenerate bases, primer index, C/T Adaptase tail of R1 | quay.io/broadinstitute/cutadapt:1.18 |
| Trim | [Cutadapt v1.18](https://cutadapt.readthedocs.io/en/stable/) | Trim bases, primer index, C/T Adaptase tail of R2 | quay.io/broadinstitute/cutadapt:1.18 |
| Align | [Bismark v0.21.0](https://www.bioinformatics.babraham.ac.uk/projects/bismark/) | Map multiplexed samples as single-end with --bowtie2 | quay.io/broadinstitute/bismark:0.21.0 |
| Sort | [Picard v2.18.23](https://broadinstitute.github.io/picard/) | Sort BAM(s) in coordinate order | quay.io/broadinstitute/picard:2.18.23 |
| FilterDuplicates | [Picard v2.18.23](https://broadinstitute.github.io/picard/) | Removes duplicate reads from BAM | quay.io/broadinstitute/picard:2.18.23 |
| Sort | [Picard v2.26.10](https://broadinstitute.github.io/picard/) | Sort BAM(s) in coordinate order | us.gcr.io/broad-gotc-prod/picard-cloud:2.26.10 |
| FilterDuplicates | [Picard v2.26.10](https://broadinstitute.github.io/picard/) | Removes duplicate reads from BAM | us.gcr.io/broad-gotc-prod/picard-cloud:2.26.10 |
| Get MethylationReport |[Bismark v0.21.0](https://www.bioinformatics.babraham.ac.uk/projects/bismark/) | Produce methylation report for duplicates-filtered BAM |quay.io/broadinstitute/bismark:0.21.0 |
| FilterMapQuality | [Samtools v1.9](http://www.htslib.org/) | Further filter duplicate-removed BAM by map quality | quay.io/broadinstitute/samtools:1.9 |
| GetMethylationReport | [Bismark v0.21.0](https://www.bioinformatics.babraham.ac.uk/projects/bismark/) | Produce methylation report for reads above map quality and below map quality | quay.io/broadinstitute/bismark:0.21.0 |
| AttachBarcodes | [Picard v2.18.23](https://broadinstitute.github.io/picard/) | Add barcodes from the tagged uBAM to the aligned BAM | quay.io/broadinstitute/picard:2.18.23 |
| AttachBarcodes | [Picard v2.26.10](https://broadinstitute.github.io/picard/) | Add barcodes from the tagged uBAM to the aligned BAM | us.gcr.io/broad-gotc-prod/picard-cloud:2.26.10 |
| MergeBams | [Samtools v1.9](http://www.htslib.org/) | Merge R1 and R2 BAM files into a single BAM | quay.io/broadinstitute/samtools:1.9 |
| AddReadGroup | [GATK v4.1.2.0](https://gatk.broadinstitute.org/hc/en-us) | Add read groups to the merged BAM | us.gcr.io/broad-gatk/gatk:4.3.0.0 |
| Sort | [Picard v2.18.23](https://broadinstitute.github.io/picard/) | Sort in coordinate order after adding read group | quay.io/broadinstitute/picard:2.18.23 |
| AddReadGroup | [GATK v4.3.0.0](https://gatk.broadinstitute.org/hc/en-us) | Add read groups to the merged BAM | us.gcr.io/broad-gatk/gatk:4.3.0.0 |
| Sort | [Picard v2.26.10](https://broadinstitute.github.io/picard/) | Sort in coordinate order after adding read group | us.gcr.io/broad-gotc-prod/picard-cloud:2.26.10 |
| IndexBam | [Samtools v1.9](http://www.htslib.org/) | Index the output BAM | quay.io/broadinstitute/samtools:1.9 |
| MethylationTypeCaller | [GATK v4.1.2.0](https://gatk.broadinstitute.org/hc/en-us) | Produce a VCF with locus-specific methylation information | us.gcr.io/broad-gatk/gatk:4.3.0.0 |
| MethylationTypeCaller | [GATK v4.3.0.0](https://gatk.broadinstitute.org/hc/en-us) | Produce a VCF with locus-specific methylation information | us.gcr.io/broad-gatk/gatk:4.3.0.0 |
| VCFtoALLC | Python | Creates an [ALLC](https://github.com/yupenghe/methylpy#output-format) file from the VCF produced with MethylationTypeCaller | quay.io/cemba/vcftoallc:v0.0.1 |
| ComputeCoverageDepth | [Samtools v1.9](http://www.htslib.org/) | Compute number of sites with coverage greater than 1 | quay.io/broadinstitute/samtools:1.9 |

@@ -181,7 +181,7 @@ All CEMBA pipeline releases are documented in the [CEMBA changelog](https://gith
Please identify the pipeline in your methods section using the CEMBA Pipeline's [SciCrunch resource identifier](https://scicrunch.org/scicrunch/Resources/record/nlx_144509-1/SCR_021219/resolver?q=CEMBA&l=CEMBA).
* Ex: *CEMBA MethylC Seq Pipeline (RRID:SCR_021219)*

## Consortia Support
## Consortia Support
This pipeline is supported and used by the [BRAIN Initiative Cell Census Network](https://biccn.org/) (BICCN).

If your organization also uses this pipeline, we would love to list you! Please reach out to us by contacting [the WARP team](mailto:[email protected]).
@@ -7,7 +7,7 @@ slug: /Pipelines/Exome_Germline_Single_Sample_Pipeline/README

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| :----: | :---: | :----: | :--------------: |
| [ExomeGermlineSingleSample_v3.1.13](https://github.com/broadinstitute/warp/releases?q=ExomeGermlineSingleSample_v3.0.0&expanded=true) | November, 2021 | [Elizabeth Kiernan](mailto:[email protected]) | Please file GitHub issues in WARP or contact [the WARP team](mailto:[email protected]) |
| [ExomeGermlineSingleSample_v3.1.15](https://github.com/broadinstitute/warp/releases?q=ExomeGermlineSingleSample_v3.0.0&expanded=true) | December, 2023 | [Elizabeth Kiernan](mailto:[email protected]) | Please file GitHub issues in WARP or contact [the WARP team](mailto:[email protected]) |


The Exome Germline Single Sample pipeline implements data pre-processing and initial variant calling according to the GATK Best Practices for germline SNP and Indel discovery in human exome sequencing data.
@@ -27,8 +27,8 @@ The Exome Germline Single Sample workflow is written in the Workflow Description

### Software Version Requirements

* [GATK 4.1.8.0](https://github.com/broadinstitute/gatk/releases/tag/4.1.8.0)
* Picard 2.23.8
* [GATK 4.3.0.0](https://github.com/broadinstitute/gatk/releases/tag/4.3.0.0)
* Picard 2.26.10
* Samtools 1.11
* Python 3.0
* Cromwell version support
@@ -136,7 +136,7 @@ This material is provided by the Data Science Platform group at the Broad Instit

## Licensing

Copyright Broad Institute, 2020 | BSD-3
Copyright Broad Institute, 2023 | BSD-3

The workflow script is released under the **WDL open source code license (BSD-3)** (full license text at https://github.com/broadinstitute/warp/blob/master/LICENSE). However, please note that the programs it calls may be subject to different licenses. Users are responsible for checking that they are authorized to run all programs before running this script.

@@ -1,5 +1,6 @@
---
sidebar_position: 1
slug: /Pipelines/Illumina_Genotyping_Arrays_Pipeline/README
---

# Illumina Genotyping Array Overview
4 changes: 2 additions & 2 deletions website/docs/Pipelines/Optimus_Pipeline/README.md
@@ -7,7 +7,7 @@ slug: /Pipelines/Optimus_Pipeline/README

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| :----: | :---: | :----: | :--------------: |
| [optimus_v6.2.2](https://github.com/broadinstitute/warp/releases?q=optimus&expanded=true) | December, 2023 | Elizabeth Kiernan | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |
| [optimus_v6.3.0](https://github.com/broadinstitute/warp/releases?q=optimus&expanded=true) | December, 2023 | Elizabeth Kiernan | Please file GitHub issues in warp or contact [the WARP team](mailto:[email protected]) |

![Optimus_diagram](Optimus_diagram.png)

@@ -47,7 +47,7 @@ To download the latest Optimus release, see the release tags prefixed with "Opti

To discover and search releases, use the WARP command-line tool [Wreleaser](https://github.com/broadinstitute/warp/tree/master/wreleaser).

If you’re running an Optimus workflow version prior to the latest release, the accompanying documentation for that release may be downloaded with the source code on the WARP [releases page](https://github.com/broadinstitute/warp/releases) (see the source code folder website/pipelines/skylab/optimus").
If you’re running an Optimus workflow version prior to the latest release, the accompanying documentation for that release may be downloaded with the source code on the WARP [releases page](https://github.com/broadinstitute/warp/releases) (see the source code folder `website/docs/Pipelines/Optimus_Pipeline`).

Optimus can be deployed using [Cromwell](https://cromwell.readthedocs.io/en/stable/), a GA4GH compliant, flexible workflow management system that supports multiple computing platforms. The workflow can also be run in [Terra](https://app.terra.bio), a cloud-based analysis platform. The Terra [Optimus Featured Workspace](https://app.terra.bio/#workspaces/featured-workspaces-hca/HCA_Optimus_Pipeline) contains the Optimus workflow, workflow configurations, required reference data and other inputs, and example testing data.
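
As a rough illustration, a local Cromwell launch of the workflow might look like the sketch below; the jar name and input file names are placeholders rather than values from this documentation.

```bash
# Sketch only: run the Optimus WDL with Cromwell in single-workflow run mode.
# cromwell.jar and the input file names are hypothetical placeholders.
java -jar cromwell.jar run Optimus.wdl --inputs Optimus.inputs.json
```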
