Commit

Merge pull request #19 from ODHL/main
main and dev sync
slsevilla authored Dec 19, 2023
2 parents 3203222 + 9f0d580 commit cc162bb
Showing 129 changed files with 265,511 additions and 1,610 deletions.
8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -4,4 +4,10 @@
- Review the change logs associated with the Phoenix and DRYAD repos for information on their previous versions, from which this pipeline was built.

**Version 1.0**
- Currently in development.
- Complete with addition of Phoenix and Dryad components

**Version 1.1**
- AR report added

**Version 2.0**
- Update Phoenix to v2.0.1
69 changes: 42 additions & 27 deletions Dockerfiles/Dockerfile_Terra
@@ -1,47 +1,62 @@
FROM mambaorg/micromamba:0.27.0
#USER root

# for easy upgrade later. ARG variables only persist during image build time
ARG PHOENIX_VER="1.1.0"
FROM mambaorg/micromamba:1.4.3

# metadata
LABEL base.image="mambaorg/micromamba:0.27.0"
LABEL base.image="mambaorg/micromamba:1.4.3"
LABEL dockerfile.version="1"
LABEL software="phoenix"
LABEL software.version="1.1.0"
LABEL software.version="2.0.0"
LABEL description="PHoeNIx: A short-read pipeline for healthcare-associated and antimicrobial resistant pathogens"
LABEL website="https://github.com/cdcgov/phoenix/"
LABEL license="https://github.com/CDCgov/phoenix/blob/main/LICENSE"
LABEL maintainer="Jill Hagey"
LABEL maintainer.email="[email protected]"

RUN micromamba create -n phoenix -y \
bioconda::bbmap==39.01 \
bioconda::multiqc==1.11 \
bioconda::fastp==0.23.2 \
bioconda::mash==2.3 \
bioconda::mlst==2.23.0 \
bioconda::fastqc==0.11.9 \
bioconda::gamma==2.2 \
bioconda::fastani==1.33 \
# creating base environment
RUN micromamba create -n phoenix -c defaults -c bioconda -c conda-forge \
conda-forge::python \
bioconda::ncbi-amrfinderplus==3.10.45 \
bioconda::kraken2==2.1.2 \
bioconda::quast==5.0.2 \
bioconda::spades==3.15.5 \
bioconda::krona==2.8.1 \
bioconda::prokka==1.14.5 \
conda-forge::biopython \
conda-forge::rsync \
conda-forge::xlsxwriter \
conda-forge::bc \
conda-forge::wget \
conda-forge::ca-certificates \
conda-forge::procps-ng \
conda-forge::coreutils \
bioconda::nf-core \
anaconda::graphviz \
conda-forge::openssl \
conda-forge::gsutil \
bioconda::nextflow==22.04.5 && \
micromamba clean -a -y
conda-forge::pigz \
anaconda::graphviz \
conda-forge::libcurl \
# need this for mash: error while loading shared libraries: libgsl.so.25: https://github.com/ParBLiSS/FastANI/issues/96
conda-forge::gsl=2.7=he838d99_0 \
# you need this, not the JetBrains version of Java: https://github.com/nextflow-io/nextflow/issues/2841
conda-forge::openjdk \
bioconda::nf-core \
bioconda::bbmap=39.01 \
bioconda::multiqc=1.14 \
bioconda::fastp=0.23.2 \
bioconda::mash=2.3 \
bioconda::mlst=2.23.0 \
bioconda::fastqc=0.11.9 \
bioconda::gamma=2.2 \
bioconda::sra-tools=3.0.3 \
bioconda::fastani=1.33 \
bioconda::entrez-direct=16.2 \
bioconda::kraken2=2.1.2 \
bioconda::spades=3.15.5 \
bioconda::krona=2.8.1 \
bioconda::prokka=1.14.5 \
bioconda::quast=5.0.2 \
bioconda::nextflow=22.04.5 && \
micromamba clean -a -y

RUN micromamba create -n busco -c conda-forge -c bioconda busco=5.4.7 && micromamba clean -a -y
RUN micromamba create -n amrfinderplus -c conda-forge bioconda::ncbi-amrfinderplus=3.11.11 && micromamba clean -a -y
RUN micromamba create -n srst2 -c conda-forge bioconda::srst2==0.2.0 && micromamba clean -a -y


ENV PATH=/opt/conda/envs/phoenix/bin:/opt/conda/envs/amrfinderplus/bin:/opt/conda/envs/srst2/bin:/opt/conda/envs/busco/bin:\
/opt/conda/bin:/opt/conda/envs/env/bin:/opt/conda/envs/env/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

ENV PATH=/opt/conda/envs/phoenix/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#setting up stuff for BUSCO
ENV AUGUSTUS_CONFIG_PATH=/opt/conda/envs/busco/config/
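
The package pins in this Dockerfile mix `=` and `==`, and one (`gsl`) also carries a build string. As a rough illustration that is not part of the commit, the Python sketch below splits such specs into channel, name, version, and build so the pins can be listed or compared. The names `Spec` and `parse_spec` are hypothetical. In conda's match-spec syntax a single `=` is generally a fuzzy (prefix) match and `==` an exact one; it is an assumption here that micromamba treats the pins the same way.

```python
from typing import NamedTuple, Optional


class Spec(NamedTuple):
    """One parsed package spec (hypothetical helper type)."""
    channel: Optional[str]
    name: str
    version: Optional[str]
    build: Optional[str]
    exact: bool  # True when the spec used '==' rather than '='


def parse_spec(spec: str) -> Spec:
    """Split a spec such as 'bioconda::bbmap=39.01' into its parts."""
    channel, sep, rest = spec.partition("::")
    if not sep:
        # No 'channel::' prefix, e.g. 'busco=5.4.7'
        channel, rest = None, spec
    exact = "==" in rest
    parts = rest.replace("==", "=").split("=")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return Spec(channel, name, version, build, exact)


if __name__ == "__main__":
    for raw in (
        "bioconda::bbmap=39.01",
        "bioconda::srst2==0.2.0",
        "conda-forge::gsl=2.7=he838d99_0",
        "conda-forge::python",
    ):
        print(parse_spec(raw))
```

Listing the parsed specs makes unpinned packages (those with no version) easy to spot when reviewing a bump like this one.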
46 changes: 34 additions & 12 deletions Dockerfiles/Dockerfile_base
@@ -1,26 +1,39 @@
# base image
FROM ubuntu:kinetic
FROM ubuntu:jammy

# for easy upgrade later. ARG variables only persist during image build time
ARG PHX_VER="1.1.0"
ARG PHX_VER="2.0.0"

# metadata
LABEL base.image="ubuntu:kinetic"
LABEL base.image="ubuntu:jammy"
LABEL dockerfile.version="2"
LABEL software="PhoeNIx"
LABEL software.version="v1.1.0"
LABEL software.version="v2.0.0"
LABEL description="Basic Linux for Running PHoeNIx bash scripts"
LABEL website="https://github.com/cdcgov/phoenix"
LABEL license="Apache 2.0"
LABEL maintainer="Jill Hagey"
LABEL maintainer.email="[email protected]"

# prevents having to enter commands during apt-get install
ENV DEBIAN_FRONTEND=noninteractive

# install dependencies (pigz needed for nf-core modules, bc and rsync needed for post gamma steps in PHoeNIx)
RUN apt-get update && apt-get -y --no-install-recommends install \
python3.10 \
python3-pip \
python3-dev \
python-is-python3 \
ca-certificates \
libssl-dev \
zlib1g-dev \
libbz2-dev \
libreadline-dev \
libsqlite3-dev \
make \
llvm \
libncurses5-dev \
libncursesw5-dev \
xz-utils \
tk-dev \
libffi-dev \
liblzma-dev \
build-essential \
bc \
pigz \
@@ -33,7 +46,18 @@ RUN apt-get update && apt-get -y --no-install-recommends install \
apt-get autoclean && \
rm -rf /var/lib/apt/lists/*

#install biopython
# using pyenv to set up an environment for python v3.7.12 to match what is required for the terra container
RUN mkdir /pyenv && git clone https://github.com/pyenv/pyenv.git /pyenv
ENV PYENV_ROOT=/pyenv
RUN /pyenv/bin/pyenv install 3.7.12
RUN eval "$(/pyenv/bin/pyenv init -)" && /pyenv/bin/pyenv local 3.7.12
RUN /pyenv/bin/pyenv global 3.7.12

RUN apt-get update && apt-get -y --no-install-recommends install python3-pip

ENV PATH=/pyenv/bin:/pyenv/shims:${PATH}

#install biopython and other required modules
RUN pip3 install biopython \
glob2 \
argparse \
@@ -44,6 +68,4 @@ RUN pip3 install biopython \
times \
xlsxwriter \
cryptography==36.0.2 \
pytest-shutil

ENV PATH=${PATH}:/usr/bin/python3
pytest-shutil
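
The pyenv steps in this Dockerfile aim to make Python 3.7.12 the default interpreter by placing `/pyenv/shims` ahead of the system paths. A minimal smoke test, assuming it is run inside the built image, could compare the running interpreter against that version. The helper name `matches_expected` is hypothetical and not part of the repository.

```python
import sys

# Version installed via pyenv in Dockerfile_base (taken from the diff above)
EXPECTED = "3.7.12"


def matches_expected(version_info, expected=EXPECTED):
    """Return True if the first three fields of version_info equal `expected`."""
    want = tuple(int(part) for part in expected.split("."))
    return tuple(version_info[:3]) == want


if __name__ == "__main__":
    ok = matches_expected(sys.version_info)
    print("python", sys.version.split()[0], "OK" if ok else "MISMATCH")
```

A mismatch would suggest that the shims directory is not first on `PATH`, or that the pyenv global version was not applied.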
14 changes: 11 additions & 3 deletions README.md
@@ -3,11 +3,19 @@
## Background
This pipeline was built from components of two pipelines:

1) PHoeNIx: A short-read pipeline for healthcare-associated and antimicrobial resistant pathogens
2) DRYAD:
1) [PHoeNIx](https://github.com/CDCgov/phoenix): A short-read pipeline for healthcare-associated and antimicrobial resistant pathogens
2) [DRYAD](https://github.com/wslh-bio/dryad): A pipeline to construct reference free core-genome or SNP phylogenetic trees for examining prokaryote relatedness in outbreaks

## Dependencies
## Databases
The following databases are utilized to generate the data within this pipeline:

- [AMRFinderPlus database](https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/): [Version 2023-04-17.1](https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/3.11/)
- [ARG-ANNOT database](http://backup.mediterranee-infection.com/arkotheque/client/ihumed/_depot_arko/articles/2041/arg-annot-v4-aa-may2018_doc.fasta): [Latest version NT v6 July 2019](https://www.mediterranee-infection.com/acces-ressources/base-de-donnees/arg-annot-2/)
- [ResFinder database](https://bitbucket.org/genomicepidemiology/resfinder_db/src/master/): [v2.1.0](https://bitbucket.org/genomicepidemiology/resfinder_db/commits/branch/master) including until 2023-04-12 commit f46d8fc
- [MLST database](https://github.com/tseemann/mlst): static db generated from [PubMLST.org](https://pubmlst.org/) 2023-05-02
- [Kraken database](https://ccb.jhu.edu/software/kraken2/): [standard-8 db](https://benlangmead.github.io/aws-indexes/k2)

## Dependencies
[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A521.10.3-23aa62.svg?labelColor=000000)](https://www.nextflow.io/)
[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)

Binary file removed assets/ODH_logo.png
Binary file not shown.
24 changes: 7 additions & 17 deletions assets/ar_report_config.yaml
@@ -17,6 +17,12 @@ summary.paragraph: |
This report describes the relatedness between a set of bacterial genomes from a suspected outbreak.
methods.text: |
The figures shown here were generated using sequence data processed with the [ODH AST](https://github.com/ODHL/AST_Workflow) data analysis pipeline. SNPs were called using the [Center for Food Safety and Applied Nutrition (CFSAN) SNP](https://github.com/CFSAN-Biostatistics/snp-pipeline) pipeline. If you have questions about this report please contact [Samantha Chill]([email protected]).
The following databases are utilized to generate the data within this report:
1) [AMRFinderPlus database](https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/): [Version 2023-04-17.1](https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/3.11/)
2) [ARG-ANNOT database](http://backup.mediterranee-infection.com/arkotheque/client/ihumed/_depot_arko/articles/2041/arg-annot-v4-aa-may2018_doc.fasta): [Latest version NT v6 July 2019](https://www.mediterranee-infection.com/acces-ressources/base-de-donnees/arg-annot-2/)
3) [ResFinder database](https://bitbucket.org/genomicepidemiology/resfinder_db/src/master/): [v2.1.0](https://bitbucket.org/genomicepidemiology/resfinder_db/commits/branch/master) including until 2023-04-12 commit f46d8fc
4) [MLST database](https://github.com/tseemann/mlst): static db generated from [PubMLST.org](https://pubmlst.org/) 2023-05-02
5) [Kraken database](https://ccb.jhu.edu/software/kraken2/): [standard-8 db](https://benlangmead.github.io/aws-indexes/k2)
disclaimer.text: |
The information included in this report should only be used to support infection prevention measures. This report should not be used to guide treatment decisions, nor should it be included in the patient record.
Whole-genome sequencing analysis is a rapidly evolving technology. Whole-genome sequencing and single nucleotide variant analysis will continue to be adjusted and refined over time due to the varied nature of bacterial genomes, limitations on available reference genomes and continual assessment of the inclusion of mobile genetic elements in this analysis. These results represent the most advanced method currently available for genome comparisons.
@@ -43,20 +49,4 @@ pangenome.frequency: |
### report methodology params ###

# heatmap distance metric, must be one of the following:
# "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski"
heat.dist.method: 'euclidean'

# show SNP values in the heatmap (TRUE/FALSE)
show.snp: TRUE

# tree rooting method, must be one of the following:
# 'midpoint', 'unrooted', or a sample id
root.method: 'midpoint'

# show tree bootstrap (TRUE/FALSE) and threshold for displaying values
show.bootstrap: FALSE
bootstrap.threshold: 80

# path to logo shown in the top left,right corner of the report
logo: 'generator_logo.png'
logo2: 'ODH_logo.png'
# "euclidean", "maximum", "manhattan", "canberr