HaemoSeq

HaemoSeq is a bioinformatics pipeline for whole-genome-sequencing-based analysis of human-related Haemophilus isolates, including H. influenzae (typeable and NTHi), H. haemolyticus (subsp. intermedius), H. parahaemolyticus, H. paraphrohaemolyticus, H. parainfluenzae, H. sputorum, H. pittmaniae and H. ducreyi. It runs on Linux and accepts FastQ (Illumina) and FastA files as input. It includes:

Quality control of raw sequence reads [Input: FastQ files; Required tools: FastQC and MultiQC]
Contamination detection [Input: FastQ files or FastA files; Required tools: Kraken2]
Preprocessing of raw sequence reads (e.g. adapter removal) [Required tools: fastp]
Haemophilus (sub)species and serotype prediction using a custom marker database [Input: FastQ files; Required tools: SRST2 and GNU Parallel; Required database: Class_Haemophilus_and_Serotyping-v2.0.fasta]
Multi-locus sequence typing (MLST) [Input: FastQ files; Required tools: SRST2; Required database: PubMLST]
Assembly of sequence reads [Input: FastQ files; Required tools: Shovill; Output: FastA files]
Fast phylogenetic analysis using Mashtree [Input: FastQ files or FastA files; Required tools: Mashtree]
Detection of known plasmids [Input: FastQ files; Required tools: SRST2 and SeqKit; Required database: PLSDB or custom]
De novo prediction of plasmid contigs [Input: FastQ files; Required tools: Platon and/or plasmidSPAdes]
Resistance and virulence gene prediction [Input: FastA files; Required tools: AMRFinderPlus]
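
As a rough illustration of the input requirements listed above, the following Python sketch records which input type each step accepts and reports the steps that could run on a given file. The accepted input types are taken from the list; the shortened step labels, the STEPS table, the function names and the file-extension rules are illustrative assumptions and not part of HaemoSeq itself. The input type for preprocessing is not stated above, so FastQ is assumed because that step operates on raw reads.

```python
from pathlib import Path

# Accepted input types per step, copied from the list above.
# Step labels are shortened for readability (hypothetical names).
STEPS = {
    "Quality control": {"fastq"},
    "Contamination detection": {"fastq", "fasta"},
    "Preprocessing": {"fastq"},  # assumption: input not stated in the list
    "Species and serotype prediction": {"fastq"},
    "MLST": {"fastq"},
    "Assembly": {"fastq"},
    "Mashtree phylogeny": {"fastq", "fasta"},
    "Known plasmid detection": {"fastq"},
    "De novo plasmid prediction": {"fastq"},
    "Resistance and virulence gene prediction": {"fasta"},
}

# Hypothetical extension rules; adjust to local naming conventions.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")
FASTA_SUFFIXES = (".fasta", ".fa", ".fna", ".fasta.gz", ".fa.gz", ".fna.gz")


def input_type(filename):
    """Classify a file as 'fastq' or 'fasta' from its extension."""
    name = Path(filename).name.lower()
    if name.endswith(FASTQ_SUFFIXES):
        return "fastq"
    if name.endswith(FASTA_SUFFIXES):
        return "fasta"
    raise ValueError(f"Unrecognised sequence file extension: {filename}")


def applicable_steps(filename):
    """Return the steps whose stated input type matches the file."""
    kind = input_type(filename)
    return [step for step, accepted in STEPS.items() if kind in accepted]
```

For example, applicable_steps("isolate.fasta") returns only the three steps that accept FastA input: contamination detection, Mashtree phylogeny, and resistance and virulence gene prediction.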

Getting Started

For complete installation instructions, a description of the pipeline and usage examples, please email mdiricks@fz-borstel.de.
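
Before requesting or following the installation instructions, it may help to see which of the required tools are already available on a system. The Python sketch below checks the PATH for each tool. The executable names are assumptions based on the command each tool usually installs (for example, amrfinder for AMRFinderPlus and plasmidspades.py for plasmidSPAdes); they are not taken from HaemoSeq and may differ between installations. Platon and plasmidSPAdes are listed as alternatives ("and/or"), so one of them being absent is not necessarily a problem.

```python
import shutil

# Assumed executable names for the tools named in this README.
# These are the commands the tools commonly install, not names
# confirmed by HaemoSeq; edit to match the local environment.
REQUIRED_TOOLS = [
    "fastqc",
    "multiqc",
    "kraken2",
    "fastp",
    "srst2",
    "parallel",
    "shovill",
    "mashtree",
    "seqkit",
    "platon",
    "plasmidspades.py",
    "amrfinder",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling missing_tools() with no arguments returns the assumed executables that are not on the current PATH; an empty list means all of them were found.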

Citation

Diricks et al., 2022: https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-022-01017-x
