riboFrame: an improved method for microbial taxonomy profiling from non-targeted metagenomics
Matteo Ramazzotti, Luisa Berná, Claudio Donati and Duccio Cavalieri
We present riboFrame, a novel procedure for microbial profiling based on the identification and classification of 16S rRNA sequences in non-targeted metagenomics datasets. Reads overlapping the 16S rRNA genes are identified and positioned onto a model 16S gene by HMMER using Hidden Markov Models for bacteria and archaea. A coverage plot can then be created to evaluate whether enough reads are available to proceed to the next step. Next, a phylum-to-genus taxonomic assignment is given to each read by naïve Bayesian classification using RDPclassifier. Finally, microbial abundance profiles can be created using all available reads or only a subset from specific locations of the 16S gene (e.g. the popular V1-V3, V3-V5 or V6-V9 regions). We call this "post-hoc topological selection". In practice, we have transformed a non-targeted metagenomic experiment into a multi-targeted 16S microbial survey experiment.
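The coverage check and the post-hoc topological selection described above can be illustrated with a minimal Python sketch. It is not the riboFrame implementation: the `Read` record, the `coverage` and `profile` functions, the region boundaries in `REGIONS`, and the rule that a read must lie entirely inside a region to be selected are all assumptions made for illustration. The sketch also assumes that each read has already been positioned on the 16S model and assigned a genus.

```python
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical record for one read already placed on the 16S model."""
    start: int  # first model position covered (1-based, inclusive)
    end: int    # last model position covered (inclusive)
    genus: str  # genus label assigned by the classifier


# Illustrative region boundaries on the model gene; assumed values,
# not the coordinates used by riboFrame.
REGIONS = {"V1-V3": (1, 500), "V3-V5": (350, 900), "V6-V9": (950, 1500)}


def coverage(reads, model_length):
    """Return per-position read depth along the model 16S gene."""
    depth = [0] * model_length
    for r in reads:
        for pos in range(max(r.start, 1), min(r.end, model_length) + 1):
            depth[pos - 1] += 1
    return depth


def profile(reads, region=None):
    """Return relative genus abundances, optionally restricted to a region.

    Assumption: a read is selected only if it lies entirely inside the region.
    """
    if region is not None:
        lo, hi = REGIONS[region]
        reads = [r for r in reads if r.start >= lo and r.end <= hi]
    counts = Counter(r.genus for r in reads)
    total = sum(counts.values())
    return {genus: n / total for genus, n in counts.items()} if total else {}


# Toy input: four classified reads at different positions on the model.
reads = [
    Read(10, 110, "Bacteroides"),
    Read(400, 500, "Bacteroides"),
    Read(380, 480, "Escherichia"),
    Read(1000, 1100, "Escherichia"),
]

depth = coverage(reads, 1500)
all_reads_profile = profile(reads)
v1v3_profile = profile(reads, "V1-V3")
v6v9_profile = profile(reads, "V6-V9")
```

Because the region filter is applied after classification, the same set of classified reads can yield several region-specific profiles without rerunning the earlier steps, which is the sense in which one non-targeted dataset behaves like a multi-targeted survey.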