Damien Farrell edited this page Feb 16, 2017 · 1 revision

There are two basic ways to count the RNAs in your fastq files: by aligning to the reference genome or by aligning to a library of known sequences. The latter is faster and simpler, but it is limited to known genes and handles ambiguous reads less well. Aligning to the reference genome allows more complex analyses and novel discovery.

Alignment

Currently bowtie and subread are supported for alignment. You can first set the alignment parameters BOWTIE_PARAMS and SUBREAD_PARAMS if you want to alter the default settings.

By default, reads are mapped to library sequences with bowtie using -n 1 -l 20. This allows one mismatch within the first 20 bases of the read (the seed) and tolerates mismatches beyond that point, so that reads with non-templated additions at the 3' end can still be included. See the isomiR counting section.
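To make the seed rule concrete, here is a minimal sketch in plain Python. It is a simplified model, not bowtie's actual algorithm: it ignores base qualities and any other limits bowtie applies, and the helper name passes_seed_rule and the sequences are made up for illustration.

```python
def passes_seed_rule(read, target, seed_len=20, max_seed_mismatches=1):
    """Simplified model of the -n/-l idea (hypothetical helper, not bowtie code).

    Only mismatches within the first seed_len bases are counted;
    positions beyond the seed are not checked at all in this sketch.
    """
    if len(read) > len(target):
        return False
    seed_mismatches = sum(
        1 for r, t in zip(read[:seed_len], target[:seed_len]) if r != t
    )
    return seed_mismatches <= max_seed_mismatches


# Made-up library sequence (24 nt) and two made-up reads
target = "TAGCTTATCAGACTGATGTTGACT"
read_3p_addition = "TAGCTTATCAGACTGATGTTGAAA"  # differs only after base 20
read_two_seed_mm = "TTGCTTATCAGACTGATCTTGA"    # two mismatches inside the seed

print(passes_seed_rule(read_3p_addition, target))
print(passes_seed_rule(read_two_seed_mm, target))
```

In this model the read with extra 3' bases is accepted because its differences fall outside the seed, while the read with two seed mismatches is rejected.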

Mapping to a library of miRBase sequences

miRNA-seq reads can be mapped to the known miRBase sequences as a quick way to count known mature miRNAs, which is often adequate. The map_mirbase function does this in one step. It does the following:

  • creates the mature sequences with flanking nucleotides
  • creates a bowtie/subread index
  • aligns and counts the reads
  • counts isomiRs
  • returns a pandas dataframe

A file called mature_counts.csv will also be saved. isomiR counts are saved as isomir_counts.csv.

Example:

import smallrnaseq as smrna
# map two fastq files to the human (hsa) miRBase sequences using bowtie
res = smrna.map_mirbase(files=['test_1.fastq','test_2.fastq'], overwrite=True, aligner='bowtie',
                        species='hsa', pad5=3, pad3=5)
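The first step, creating mature sequences with flanking nucleotides, can be pictured with the sketch below. It assumes that pad5 and pad3 are the number of flanking bases taken from the precursor on the 5' and 3' sides. That reading is inferred from the argument names, and the helper pad_mature is hypothetical, not the library's own code. The sequences are made up.

```python
def pad_mature(precursor, start, end, pad5=3, pad3=5):
    """Return the mature sequence plus flanking bases from its precursor.

    start and end are 0-based, end-exclusive coordinates of the mature
    sequence within the precursor. Flanks are clipped at the precursor ends.
    Hypothetical helper for illustration only.
    """
    left = max(0, start - pad5)
    right = min(len(precursor), end + pad3)
    return precursor[left:right]


# Made-up precursor: the "mature" part is GATTACAGATTACA at positions 8..22
precursor = "AAAAACCCGATTACAGATTACATTTTTAAAAA"
padded = pad_mature(precursor, 8, 22, pad5=3, pad3=5)
print(padded)
```

Padding the reference this way leaves room for reads that start or end a few bases away from the annotated mature sequence to still align.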

Mapping to the genome

This requires a reference genome and a GTF file with miRNA features.

Example:

# gtffile is the path to a GTF file containing miRNA features
featcounts = smrna.map_genome_features(['test_1.fastq'], 'bos_taurus', gtffile,
                                       outpath='ncrna_map', aligner='subread', merge=True)
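Before mapping, it can help to confirm that the GTF file actually contains miRNA features. The sketch below counts biotypes in GTF lines using only the standard library. It assumes Ensembl-style annotation where the attribute is called gene_biotype; other sources may use a different attribute name. The function name and the example lines are made up.

```python
from collections import Counter


def count_gtf_biotypes(lines):
    """Count gene_biotype values in GTF lines.

    Assumes Ensembl-style attributes such as: gene_biotype "miRNA";
    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a valid 9-column GTF record
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_biotype "):
                counts[attr.split(" ", 1)[1].strip('"')] += 1
    return counts


# Made-up GTF records; with a real file, pass an open file handle instead
example_gtf = [
    "#!genome-build made-up\n",
    '1\tensembl\tgene\t100\t180\t.\t+\t.\tgene_id "G1"; gene_biotype "miRNA";\n',
    '1\tensembl\tgene\t500\t900\t.\t-\t.\tgene_id "G2"; gene_biotype "protein_coding";\n',
]
biotypes = count_gtf_biotypes(example_gtf)
print(biotypes["miRNA"])
```

If the miRNA count is zero, the feature counting step would have nothing to assign reads to.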

Command line

You can run miRNA counting from the command line without writing any Python code. The key to using this is the config file. For miRNAs you need to add the following settings:


Then call the command using:

smallrnaseq -c mymirs.conf -r

Links

http://bowtie-bio.sourceforge.net/manual.shtml

http://bioinf.wehi.edu.au/subread/