mirdeep2 module
mirdeep2 is a popular miRNA discovery program written in Perl. Its command-line usage is somewhat complicated, so this module lets you supply the settings in a configuration file. It also analyses the mirdeep2 output and produces summary plots. You must install mirdeep2 yourself and add it to your PATH.
Installing smallrnaseq adds a mirdeep2 command that can be used without writing any Python code.
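Because the module depends on mirdeep2 being on your PATH, it can help to check this before starting a run. The following is a minimal sketch using only the standard library. The script names listed are the Perl scripts that a standard mirdeep2 installation provides, and missing_tools is a hypothetical helper, not part of smallrnaseq.

```python
import shutil

# Perl scripts shipped with mirdeep2; adjust this list to match your installation.
MIRDEEP2_SCRIPTS = ["miRDeep2.pl", "mapper.pl", "quantifier.pl"]


def missing_tools(tools=MIRDEEP2_SCRIPTS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All mirdeep2 scripts were found on PATH.")
```

An empty list means every script was found. Any names that are printed point to a PATH setting that still needs fixing.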
Run mirdeep using a configuration file
mirdeep2 -c <mdp.conf> -r
The config file format is as follows:
[base]
input = /path/to/input files #(fasta or fastq)
adapter =
filetype = fastq
bowtieindex = /path/to/bowtie_index
refgenome = /path/to/ref_genome.fa
species = bta
mature =
hairpin =
other = hsa
randfold = 1
mirbase = /path/to/mirbase_file
overwrite = 0
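If you prefer to generate the config file from a script, the sketch below writes the same [base] section with Python's standard configparser module. It assumes the file is plain INI-style text as shown above. write_mirdeep_config is a hypothetical helper, not part of smallrnaseq, and the default values are the placeholders from the example.

```python
import configparser

# Keys and placeholder values copied from the example config in this section.
DEFAULTS = {
    "input": "/path/to/input files",
    "adapter": "",
    "filetype": "fastq",
    "bowtieindex": "/path/to/bowtie_index",
    "refgenome": "/path/to/ref_genome.fa",
    "species": "bta",
    "mature": "",
    "hairpin": "",
    "other": "hsa",
    "randfold": "1",
    "mirbase": "/path/to/mirbase_file",
    "overwrite": "0",
}


def write_mirdeep_config(filename, **overrides):
    """Write a [base] section to `filename` (hypothetical helper)."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        # Reject misspelled keys instead of silently writing them out.
        raise KeyError("unknown option(s): " + ", ".join(unknown))
    settings = dict(DEFAULTS)
    settings.update({key: str(value) for key, value in overrides.items()})
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser["base"] = settings
    with open(filename, "w") as handle:
        parser.write(handle)
    return filename
```

For example, write_mirdeep_config("mdp.conf", species="bta") produces a file that can then be passed to the command above with -c.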
The same workflow can be run from Python:
from smallrnaseq import mirdeep2
# run mirdeep2 on multiple fastq files stored in path
mirdeep2.run_multiple(path, fasta=False)
# get the mirdeep2 results as a pandas DataFrame
df = mirdeep2.get_results(path)
# filter the miRNAs according to various parameters
df = mirdeep2.filter_expr_results(score=0, freq=30, meanreads=500)
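The exact meaning of score, freq and meanreads is not spelled out here. Purely as an illustration of this kind of abundance filtering, the sketch below applies two hand-written thresholds with plain pandas. It is not the smallrnaseq implementation. filter_by_abundance, min_mean_reads and min_samples are hypothetical names, the sample column names follow the s01, s02, ... pattern of the results table in this section, and the counts are invented toy values.

```python
import pandas as pd


def filter_by_abundance(df, sample_cols, min_mean_reads=500, min_samples=2):
    """Keep rows with a mean count of at least `min_mean_reads` across the
    sample columns and a non-zero count in at least `min_samples` of them."""
    counts = df[sample_cols]
    mean_ok = counts.mean(axis=1) >= min_mean_reads
    detected_ok = (counts > 0).sum(axis=1) >= min_samples
    return df[mean_ok & detected_ok]


# Toy counts invented for illustration; these are not real results.
toy = pd.DataFrame(
    {"s01": [1200, 3, 0], "s02": [900, 0, 0], "s03": [1500, 5, 2000]},
    index=["mir-a", "mir-b", "mir-c"],
)
kept = filter_by_abundance(toy, ["s01", "s02", "s03"])
print(list(kept.index))  # ['mir-a']
```

Here mir-b is dropped for low mean abundance and mir-c because it is detected in only one sample.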
Running the command-line program produces the usual mirdeep2 output, which can then be analysed further with this command:
mirdeep2 -a <results_folder>
This produces files called known_mirdeep.csv and novel_mirdeep.csv along with various plots.
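The two CSV files can be loaded back into pandas for further work. The sketch below assumes they are written to the folder you pass in and that they are ordinary comma-separated files with a header row; neither detail is confirmed here. load_analysis_tables is a hypothetical helper, not part of smallrnaseq.

```python
import os

import pandas as pd


def load_analysis_tables(folder, names=("known_mirdeep.csv", "novel_mirdeep.csv")):
    """Read whichever of the summary CSV files exist in `folder`.

    Returns a dict mapping file name to DataFrame; missing files are skipped.
    """
    tables = {}
    for name in names:
        path = os.path.join(folder, name)
        if os.path.exists(path):
            tables[name] = pd.read_csv(path)
    return tables
```

Skipping missing files rather than raising means a run that produced only one of the two files can still be loaded.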
In Python, mirdeep2.get_results(path) returns a pandas DataFrame of the following form:
                read_count      precursor       total        s01       s02        s03        s04  \
#miRNA
bta-miR-486     78263227.0    bta-mir-486  78263227.0  3701115.0  629648.0  4241904.0  6742809.0
bta-miR-92a      2997019.0  bta-mir-92a-1   2997019.0   148076.0   28384.0   196887.0   214518.0
bta-miR-191      1005259.0    bta-mir-191   1005259.0    32274.0    8205.0    63262.0    44913.0
bta-miR-25        939122.0     bta-mir-25    939122.0    35888.0    8639.0    47370.0    61931.0
bta-miR-142-5p    497928.0    bta-mir-142    497928.0     7098.0    7684.0    14487.0    60216.0

                      s05        s06        s07  ...  UCSC browser  NCBI blastn  \
#miRNA                                           ...
bta-miR-486     3829302.0  2646240.0  7186343.0  ...             -            -
bta-miR-92a      129283.0    80550.0   191009.0  ...             -            -
bta-miR-191       40074.0    17177.0    45155.0  ...             -            -
bta-miR-25        55032.0    30991.0    47742.0  ...             -            -
bta-miR-142-5p    16127.0    11496.0    43682.0  ...             -            -

                consensus mature sequence  consensus star sequence  \
#miRNA
bta-miR-486         uccuguacugagcugccccga    cgggucagcucaguaccgggc
bta-miR-92a        uauugcacuugucccggccugu  agguugggaucgguugcaaugcu
bta-miR-191       caacggaaucccaaaagcagcug    cugcgcuuggauuucguuccc
bta-miR-25         cauugcacuugucucggucuga  aggcggagacuugggcaauugcu
bta-miR-142-5p       cccauaaaguagaaagcacu   aguguuuccuacuuuauggaug

                                                    consensus precursor sequence  \
#miRNA
bta-miR-486      uccuguacugagcugccccgaggcccuucgcugugcccagcucgggucagcucaguaccgggc
bta-miR-92a          agguugggaucgguugcaaugcuguguuucuguaugguauugcacuugucccggccugu
bta-miR-191     caacggaaucccaaaagcagcuguugucuccagagcauuccagcugcgcuuggauuucguuccc
bta-miR-25          aggcggagacuugggcaauugcuggacgcugccccgggcauugcacuugucucggucuga
bta-miR-142-5p   cccauaaaguagaaagcacuacuaacagcacuggaggguguaguguuuccuacuuuauggaug

                     precursor coordinate  novel  chr     seed      mean_norm
#miRNA
bta-miR-486       27:36261847..36261910:-  False   27  ccuguac  848061.757500
bta-miR-92a       12:66227265..66227324:+  False   12  auugcac   32037.220000
bta-miR-191       22:51543484..51543548:+  False   22  aacggaa    9999.677500
bta-miR-25        25:36892462..36892522:+  False   25  auugcac    9859.833333
bta-miR-142-5p      19:9527315..9527378:-  False   19  ccauaaa    5490.591667
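Two columns in this table have a structure that is easy to work with in code. The precursor coordinate strings follow the pattern chromosome:start..end:strand, and in every row shown the seed equals nucleotides 2 to 8 of the consensus mature sequence. The sketch below relies only on those two observations from the rows above. parse_coordinate and seed_of are hypothetical helpers, not part of smallrnaseq, and the pattern assumes chromosome names contain no colon.

```python
import re

# chromosome:start..end:strand, e.g. "27:36261847..36261910:-"
COORD_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+):(?P<strand>[+-])$"
)


def parse_coordinate(coord):
    """Split a precursor coordinate string into (chrom, start, end, strand)."""
    match = COORD_RE.match(coord)
    if match is None:
        raise ValueError("unrecognised coordinate: %r" % coord)
    return match["chrom"], int(match["start"]), int(match["end"]), match["strand"]


def seed_of(mature):
    """Return nucleotides 2 to 8 (1-based) of a mature sequence."""
    return mature[1:8]


print(parse_coordinate("27:36261847..36261910:-"))  # ('27', 36261847, 36261910, '-')
print(seed_of("uccuguacugagcugccccga"))  # ccuguac
```

Applied to a results DataFrame, df["precursor coordinate"].apply(parse_coordinate) would give one tuple per miRNA.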