Command line interface

Damien Farrell edited this page Feb 18, 2017 · 13 revisions

Installing the package provides the command smallrnaseq in your path. This gives users a command line interface to the library without the need for any Python coding. It provides a set of pre-defined functions with parameters specified in a text configuration file.

Usage

Usage involves only preparing your input files and setting up the config file. Running the command smallrnaseq -r will create a default config file for you, which you can then edit. Then run:

smallrnaseq -c default.conf -r

Config file settings

The advantage of configuration files is that they avoid long commands that have to be re-typed and are prone to mistakes. The files can also be kept as a record of the settings used, or copied for another set of files. The options currently available in the file are as follows. The meaning of each option is explained below. If you are unsure about a setting or don't require it, leave it at the default.

[base]
filename = 
path = testfiles 
filetype = fastq
index_path = indexes 
aligner = bowtie  
bowtie_params = -v 1 --best 
ref_genome =   
features = 
indexes = RFAM,mirbase-hsa 
output = smrna_results  
counting = default
add_labels = 0 
mirbase = 0 
species = bta 
pad5 = 3
pad3 = 5

Settings explained:

| name | example value | meaning |
|------|---------------|---------|
| filename | test.fastq | input fastq file with reads |
| path | testfiles | folder containing fastq files, used instead of a single file |
| filetype | fastq | format of the input files |
| index_path | indexes | location of bowtie or subread indexes |
| aligner | bowtie | which aligner to use, bowtie or subread |
| bowtie_params | -v 1 --best | alignment parameters |
| ref_genome | hg19 | reference genome index name |
| features | Homo_sapiens.GRCh37.75.gtf | genome annotation file |
| indexes | RFAM,mirbase-hsa | names of annotated library indexes to map to |
| output | smrna_results | output folder for temp files |
| counting | default | method of feature counting |
| add_labels | 0 | whether to add labels to replace the file names in the results |
| mirbase | 0 | map to mirbase only |
| species | bta | mirbase species to use |
| pad5 | 3 | number of 5' flanking bases to add when generating mature mirbase sequences |
| pad3 | 5 | number of 3' flanking bases to add |
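
Because the config file uses an INI-style layout with a single [base] section, it can also be inspected from a script. The following is a minimal sketch using Python's standard configparser module. It is not how smallrnaseq itself reads the file, and the helper name read_base_options is made up for illustration.

```python
import configparser

# Sample text mirroring the default options listed above.
DEFAULT_CONF = """\
[base]
filename =
path = testfiles
filetype = fastq
index_path = indexes
aligner = bowtie
bowtie_params = -v 1 --best
ref_genome =
features =
indexes = RFAM,mirbase-hsa
output = smrna_results
counting = default
add_labels = 0
mirbase = 0
species = bta
pad5 = 3
pad3 = 5
"""

def read_base_options(text):
    """Parse config text and return the [base] section as a dict of strings.

    Hypothetical helper for illustration; not part of smallrnaseq.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["base"])

options = read_base_options(DEFAULT_CONF)

# All values come back as plain strings, so convert them where needed.
index_names = [n.strip() for n in options["indexes"].split(",") if n.strip()]
pad5 = int(options["pad5"])

print(options["aligner"], index_names, pad5)
```

Options left blank in the file, such as filename, are returned as empty strings.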

Example

Say we have a set of fastq files in the folder testfiles in which we want to count miRNAs. We would simply set the options mirbase = 1 and path = testfiles. If your file names are long and you want to replace them with short ids, set add_labels = 1. This also writes out a file called samplelabels.csv in the output folder. Note that if just mapping to mirbase we don't have to set an index, since it is generated automatically.
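
If you prefer to script these edits rather than change the file by hand, something like the sketch below could work. It assumes the config is plain INI text with a [base] section, as shown earlier, and uses only the Python standard library. The function name set_mirna_options is hypothetical and not part of smallrnaseq. Note that configparser rewrites the file in its own layout and drops any comments.

```python
import configparser
import io

def set_mirna_options(conf_text, path="testfiles", add_labels=False):
    """Return config text with the miRNA counting options from the example set.

    Hypothetical helper for illustration; not part of smallrnaseq.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    base = parser["base"]
    base["mirbase"] = "1"          # map to mirbase only
    base["path"] = path            # folder containing the fastq files
    base["add_labels"] = "1" if add_labels else "0"
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()

# Small stand-in for the contents of a default config file.
example = "[base]\npath =\nmirbase = 0\nadd_labels = 0\nspecies = bta\n"
updated = set_mirna_options(example, add_labels=True)
print(updated)
```

The returned text can be saved back to the config file, after which the analysis is run as before with smallrnaseq -c default.conf -r.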

Outputs

The main outputs are csv files with the counts for each sample in a column, along with normalised counts. These csv files can be opened in a spreadsheet.
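
The csv outputs can also be loaded in a script instead of a spreadsheet. The sketch below uses Python's standard csv module to sum the counts in each sample column. The table contents and column names here are invented purely for illustration, and the sketch assumes the first column holds the feature name; check the header of your own output files before relying on this layout.

```python
import csv
import io

def read_counts(handle):
    """Return (column names, list of row dicts) from an open csv file."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows

# Made-up table for illustration only; real column names will differ.
sample = io.StringIO("name,sample1,sample2\nmir-a,10,0\nmir-b,5,7\n")
columns, rows = read_counts(sample)

# Assumption: the first column is the feature name, the rest are counts.
totals = {c: sum(float(r[c]) for r in rows) for c in columns[1:]}
print(totals)
```

To read a real results file, pass an open file handle to read_counts in place of the in-memory sample.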