Command line interface

Installing the package provides the command smallrnaseq in your path. This is a command line interface to the library that allows it to be used without any Python coding. It provides a set of pre-defined functions with parameters specified in a text configuration file.

Usage

Usage involves only setting up the config file and preparing your input files. Running the command smallrnaseq -r will create a default config file for you, which you can then edit. Then run:

smallrnaseq -c default.conf -r

Config file settings

Configuration files avoid long commands that have to be re-typed and are prone to mistakes. The files can also be kept as a record of the settings used, or copied for another set of files. The options currently available in the file are shown here, and the meaning of each option is explained below. If you are unsure about a setting or don't require it, leave it at the default.

[base]
filename = 
path = testfiles 
filetype = fastq
index_path = indexes 
aligner = bowtie  
bowtie_params = -v 1 --best 
ref_genome =   
features = 
indexes = RFAM,mirbase-hsa 
output = smrna_results  
counting = default
add_labels = 0 
mirbase = 0 
species = bta 
pad5 = 3
pad3 = 5

Settings explained:

name	example value	meaning
filename	test.fastq	input fastq file with reads
path	testfiles	folder containing fastq files instead of a single file
filetype	fastq
index_path	indexes	location of bowtie or subread indexes
aligner	bowtie	which aligner to use, bowtie or subread
bowtie_params	-v 1 --best	alignment parameters
ref_genome	hg19	reference genome index name
features	Homo_sapiens.GRCh37.75.gtf	genome annotation file
indexes	RFAM,mirbase-hsa	names of annotated library indexes to map to
output	smrna_results	output folder for temp files
counting	default	method of feature counting
add_labels	0	whether to add labels to replace the file names in the results
mirbase	0	map to mirbase only
species	bta	mirbase species to use
pad5	3	5' flanking bases to add when generating mature mirbase sequences
pad3	5	3' flanking bases to add
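
To illustrate how these settings map onto typed values, here is a minimal sketch that reads the default file with Python's standard configparser module. It is not necessarily how smallrnaseq parses the file internally; the helper name load_settings and the choice to convert the 0/1 flags to booleans are assumptions made for this example.

```python
import configparser

# Sample settings copied from the default config file shown above.
CONFIG_TEXT = """
[base]
filename =
path = testfiles
filetype = fastq
index_path = indexes
aligner = bowtie
bowtie_params = -v 1 --best
ref_genome =
features =
indexes = RFAM,mirbase-hsa
output = smrna_results
counting = default
add_labels = 0
mirbase = 0
species = bta
pad5 = 3
pad3 = 5
"""

def load_settings(text):
    """Parse the [base] section into a dict with typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    base = parser["base"]
    settings = dict(base)
    # The 0/1 flags become booleans (assumption: they act as on/off switches).
    for key in ("add_labels", "mirbase"):
        settings[key] = base.getboolean(key)
    # The padding values become integers.
    for key in ("pad5", "pad3"):
        settings[key] = base.getint(key)
    # The comma-separated index names become a list.
    settings["indexes"] = [n.strip() for n in base["indexes"].split(",") if n.strip()]
    return settings

settings = load_settings(CONFIG_TEXT)
print(settings["aligner"], settings["indexes"], settings["pad5"])
```

Options left blank, such as filename and ref_genome, come back as empty strings, which makes it easy to check which settings have not been filled in.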

Example

Say we have a set of fastq files in the folder 'testfiles' in which we want to count miRNAs. We would simply set the options mirbase = 1 and path = testfiles. If your file names are long and you want to replace them with short ids, set add_labels = 1. This also writes out a file called 'samplelabels.csv' in the output folder. Note that if we are only mapping to mirbase we don't have to set an index file, since it is generated automatically.
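
The same edits can be scripted rather than made by hand. The sketch below uses Python's standard configparser module to apply the three settings from this example to existing config text. The helper name make_mirna_config and the shortened stand-in for the default file are assumptions made for illustration; a real default file contains all of the options listed earlier.

```python
import configparser
import io

def make_mirna_config(default_text):
    """Return config text with this example's settings applied (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(default_text)
    base = parser["base"]
    base["path"] = "testfiles"   # folder containing the fastq files
    base["mirbase"] = "1"        # map to mirbase only
    base["add_labels"] = "1"     # replace long file names with short ids
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()

# Shortened stand-in for the default file, used only to keep the sketch small.
default_text = "[base]\npath =\nmirbase = 0\nadd_labels = 0\nspecies = bta\n"
updated = make_mirna_config(default_text)
print(updated)
```

The printed text can be saved to a file and passed to the command with -c as shown in the Usage section.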

Outputs

The main outputs are csv files with the counts for each sample in a column, along with a normalised count column. These csv files can be opened in a spreadsheet.
