Home

Welcome to the RNA_Editing_Detection_Pipeline wiki!

Usage:

Download Reference Data

Create a tab-delimited file listing the URLs of all required reference data. Keep the first column identical to the example.

Example reference_data.txt:

genome  ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_30/GRCh37_mapping/GRCh37.primary_assembly.genome.fa.gz
genome_annotation       ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_30/GRCh37_mapping/gencode.v30lift37.annotation.gtf.gz
strand_detection        https://sourceforge.net/projects/rseqc/files/BED/Human_Homo_sapiens/hg19_RefSeq.bed.gz
rmsk    http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/rmsk.txt.gz
dbSNP   http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz
rediportal_db   http://srv00.recas.ba.infn.it/webshare/rediportalDownload/table1_full.txt.gz
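
Because the first column has to match the example exactly, it can help to check the file before starting a long download. The following is a minimal sketch, not part of the pipeline: `parse_reference_table` and `missing_keys` are hypothetical helper names, and the list of required keys is simply copied from the example above.

```python
# Keys taken from the first column of the example reference_data.txt.
REQUIRED_KEYS = (
    "genome",
    "genome_annotation",
    "strand_detection",
    "rmsk",
    "dbSNP",
    "rediportal_db",
)


def parse_reference_table(text):
    """Parse 'name<TAB>url' lines into a dict, skipping blank lines."""
    entries = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated columns, "
                f"got {len(fields)}"
            )
        name, url = fields[0].strip(), fields[1].strip()
        entries[name] = url
    return entries


def missing_keys(entries):
    """Return the required keys that are absent, in the example's order."""
    return [key for key in REQUIRED_KEYS if key not in entries]


if __name__ == "__main__":
    sample = (
        "rmsk\thttp://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/rmsk.txt.gz\n"
        "dbSNP\thttp://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/snp151.txt.gz\n"
    )
    table = parse_reference_table(sample)
    print("missing:", missing_keys(table))
```

A line separated by spaces instead of a tab raises a `ValueError` here, which is the most likely formatting mistake when the file is edited by hand.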

Run get_reference_data.py to download all required data into the specified directory.

Parameters:

-i or --input: path to the tab-delimited file containing the data URLs
-o or --output: path to output directory

Example:

nohup python3 get_reference_data.py -i reference_data.txt -o output_path &

Index Genome for STAR

Run index_genome_STAR.py to index the genome for STAR.

Parameters:

-f or --fasta: path to genome fasta file
-a or --gtf_annotation: path to genome gtf annotation
-o or --output: path to output directory

Example:

nohup python3 index_genome_STAR.py -f genome.fa -a annotation.gtf -o index_output/ &
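
For orientation, a STAR index build is normally a single `STAR --runMode genomeGenerate` call. The sketch below only assembles such a command line; it is an assumption about what an indexing wrapper might run, not the actual contents of index_genome_STAR.py. The function name `star_index_command` and the default thread count are hypothetical.

```python
import shlex


def star_index_command(fasta, gtf, out_dir, threads=4):
    """Build (but do not run) a STAR genomeGenerate command as an argument list."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", out_dir,          # where the index files are written
        "--genomeFastaFiles", fasta,     # genome FASTA
        "--sjdbGTFfile", gtf,            # annotation used for splice junctions
        "--runThreadN", str(threads),    # hypothetical default of 4 threads
    ]


cmd = star_index_command("genome.fa", "annotation.gtf", "index_output/")
print(shlex.join(cmd))
```

STAR expects uncompressed FASTA and GTF input, while the reference URLs above point to .gz files. This page does not say whether the download step decompresses them, so it is worth checking before indexing.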

Retrieve Fastq Files from SRA

Create a text file containing a list of SRA accession numbers.
Run get_SRA_data.py to download the data.

Parameters:

-a or --acc_list: path to file containing list of SRA accession numbers
-o or --output: path to output directory

Example:

nohup python3 get_SRA_data.py -a acc.txt -o output_path &
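
A malformed accession usually only shows up after the download has been running for a while, so a quick check of the list can save time. This sketch assumes one accession per line (the page does not specify the layout) and accepts run accessions of the form SRR, ERR, or DRR followed by digits. `read_accessions` is a hypothetical helper, not part of the pipeline.

```python
import re

# Run accessions: SRR (NCBI), ERR (ENA) or DRR (DDBJ) followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def read_accessions(text):
    """Split the file content into (valid, invalid) accession lists."""
    valid, invalid = [], []
    for line in text.splitlines():
        accession = line.strip()
        if not accession:
            continue  # ignore blank lines
        if RUN_ACCESSION.match(accession):
            valid.append(accession)
        else:
            invalid.append(accession)
    return valid, invalid


if __name__ == "__main__":
    valid, invalid = read_accessions("SRR1234567\nERR000111\nnot_an_accession\n")
    print("valid:", valid)
    print("invalid:", invalid)
```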

Trim RNAseq Reads

Run fastp.py to trim the RNAseq reads.

Parameters:

-se or --single_end: flag for single-end data; place it before the other parameters
-f or --fastq_dir: path to fastq directory
-o or --output: path to output directory

Example:

PE data

nohup python3 fastp.py -f fastq_dir -o output_dir &

SE data

nohup python3 fastp.py -se -f fastq_dir -o output_dir &
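
The difference between the two modes comes down to whether fastp receives a second input/output pair. The sketch below builds a fastp argument list for one sample using fastp's `-i`/`-o` (read 1) and `-I`/`-O` (read 2) options. It illustrates the idea and is not taken from fastp.py; the `trimmed_` output prefix and the function name `fastp_command` are assumptions.

```python
import posixpath


def fastp_command(out_dir, read1, read2=None):
    """Build a fastp command; paired-end if read2 is given, single-end otherwise."""

    def trimmed(path):
        # Hypothetical naming scheme: prefix the original file name.
        return posixpath.join(out_dir, "trimmed_" + posixpath.basename(path))

    cmd = ["fastp", "-i", read1, "-o", trimmed(read1)]
    if read2 is not None:
        cmd += ["-I", read2, "-O", trimmed(read2)]
    return cmd


if __name__ == "__main__":
    print(fastp_command("output_dir", "fastq_dir/sample_1.fastq.gz",
                        "fastq_dir/sample_2.fastq.gz"))
    print(fastp_command("output_dir", "fastq_dir/sample.fastq.gz"))
```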

Align RNAseq Reads

Make sure the genome has been indexed for STAR (see Index Genome for STAR).
Run align_STAR.py to align paired-end data to the genome.

Parameters:

-f or --fastq_dir: path to directory containing fastq files
-g or --genome_idx: path to STAR genome index
-o or --output: path to output directory

Example:

nohup python3 align_STAR.py -f fastq_dir -g genome_index -o output_dir &
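
Since the script takes a directory rather than individual files, the mates of each sample have to be matched by file name. The sketch below shows one way to do that, assuming the common `<sample>_1.fastq[.gz]` / `<sample>_2.fastq[.gz]` naming. How align_STAR.py actually pairs files is not documented here, and `pair_fastq_files` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed naming: <sample>_1.fastq(.gz) and <sample>_2.fastq(.gz); .fq also accepted.
MATE_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.(?:fastq|fq)(?:\.gz)?$")


def pair_fastq_files(filenames):
    """Group file names into {sample: (read1, read2)} plus a list of unpaired files."""
    mates = defaultdict(dict)
    for name in filenames:
        match = MATE_PATTERN.match(name)
        if match:  # files that do not follow the naming scheme are ignored
            mates[match.group("sample")][match.group("mate")] = name

    pairs, unpaired = {}, []
    for sample in sorted(mates):
        found = mates[sample]
        if "1" in found and "2" in found:
            pairs[sample] = (found["1"], found["2"])
        else:
            unpaired.extend(found.values())
    return pairs, sorted(unpaired)


if __name__ == "__main__":
    pairs, unpaired = pair_fastq_files(
        ["SRR1_1.fastq.gz", "SRR1_2.fastq.gz", "SRR2_1.fastq.gz"]
    )
    print(pairs)
    print("unpaired:", unpaired)
```

Reporting unpaired files explicitly makes an incomplete download visible before the alignment starts.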

Download Fastq Files of WGS from SRA

Create a text file containing a list of ERR accession numbers.
Run get_WGS_data.py to download the data.

Parameters:

-a or --acc_list: path to file containing list of SRA accession numbers
-o or --output: path to output directory

Example:

nohup python3 get_WGS_data.py -a acc.txt -o output_path &

Align DNAseq Reads

Run align_bwa.py to align paired-end data to the genome

Parameters:

-fq or --fastq_dir: path to directory containing fastq files
-fa or --fasta_dir: path to directory containing genome fasta file

Example:

nohup python3 align_bwa.py -fq fastq_dir -fa fasta_dir &
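
A paired-end BWA alignment typically consists of two steps: indexing the reference with `bwa index`, then aligning with `bwa mem`, which writes SAM to standard output. The sketch below only assembles these two commands. It is an assumption about the general workflow, not a description of align_bwa.py, and the file names and the helper `bwa_commands` are hypothetical.

```python
import shlex


def bwa_commands(fasta, read1, read2, threads=4):
    """Return (index_cmd, mem_cmd) argument lists for a paired-end BWA-MEM run."""
    index_cmd = ["bwa", "index", fasta]
    mem_cmd = ["bwa", "mem", "-t", str(threads), fasta, read1, read2]
    return index_cmd, mem_cmd


index_cmd, mem_cmd = bwa_commands(
    "fasta_dir/genome.fa",
    "fastq_dir/sample_1.fastq.gz",
    "fastq_dir/sample_2.fastq.gz",
)
print(shlex.join(index_cmd))
# bwa mem prints SAM to stdout, so the output is usually redirected to a file.
print(shlex.join(mem_cmd) + " > sample.sam")
```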

Select and map reads to a chromosome

Run select_map_chr.py to select and map reads to a specific chromosome

Parameters:

-g or --genome_dir: path to directory containing the genome .fai file
-s or --sam_dir: path to directory containing the genome sam file
-o or --output_dir: path to the directory in which to store the output files
-chr or --chrNum: chromosome to select, given as 'chr[Int]' (e.g. -chr chr21)

Example:

nohup python3 select_map_chr.py -g genome_dir -s sam_dir -o output_dir -chr chrNum &
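
Conceptually, selecting one chromosome from a SAM file means keeping the header lines and the alignment records whose reference name (column 3, RNAME) matches. The sketch below illustrates that idea on SAM text lines. It is not the implementation of select_map_chr.py, and both function names are hypothetical. The name check follows the 'chr[Int]' format stated above, so a name such as chrX would be rejected by it.

```python
import re


def is_valid_chr_name(name):
    """Check the documented 'chr[Int]' format, e.g. 'chr21'."""
    return re.fullmatch(r"chr\d+", name) is not None


def filter_sam_by_chromosome(lines, chrom):
    """Yield SAM header lines and the records aligned to `chrom`."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep the header unchanged
            continue
        fields = line.split("\t")
        # Column 3 of a SAM record is the reference sequence name (RNAME).
        if len(fields) > 2 and fields[2] == chrom:
            yield line


if __name__ == "__main__":
    sam_lines = [
        "@HD\tVN:1.6\n",
        "read1\t0\tchr21\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "read2\t0\tchr1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    ]
    for kept in filter_sam_by_chromosome(sam_lines, "chr21"):
        print(kept, end="")
```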
