SLIDESORT-BPR

Chromosomal rearrangement events are caused by abnormal breaking and rejoining of DNA molecules, and they are responsible for many cancer-related diseases. Detecting the DNA breaking and repairing mechanism may therefore offer vital clues about the pathologic causes and the diagnostic/therapeutic targets of these diseases. This effort poses considerable challenges, however, because both the structural variations and the genomes differ from one person to another, and an intermediate comparison via a reference genome can lose information. Unlike current methods that rely on a reference genome, Slidesort-BPR detects breakpoint reads directly by observing the differences between two (or more) NGS short-read samples. Slidesort-BPR is a command-line tool implemented in C++.

Input format

Input files must be in multi-FASTA format. The breakpoint reads detector accepts both DNA and protein sequences.

> seq 1
ATGCTAGCTGATACATCTAGCTCGTACGTACGTCAGTCGTAGT
CTGACTGACTAGCTAGCTAGCATCGTACGTACGTCGTAGCTAC
> seq 2
ATGCTAGCTGATACATCTAGCTCGTACGTACGTCAGTCGTAGT
CTGACTGACTAGCTAGCTAGCATCGTACGTACGTCGTAGCTAC
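
To check an input file before running the tool, the layout above can be read with a few lines of Python. This is only a minimal sketch of how a multi-FASTA file is structured (a `>` header line followed by one or more sequence lines that are joined together); the function name `parse_multi_fasta` is hypothetical and is not part of Slidesort-BPR.

```python
def parse_multi_fasta(text):
    """Parse multi-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines belonging to one record are concatenated.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """> seq 1
ATGCTAGCTGATACATCTAGCTCGTACGTACGTCAGTCGTAGT
CTGACTGACTAGCTAGCTAGCATCGTACGTACGTCGTAGCTAC
> seq 2
ATGCTAGCTGATACATCTAGCTCGTACGTACGTCAGTCGTAGT
CTGACTGACTAGCTAGCTAGCATCGTACGTACGTCGTAGCTAC
"""

records = parse_multi_fasta(example)
for name, seq in records:
    print(name, len(seq))
```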

Execution

Before running the tool, add the current directory to the shared library search path so that the bundled libraries (libslidesort.so and libmslidesort.so) can be loaded:

export LD_LIBRARY_PATH=.

Typical usage is as follows:

./bpr-detector -d <distance> -IC <input_control_fasta_file> -IT <input_target_fasta_file> -sl <size_of_split_sequence> -b <value_of_TAU> -o <output_file> -M I

(-M is set to I by default.)

Sequences can include unknown characters. By default, Slidesort-BPR excludes sequences that contain unknown characters such as N or X. To include these sequences, use the -u option.
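
When the tool is called from a script, the usage line can be assembled as an argument list. The sketch below uses only the options shown above; the helper name `build_bpr_command`, the parameter values, and the choice of normal.fasta as control and cancer.fasta as target are assumptions made for illustration.

```python
import shlex


def build_bpr_command(distance, control_fasta, target_fasta, split_size, tau,
                      output_file, include_unknown=False):
    """Assemble a bpr-detector argument list from the documented options."""
    args = [
        "./bpr-detector",
        "-d", str(distance),
        "-IC", control_fasta,
        "-IT", target_fasta,
        "-sl", str(split_size),
        "-b", str(tau),
        "-o", output_file,
        "-M", "I",
    ]
    if include_unknown:
        # -u keeps sequences that contain unknown characters.
        args.append("-u")
    return args


# Hypothetical values; adjust them to your data.
cmd = build_bpr_command(2, "normal.fasta", "cancer.fasta", 30, 0.5, "out.txt")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once `LD_LIBRARY_PATH` has been exported as described above.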

Options

Basic options:

-d  distance threshold
-IT input target file name
-IC input control file name
-o  output filename
-sl size of split sequence
-M  method of calculating the M-value. I: calculate from the mean degree of the target, T: calculate from the table (default=I)
-MD depth (required when -M T is set)
-ME error rate (required when -M T is set)
-b  parameter of TAU
-t  distance type. E: edit distance, H: Hamming distance (default=E)
-v  search with both the original sequence and its reverse complement.
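
The two distance types selected with -t differ in what they count. The following sketch illustrates the textbook definitions, assuming unit costs for the edit distance; it is not the tool's implementation, which additionally takes the gap costs set by -g and -G into account.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Unit-cost edit (Levenshtein) distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]


# A read shifted by one base: every position mismatches,
# but one deletion plus one insertion aligns the two reads.
a = "ACGTACGT"
b = "CGTACGTA"
print(hamming_distance(a, b), edit_distance(a, b))
```

With a threshold of -d 2, this pair would pass under the edit distance but not under the Hamming distance.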

Advanced options:

-c  type of input string.
    DNA: DNA seq, PROTEIN: protein seq, INT: integer seq (default=DNA)
-g  gap extension cost (default=1, must be a positive value; a larger value is recommended to avoid slowing down the search)
-G  gap open cost (default=0, must be a positive value)
-k  size of sorting key
-u  do not exclude sequences with unknown characters, e.g. n, N, Z.
-V  output the same pair twice if dist(A,B)<=d and dist(A,B')<=d, where B' is the reverse complement of B (use with -v).
-mt number of threads (multi-threading mode)
-mp number of tasks (multi-threading mode)
-mr ratio of tasks: <number of tasks> = <number of threads> * <this value> (multi-threading mode)
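
The condition behind -v and -V can be illustrated with a small sketch. It reflects one reading of the option descriptions above, uses the Hamming distance for brevity, and the function names are hypothetical; it does not reproduce the tool's search algorithm.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def strand_matches(a, b, d):
    """List the orientations of b that lie within distance d of a."""
    hits = []
    if hamming(a, b) <= d:
        hits.append("forward")
    if hamming(a, reverse_complement(b)) <= d:
        hits.append("reverse_complement")
    return hits


# "GTTT" matches "AAAC" only through its reverse complement.
print(strand_matches("AAAC", "GTTT", 1))
# "ACGT" is its own reverse complement, so both orientations match;
# this is the case in which -V reports the pair twice.
print(strand_matches("ACGT", "ACGT", 0))
```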

Publication

Edward Wijaya, Kana Shimizu, Kiyoshi Asai and Michiaki Hamada (2014). Reference-free prediction of rearrangement breakpoint reads. Bioinformatics, 30(18):2559-67. PMID: 24876376
