GrandSTR

Estimation of repeat counts of short tandem repeats(STR) from long-read sequencing data. Get genotypes for known STR.

Dependencies

Python packages:

python: 3.6 or higher
pysam: 0.16.0 or higher
sklearn: 0.24.1 or higher
hmmlearn: 0.2.5 or higher
edlib: 1.2.6 or higher, we need edlib.so file for python bindings.

Dependencies for install:

cython: 0.29.21 or higher

Install

To build align.so, GrandSTR_lib.so, utils_lib.so, deal_str.so, run:

python setup.py build_ext -i

Usage & Examples

Input files

Input bam file is alignment file of read sequences aligned to reference sequences, which is typically generated by minimap2 (versoin 2.17) with parameters "-ax asm10 --MD -Y -L --secondary=no" for hifi reads, or "-ax map-ont --MD -Y -L --secondary=no" for ONT reads. For example:

minimap2 -t 16 -ax asm10 --MD -Y -L --secondary=no hg19.fasta hifi.fastq 2> align.log | samtools view -Sb - | samtools sort - -o hifi.sorted.bam

Input pa file is comma seperated information file, including coordinates of STR regions in reference, and repeat unit sequence. The required columns include STR name, chromosome, start coordinate, end coordinate, and repeat unit sequence. The left columns are optional. For example:

STR000009,1,691243,691307,CACCC,0,,downstream,LOC100288069

Input fasta file is reference genome fasta file.

Commands

For small amount of input STRs provided in pa file, add "-em 0" parameter to GrandSTR program. For example:

cd test/
samtools index hifi.sorted.bam
samtools faidx hg19.fasta
../GrandSTR test1.pa out1 -rf hg19.fasta -bf hifi.sorted.bam -em 0 -rt hifi

For large amount of input STRs provided in pa file, add "-em 1" parameter to GrandSTR program. For example:

cd test/
samtools index hifi.sorted.bam
../GrandSTR test1.pa out1 -rf hg19.fasta -bf hifi.sorted.bam -em 1 -rt hifi

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
test		test
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GrandSTR

Dependencies

Install

Usage & Examples

Input files

Commands

About

Releases

Packages

Contributors 2

Languages

Nextomics/GrandSTR

Folders and files

Latest commit

History

Repository files navigation

GrandSTR

Dependencies

Install

Usage & Examples

Input files

Commands

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages