forked from brevans/vcf2fa

Given a vcf and a set of bam files, generate consensus sequences for each individual (taking into account areas of low/no coverage)

singuyenmai/vcf2fa

 
 


vcf2fa

Given a vcf and a set of bam files, generate consensus sequences for each individual (taking into account areas of low/no coverage)

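### Required Software:

  • Python (to run gen_bed_files.py and vcf2fa.py)
  • bedtools (for bedtools multicov)
  • samtools and bcftools, if you generate the vcf as in the example under Prerequisites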

### Prerequisites:

  • reference fasta
  • sorted, indexed bam files for each sample, mapped to your reference
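  • vcf file corresponding to the bam files, e.g. generated with the following command (legacy samtools/bcftools 0.1.x syntax; newer releases use bcftools mpileup and bcftools call instead):

        samtools mpileup -uDf ref.fasta *.bam | bcftools view -vcg - > var.vcf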
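
Before running the pipeline it can help to confirm that every bam file actually has an index next to it. The following is a minimal sketch, not part of vcf2fa; the function name `missing_bam_indexes` is hypothetical, and it assumes the index sits beside the bam as either `sample.bam.bai` or `sample.bai`.

```python
from pathlib import Path


def missing_bam_indexes(bam_paths):
    """Return the bam paths that have no index file next to them.

    Assumption: an index is named either <file>.bam.bai or <file>.bai.
    """
    missing = []
    for bam in map(Path, bam_paths):
        candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
        if not any(c.exists() for c in candidates):
            missing.append(str(bam))
    return missing
```

An empty list means every bam has an index; anything else lists the files that still need `samtools index`.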

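### STEP 1:

Caveat: samples MUST be in the same order in the vcf as they are in the multicov bed file produced by bedtools! bedtools multicov writes one count column per bam file, in the order the files are given to -bams, so that order has to match the order of the sample columns in the vcf.

    # Generate a matrix of depth of coverage per sample per position:
    ./gen_bed_files.py reference.fa
    bedtools multicov -bams path/to/bams/*.bam -bed reference_single_base.bed > multicov.bed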
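
The ordering caveat can be checked before running STEP 2. The sketch below is not part of vcf2fa and all function names are hypothetical. It assumes the vcf sample columns are named after the bam files (the default for mpileup when the bams carry no read-group sample tags) and compares them by file name with any directory and `.bam` suffix stripped.

```python
import os


def vcf_sample_names(vcf_lines):
    """Return the sample columns from the #CHROM header line of a vcf."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed (CHROM..FORMAT); samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def stem(name):
    """Strip the directory and a trailing .bam so names can be compared."""
    base = os.path.basename(name)
    return base[:-4] if base.endswith(".bam") else base


def first_order_mismatch(vcf_names, bam_paths):
    """Return the first index where the two orders differ, or None."""
    a = [stem(n) for n in vcf_names]
    b = [stem(p) for p in bam_paths]
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One list is a prefix of the other; the mismatch is at its end.
        return min(len(a), len(b))
    return None
```

Pass `bam_paths` in the same order the shell expanded `path/to/bams/*.bam`. A result of `None` means the orders agree under the naming assumption above.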

### STEP 2:

    ./vcf2fa.py --min_cov 18 --multicov_file multicov.bed --vcf_file var.vcf
    # Generates a subfolder in the current directory containing one fasta file per locus;
    # each file holds the sequences of all individuals found in the vcf/bed files.
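
To illustrate what taking low/no coverage into account can look like, here is a minimal sketch of coverage masking for one sample at one locus. It is not the code in vcf2fa.py. The function name `consensus`, the use of `N` as the mask character, the inclusive threshold (depth equal to `min_cov` is kept), and the restriction to single-base substitutions are all assumptions made for illustration.

```python
def consensus(ref_seq, depths, alt_by_pos, min_cov=18, mask="N"):
    """Build one sample's sequence for one locus.

    ref_seq    -- reference bases for the locus
    depths     -- per-base coverage for this sample, same length as ref_seq
    alt_by_pos -- {0-based position: alternate base} for this sample's SNPs
    """
    if len(ref_seq) != len(depths):
        raise ValueError("ref_seq and depths must have the same length")
    out = []
    for i, (base, depth) in enumerate(zip(ref_seq, depths)):
        if depth < min_cov:
            # Too little coverage to trust either the reference or a variant call.
            out.append(mask)
        else:
            out.append(alt_by_pos.get(i, base))
    return "".join(out)


print(consensus("ACGT", [20, 5, 30, 18], {2: "T"}))  # prints ANTT
```

In the example, position 1 is masked because its depth (5) is below 18, and position 2 takes the alternate base because it is well covered.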
