A repository of pipelines for running the EnsEMBL VEP

alliance-genome/agr_vep

Alliance of Genome Resources VEP pipelines

A collection of pipelines for tasks associated with VEP annotation of variations. The code contained within this repository is designed to run on the EBI's compute farm using the EnsEMBL Hive pipeline manager.

See the README files in each folder for details of the individual pipelines.

VepProteinFunction

  • Generates databases of SIFT and PolyPhen scores and predictions for the MOD species.
  • Databases are accessed by a VEP plugin to add SIFT/PolyPhen annotation to the VEP output.
  • Translated protein sequences are constructed from FASTA, GFF, and (optionally) BAM files.
  • Serialized prediction matrices, each covering all possible amino acid substitutions for a sequence, are stored and retrieved by the hex MD5 digest of that sequence.
  • FULL mode generates the database from scratch.
  • UPDATE mode skips sequences that already have a prediction matrix, as well as sequences previously recorded as having a valid reason for lacking protein function annotations, and updates the database for new sequences only.
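
The keying and mode logic described above can be sketched roughly as follows. This is a minimal illustration in Python, not the pipeline's own code: the function names, the `stored_keys` and `failed_keys` sets, and the plain string mode flag are all assumptions made for the example.

```python
import hashlib


def sequence_key(protein):
    """Return the hex MD5 digest used to look up a protein sequence."""
    return hashlib.md5(protein.encode("ascii")).hexdigest()


def select_sequences(proteins, stored_keys, failed_keys, mode="UPDATE"):
    """Pick the sequences that still need prediction matrices.

    stored_keys: digests that already have a matrix (hypothetical input).
    failed_keys: digests recorded as validly un-annotatable (hypothetical input).
    In FULL mode everything is selected; in UPDATE mode known digests are skipped.
    """
    selected = {}
    for protein in proteins:
        key = sequence_key(protein)
        if mode == "UPDATE" and (key in stored_keys or key in failed_keys):
            continue
        # Identical sequences share a digest, so each is processed only once.
        selected[key] = protein
    return selected
```

Because the key depends only on the sequence, transcripts that translate to the same protein share a single matrix in this sketch.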

HumanVep

  • Runs VEP on a human variant VCF file (obtained from the EnsEMBL FTP site).
  • Splits input files, runs VEP in parallel, then combines the output.
  • Uses the EnsEMBL merged (EnsEMBL & RefSeq) cached database to retrieve VEP annotations.
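
The split, parallel run, and combine pattern can be illustrated with a small Python sketch. The real pipelines run under the EnsEMBL Hive on a compute farm; here a thread pool stands in for the farm, and `annotate_chunk` is a hypothetical placeholder for a VEP run that simply tags each record. All function names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_vcf(lines, records_per_chunk):
    """Split VCF lines into chunks, repeating the header in every chunk."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if not line.startswith("#")]
    chunks = [
        header + records[i:i + records_per_chunk]
        for i in range(0, len(records), records_per_chunk)
    ]
    # A header-only file still yields one chunk so the header is not lost.
    return chunks or [header]


def annotate_chunk(chunk):
    """Placeholder for running VEP on one chunk (hypothetical stand-in)."""
    return [
        line if line.startswith("#") else line + "\tANNOTATED"
        for line in chunk
    ]


def combine(chunks):
    """Concatenate chunk outputs, keeping header lines from the first only."""
    combined = []
    for index, chunk in enumerate(chunks):
        for line in chunk:
            if line.startswith("#") and index > 0:
                continue
            combined.append(line)
    return combined


def run_in_parallel(lines, records_per_chunk, workers=4):
    """Split, annotate each chunk concurrently, then merge in input order."""
    chunks = split_vcf(lines, records_per_chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        annotated = list(pool.map(annotate_chunk, chunks))
    return combine(annotated)
```

`pool.map` returns results in the order the chunks were submitted, so the combined output keeps the record order of the input file.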

ModVep

  • Runs VEP on MOD high-throughput variation VCF files.
  • Splits input files, runs VEP in parallel, then combines the output.
  • Uses MOD GFF, FASTA, and (optionally) BAM files to construct translated protein sequences.
  • Retrieves SIFT and PolyPhen annotations from databases generated by the VepProteinFunction pipeline.
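
As a rough picture of what constructing a translated protein sequence involves, the Python sketch below splices coding segments out of a chromosome string and translates them with the standard genetic code. It is an illustration only: the pipeline's actual handling of GFF, FASTA, and BAM input is not shown in this README, the function names are hypothetical, and details such as GFF phase and alternative codon tables are ignored.

```python
# Standard genetic code, enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def spliced_cds(chrom_seq, segments, strand):
    """Join CDS segments given as 1-based inclusive (start, end) pairs.

    Coordinates follow the GFF convention; minus-strand features are
    reverse-complemented after joining.
    """
    cds = "".join(chrom_seq[start - 1:end] for start, end in sorted(segments))
    if strand == "-":
        cds = cds.translate(COMPLEMENT)[::-1]
    return cds


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[pos:pos + 3].upper(), "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Codons containing ambiguous bases are reported as `X` in this sketch instead of raising an error.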
