MiGEC: Molecular Identifier Guided Error Correction pipeline

This pipeline provides several useful tools for the analysis of immune repertoire sequencing data. Its main feature is the ability to use information from unique nucleotide tags (UMIs, see this paper for details), which are attached to molecules before sequencing library preparation and allow the original sequence of each molecule to be traced back. UMIs make it possible to computationally filter nearly all experimental errors from the resulting immune receptor sequences.
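The idea behind UMI-guided correction can be illustrated with a minimal Python sketch. This is not MiGEC's actual implementation: the function name `umi_consensus`, the input format (pairs of UMI and read sequence) and the simple per-position majority vote are assumptions made purely for illustration, and reads sharing a UMI are assumed to be already aligned and of equal length.

```python
from collections import Counter, defaultdict


def umi_consensus(tagged_reads):
    """Group reads by UMI and return one majority-vote consensus per UMI.

    tagged_reads: iterable of (umi, sequence) pairs. Reads that share a UMI
    are assumed to have equal length (a simplification for this sketch).
    """
    groups = defaultdict(list)
    for umi, seq in tagged_reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Take the most frequent base at each position across the group.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Hypothetical toy data: three reads from one molecule, one from another.
reads = [
    ("ACGT", "TGTGCCAGC"),
    ("ACGT", "TGTGCCAGC"),
    ("ACGT", "TGTGACAGC"),  # this read carries a single-base error
    ("GGCA", "TGTGCTAGC"),
]
print(umi_consensus(reads))
```

In this toy example the erroneous base in the third read is outvoted by the two correct reads that carry the same UMI, so only one sequence per starting molecule remains.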

This pipeline was designed for libraries sequenced on Illumina MiSeq and HiSeq instruments. The main requirement for sequencing reads is that they contain the entire CDR3 region of the immune receptor gene. For optimal error elimination, use sequencing libraries with high over-sequencing, i.e. ones that have 5+ reads per starting molecule (unique UMI tag).
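As a rough way to check the over-sequencing guideline on a dataset, one could count reads per UMI and see how many UMIs reach the 5-read mark. The sketch below is an assumption-laden illustration, not part of MiGEC: the function name `split_by_coverage`, its `min_reads` parameter and the input format (one UMI string per read) are hypothetical.

```python
from collections import Counter


def split_by_coverage(umis, min_reads=5):
    """Count reads per UMI and separate well-covered UMIs from the rest.

    umis: iterable with one UMI string per read.
    min_reads: threshold taken from the "5+ reads per molecule" guideline.
    """
    counts = Counter(umis)
    kept = {u: n for u, n in counts.items() if n >= min_reads}
    dropped = {u: n for u, n in counts.items() if n < min_reads}
    return kept, dropped


# Hypothetical toy data: three molecules with 6, 2 and 5 reads.
read_umis = ["ACGT"] * 6 + ["GGCA"] * 2 + ["TTAG"] * 5
kept, dropped = split_by_coverage(read_umis)
print("sufficient coverage:", kept)
print("insufficient coverage:", dropped)
```

If most UMIs end up in the second group, the library is probably not over-sequenced enough for reliable error elimination.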

Several modules of the pipeline, such as de-multiplexing and CDR3 extraction, can be used for a wider range of datasets.

For more details please see the paper describing MiGEC.

Full documentation is provided via ReadTheDocs. You might also be interested in the following tutorial.

Please cite the tool as:

Shugay M et al. Towards error-free profiling of immune repertoires. Nature Methods 11, 653–655 (2014)

