immunarch --- Fast and Seamless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires in R
- Work with any type of data: single-cell, bulk, data tables, databases --- you name it.
- Community at the heart: ask questions, share knowledge and thrive in the community of almost 30,000 researchers and medical scientists worldwide. Pfizer, Novartis, Regeneron, Stanford, UCSF and MIT trust us.
- One plot --- one line: write a whole PhD thesis in 8 lines of code or reproduce almost any publication in 5-10 lines of immunarch code.
- Be on the bleeding edge of science: we regularly update immunarch with the latest methods. Let us know what you need!
- Automatic format detection and parsing for all popular immunosequencing formats: from MiXCR and ImmunoSEQ to 10XGenomics and ArcherDX.
install.packages("immunarch") # Install the package
library(immunarch); data(immdata) # Load the package and the test dataset
repOverlap(immdata$data) %>% vis() # Compute and visualise the most important statistics:
geneUsage(immdata$data[[1]]) %>% vis() # public clonotypes, gene usage, sample diversity
repDiversity(immdata$data) %>% vis(.by = "Status", .meta = immdata$meta) # Group samples
immunarch is brought to you by ImmunoMind --- a UC Berkeley SkyDeck startup. ImmunoMind Data Science tools for single-cell and immunomics exploration and biomarker discovery are trusted by researchers from top pharma companies and universities, including 10X Genomics, Pfizer, Regeneron, UCSF, MIT, Stanford, Johns Hopkins School of Medicine and Vanderbilt University.
Connect!
immunarch is an R package designed to analyse T-cell receptor (TCR) and B-cell receptor (BCR) repertoires, aimed at medical scientists and bioinformaticians. The mission of immunarch is to make immune sequencing data analysis as effortless as possible and help you focus on research instead of coding.
Create a ticket with a bug or question on GitHub Issues to help the community help you and enrich it with your experience. If you need to send us sensitive data, feel free to contact us via [email protected].
To install immunarch, execute the following command:
install.packages("immunarch")
That's it, you can start using immunarch now! See the Quick Start section below to dive into immune repertoire data analysis. If you run into any trouble with installation, take a look at the Installation Troubleshooting section.
Note: the package has quite a lot of dependencies because it installs all the widely used packages for data analysis and visualisation. You get both the AIRR data analysis framework and the full Data Science package ecosystem with only one command, making immunarch the entry point for single-cell & immune repertoire Data Science.
If the above command doesn't work for any reason, try installing immunarch directly from its repository:
install.packages("devtools") # skip this if you already installed devtools
devtools::install_github("immunomind/immunarch")
Since releasing on CRAN is limited to one release every one to two months, you can install the latest pre-release version with bleeding edge features and optimisations directly from the code repository. Installing the latest pre-release version takes only two commands:
install.packages("devtools") # skip this if you already installed devtools
devtools::install_github("immunomind/immunarch", ref="dev")
You can find the list of releases of immunarch here: https://github.com/immunomind/immunarch/releases
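After installing from CRAN, from the repository, or from the `dev` branch, it can be handy to confirm which version actually ended up in your library. The snippet below is a minimal sketch using only base R and `utils` calls; it does not attach the package and does not install anything.

```r
# Check whether immunarch is available without attaching it.
is_installed <- requireNamespace("immunarch", quietly = TRUE)

if (is_installed) {
  # Report the installed version so it can be compared with the releases list.
  message("immunarch version: ", as.character(utils::packageVersion("immunarch")))
} else {
  message("immunarch is not installed; see the installation commands above.")
}
```

The reported version can then be compared against the releases list linked above.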
- Fast and easy manipulation of immune repertoire data:
  - The package automatically detects the format of your files, so there is no more guessing which format a file is in; just pass the files to the package;
  - Supports all popular TCR and BCR analysis and post-analysis formats, including single-cell data: ImmunoSEQ, IMGT, MiTCR, MiXCR, MiGEC, MigMap, VDJtools, tcR, AIRR, 10XGenomics, ArcherDX. More coming in the future;
  - Works on any data source you are comfortable with: R data frames, data tables from data.table, databases like MonetDB, Apache Spark data frames via sparklyr;
  - Tutorial is available here.
- Immune repertoire analysis made simple:
  - Most methods are incorporated in a couple of main functions with clear naming, so there is no more remembering tens and tens of functions with obscure names. For details see link;
  - Repertoire overlap analysis (common indices including overlap coefficient, Jaccard index and Morisita's overlap index). Tutorial is available here;
  - Gene usage estimation (correlation, Jensen-Shannon Divergence, clustering). Tutorial is available here;
  - Diversity evaluation (ecological diversity index, Gini index, inverse Simpson index, rarefaction analysis). Tutorial is available here;
  - Tracking of clonotypes across time points, widely used in vaccination and cancer immunology domains. Tutorial is available here;
  - Kmer distribution measures and statistics. Tutorial is available here;
  - Coming in the next releases: CDR3 amino acid physical and chemical properties assessment, mutation networks.
- Publication-ready plots with a built-in tool for visualisation manipulation.
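To give a feel for what the overlap indices in the feature list measure, here is a minimal base R sketch on two toy repertoires. The clonotype sequences, counts and helper function names are hypothetical, and the Morisita formula shown is one common formulation; immunarch's own implementation may differ in details such as normalisation.

```r
# Toy clonotype counts for two repertoires (hypothetical data, not from immdata).
rep_a <- c(CASSLGQAYEQYF = 10, CASSPGTGNTEAFF = 5, CASSIRSSYEQYF = 1)
rep_b <- c(CASSLGQAYEQYF = 4, CASSIRSSYEQYF = 6, CASSQETQYF = 2)

# Jaccard index: shared clonotypes divided by all distinct clonotypes.
jaccard_index <- function(a, b) {
  shared <- length(intersect(names(a), names(b)))
  shared / length(union(names(a), names(b)))
}

# Overlap coefficient: shared clonotypes divided by the smaller repertoire size.
overlap_coefficient <- function(a, b) {
  shared <- length(intersect(names(a), names(b)))
  shared / min(length(a), length(b))
}

# Morisita's overlap index: abundance-aware similarity based on clone counts.
morisita_index <- function(a, b) {
  clones <- union(names(a), names(b))
  x <- unname(a[clones]); x[is.na(x)] <- 0  # counts in a, 0 if clone is absent
  y <- unname(b[clones]); y[is.na(y)] <- 0  # counts in b, 0 if clone is absent
  big_x <- sum(x)
  big_y <- sum(y)
  d_x <- sum(x * (x - 1)) / (big_x * (big_x - 1))
  d_y <- sum(y * (y - 1)) / (big_y * (big_y - 1))
  2 * sum(x * y) / ((d_x + d_y) * big_x * big_y)
}

jaccard <- jaccard_index(rep_a, rep_b)
overlap <- overlap_coefficient(rep_a, rep_b)
morisita <- morisita_index(rep_a, rep_b)
print(c(jaccard = jaccard, overlap = overlap, morisita = morisita))
```

The first two indices only look at which clonotypes are present, while Morisita's index also weighs how abundant the shared clonotypes are.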
The gist of a typical TCR or BCR data analysis workflow can be reduced to the following few lines of code.
1) Load the package and the data
library(immunarch) # Load the package into R
data(immdata) # Load the test dataset
2) Calculate and visualise basic statistics
repExplore(immdata$data, "lens") %>% vis() # Visualise the length distribution of CDR3
repClonality(immdata$data, "homeo") %>% vis() # Visualise the relative abundance of clonotypes
3) Explore and compare T-cell and B-cell repertoires
repOverlap(immdata$data) %>% vis() # Build the heatmap of public clonotypes shared between repertoires
geneUsage(immdata$data[[1]]) %>% vis() # Visualise the V-gene distribution for the first repertoire
repDiversity(immdata$data) %>% vis(.by = "Status", .meta = immdata$meta) # Visualise the Chao1 diversity of repertoires, grouped by the patient status
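For readers curious about what sits behind diversity estimates such as Chao1, the following base R sketch computes three of the measures named in this README on a toy vector of clonotype counts. The counts and function names are hypothetical, and these are textbook formulas (Chao1 in its bias-corrected form), not necessarily the exact computation that `repDiversity` performs.

```r
# Hypothetical clonotype counts for a single repertoire.
counts <- c(50, 30, 10, 5, 2, 1, 1, 1)

# Inverse Simpson index: 1 / sum of squared clonotype proportions.
inverse_simpson <- function(counts) {
  p <- counts / sum(counts)
  1 / sum(p^2)
}

# Gini coefficient: 0 for perfectly even counts, close to 1 for one dominant clone.
gini_coefficient <- function(counts) {
  x <- sort(counts)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Chao1 richness estimate from the numbers of singletons and doubletons.
chao1 <- function(counts) {
  observed <- sum(counts > 0)
  f1 <- sum(counts == 1)  # clonotypes seen exactly once
  f2 <- sum(counts == 2)  # clonotypes seen exactly twice
  # Bias-corrected form, which stays defined when there are no doubletons.
  observed + f1 * (f1 - 1) / (2 * (f2 + 1))
}

print(c(
  inverse_simpson = inverse_simpson(counts),
  gini = gini_coefficient(counts),
  chao1 = chao1(counts)
))
```

Chao1 uses rare clonotypes to estimate how many clonotypes were missed by sampling, whereas the other two measures describe how evenly the observed clonotypes are distributed.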
library(immunarch) # Load the package into R
immdata <- repLoad("path/to/your/data") # Replace it with the path to your data. Immunarch automatically detects the file format.
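The examples above use two elements of the loaded object: `immdata$data` (a list of repertoires) and `immdata$meta` (sample metadata with a `Status` column). The sketch below builds a small mock object in base R to illustrate how a per-sample statistic can be joined with metadata and summarised by group, which is the idea behind `.by = "Status", .meta = immdata$meta`. The sample names, the `Sample`, `CDR3.aa` and `Clones` columns, and all values are assumptions made for illustration only.

```r
# Mock object mimicking the two elements used above: $data and $meta.
# All sample names, columns and values here are hypothetical.
mock <- list(
  data = list(
    S1 = data.frame(CDR3.aa = c("CASSLGQAYEQYF", "CASSQETQYF"), Clones = c(7, 3)),
    S2 = data.frame(CDR3.aa = c("CASSLGQAYEQYF"), Clones = c(5)),
    S3 = data.frame(CDR3.aa = c("CASSIRSSYEQYF", "CASSQETQYF"), Clones = c(2, 8))
  ),
  meta = data.frame(
    Sample = c("S1", "S2", "S3"),
    Status = c("Healthy", "Patient", "Patient")
  )
)

# Per-sample statistic: number of distinct clonotypes in each repertoire.
n_clonotypes <- vapply(mock$data, nrow, integer(1))

# Attach the statistic to the metadata, then average it within each group.
stats <- data.frame(Sample = names(n_clonotypes), Clonotypes = unname(n_clonotypes))
merged <- merge(mock$meta, stats, by = "Sample")
group_means <- tapply(merged$Clonotypes, merged$Status, mean)
print(group_means)
```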
For advanced methods such as clonotype annotation, clonotype tracking, kmer analysis and public repertoire analysis see "Tutorials".
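As a rough illustration of what kmer analysis involves, the following base R sketch counts overlapping k-mers across a few CDR3 amino acid sequences. The sequences and the `count_kmers` helper are hypothetical; the package's own kmer functions are covered in the tutorials and may behave differently.

```r
# Count all overlapping k-mers across a vector of sequences.
count_kmers <- function(sequences, k) {
  kmers <- unlist(lapply(sequences, function(s) {
    n <- nchar(s)
    if (n < k) return(character(0))  # sequence too short to contain a k-mer
    starts <- seq_len(n - k + 1)
    substring(s, starts, starts + k - 1)
  }))
  sort(table(kmers), decreasing = TRUE)
}

# Hypothetical CDR3 amino acid sequences.
cdr3 <- c("CASSLGQAYEQYF", "CASSPGTGNTEAFF", "CASSIRSSYEQYF")

kmer_counts <- count_kmers(cdr3, k = 3)
print(head(kmer_counts))
```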
The mission of immunarch is to make bulk and single-cell immune repertoire analysis painless. All bug reports, documentation improvements, enhancements and ideas are appreciated. Just let us know via GitHub (preferably) or [email protected] (in case of private data).
Bug reports must:
- Include a short, self-contained R snippet reproducing the problem.
- Add a minimal data sample for us to reproduce the problem. In case of sensitive data you can send it to [email protected] instead of GitHub issues.
- Explain why the current behavior is wrong/not desired and what you expect instead.
- If the issue is about visualisations, please attach a picture to the issue. Otherwise we will not be able to reproduce the bug and fix it.
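One possible way to prepare the snippet and the minimal data sample asked for above is sketched below with base R only. The `repertoire` data frame is a hypothetical stand-in for your own data; `dput()` turns a small object into R code that can be pasted into an issue, and `sessionInfo()` records the R and package versions.

```r
# Hypothetical repertoire table standing in for the data that triggers the problem.
repertoire <- data.frame(
  CDR3.aa = c("CASSLGQAYEQYF", "CASSPGTGNTEAFF", "CASSIRSSYEQYF",
              "CASSQETQYF", "CASSLAPGATNEKLFF"),
  Clones = c(10, 5, 1, 2, 7)
)

# Keep only a handful of rows so the sample stays small and shareable.
minimal_sample <- head(repertoire, 3)

# dput() prints R code that recreates the object, ready to paste into an issue.
sample_code <- paste(capture.output(dput(minimal_sample)), collapse = "\n")
cat(sample_code, "\n")

# Sanity check: the pasted code rebuilds the same table.
restored <- eval(parse(text = sample_code))

# R and package versions, useful context for whoever reproduces the bug.
print(sessionInfo())
```

Before posting, make sure the sample contains no sensitive data; if it does, use the email route described above instead.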
Have an aspiration to help the community build the ecosystem of scRNAseq & AIRR analysis tools? Found a bug? A typo? Would like to improve the documentation, add a method or optimise an algorithm?
We are always open to contributions. There are two ways to contribute:
- Create an issue here and describe what you would like to improve or discuss.
- Create an issue or find one here, fork the repository and make a pull request with the bugfix or improvement.
ImmunoMind Team. (2019). immunarch: An R Package for Painless Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires. Zenodo. http://doi.org/10.5281/zenodo.3367200
BibTeX:
@misc{immunomind_team_2019_3367200,
author = {{ImmunoMind Team}},
title = {{immunarch: An R Package for Painless Bioinformatics Analysis
of T-Cell and B-Cell Immune Repertoires}},
month = aug,
year = 2019,
doi = {10.5281/zenodo.3367200},
url = {https://doi.org/10.5281/zenodo.3367200}
}
For EndNote citation import the immunarch-citation.xml file.
Preprint on bioRxiv is coming soon.
The package is freely distributed under the AGPL v3 license. You can read more about it here.
For commercial or server use, please contact ImmunoMind via [email protected] about solutions for biomarker data science of single-cell immune repertoires.