A package that can be used to find out how well the epitopes in a patient's virus will be recognized by the HLAs present in that patient.
There are two ways to install EpitopeMatcher:
- Use podman, a daemonless container system compatible with Docker. (This is the easy way.)
- Install it into your OS like 'normal' software.
Install podman on your computer: https://podman.io/getting-started/installation
Optionally, follow the rootless mode instructions if you are a root user and want regular users of your system to be able to run EpitopeMatcher environments securely on their own.
Note: Podman is completely optional if you already have Docker installed. Podman will, however, take precedence if both are installed.
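To make the precedence rule concrete, here is a minimal sketch of how a wrapper script might pick the container runtime, preferring podman over docker when both are installed. This is assumed logic for illustration only, not the actual etm script:

```shell
# Pick a container runtime, preferring podman over docker (illustrative sketch).
if command -v podman >/dev/null 2>&1; then
  RUNTIME=podman
elif command -v docker >/dev/null 2>&1; then
  RUNTIME=docker
else
  RUNTIME=none
fi
echo "container runtime: $RUNTIME"
```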
Clone the EpitopeMatcher repo:
git clone https://github.com/philliplab/EpitopeMatcher
Use the EpitopeMatcher (etm) script to build the container and serve the Shiny app:
cd EpitopeMatcher
./etm -h
./etm build
./etm serve
Big thanks to Dean Kayton (https://github.com/dnk8n) for contributing the container file and the etm script.
Make sure you have a recent version of R. Follow the instructions in the following link to set up the correct repository for apt: http://stackoverflow.com/questions/10476713/how-to-upgrade-r-in-ubuntu.
Make sure that both r-base and r-base-dev are installed:
sudo apt-get install r-base r-base-dev
Next, install devtools' dependencies with apt-get:
sudo apt-get install libssl-dev libxml2-dev libcurl4-gnutls-dev
Then, from within R, install devtools and the BioConductor dependencies:
install.packages('devtools', repo = 'http://cran.rstudio.com/')
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("Biostrings")
Finally, install the latest version of shiny and then EpitopeMatcher:
library(devtools)
install_github('rstudio/shiny')
install_github('philliplab/EpitopeMatcher')
To run the web UI:
library(EpitopeMatcher)
run_EpitopeMatcher_app()
To get some test data:
library(EpitopeMatcher)
get_set_of_test_data()
or download it from Test Data
The test data consists of 3 sample files:
- test_patient_hla_genotypes.csv as produced by get_test_patient_hla_data() which contains the details of the patient's HLA genotype.
- test_lanl_hla_data.csv as produced by get_test_lanl_hla_data() which contains the details of the HLA genotypes (location in the genome, epitope, etc.).
- test_query_alignment.fasta as produced by get_test_query_alignment() which contains an alignment of sequences of the patient's quasispecies to HXB2.
To use EpitopeMatcher in an R session, see the help file of these functions:
- read_lanl_hla
- read_patient_hla
- read_query_alignment
- match_epitopes
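Based on the function names above and the input order documented below (query_alignment, patient_hla, lanl_hla), a typical R session might look like this sketch. The file names are the test data files described earlier; the exact argument names and return values are assumptions, so check the help files first:

```r
library(EpitopeMatcher)

# Parse the three inputs (paths point at the test data files;
# signatures are assumed from the help files, not verified here).
lanl_hla    <- read_lanl_hla("test_lanl_hla_data.csv")
patient_hla <- read_patient_hla("test_patient_hla_genotypes.csv")
query_aln   <- read_query_alignment("test_query_alignment.fasta")

# Match the epitopes; inputs are used in the documented order:
# query_alignment, patient_hla, lanl_hla.
results <- match_epitopes(query_aln, patient_hla, lanl_hla)
```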
Design notes: the pseudocode below sketches how the main functions call each other.
match_epitopes()
  list_scores_to_compute()
  score_all_epitopes()
  output_results()
list_scores_to_compute()
  matched_patients = match_patient_hla_to_query_alignment()
  flat_lanl_hla = flatten_lanl_hla()
  build_scoring_jobs(matched_patients, matched_hlas)

build_scoring_jobs(matched_patients, lanl_hla_data)
  jobs = NULL
  for (mp in matched_patients)
    hla_details = get_hla_details(mp$..., lanl_hla_data)
    jobs = c(jobs,
             .Scoring_Job(hla_genotype,
                          patients,
                          hla_details))
score_all_epitopes()
  for (job in …)
    score_epitope()
score_epitope()
  find_epitope_in_ref()
  if not found()
    log_epitope_not_found()
  if found()
    get_query_sequences()
    align_ref_epitope_to_query_seqs()
    log_epitope_found()
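The score_epitope() flow above can be sketched in runnable R. All helper logic here is a hypothetical placeholder (a simple exact-match search standing in for find_epitope_in_ref() and the alignment step); only the control flow mirrors the design notes:

```r
# Sketch of the score_epitope() control flow; not the package implementation.
score_epitope <- function(epitope, ref_seq, query_seqs) {
  # find_epitope_in_ref(): exact-match stand-in for the real search
  pos <- regexpr(epitope, ref_seq, fixed = TRUE)[1]
  if (pos == -1) {
    # log_epitope_not_found() would record this in an error log
    return(list(found = FALSE, msg = "epitope not found in reference"))
  }
  # get_query_sequences() / align_ref_epitope_to_query_seqs() would go here;
  # as a placeholder, cut the matching window out of each query sequence.
  window <- substr(query_seqs, pos, pos + nchar(epitope) - 1)
  list(found = TRUE, epitope = epitope, query_windows = window)
}

res <- score_epitope("CDE", "ABCDEFG", c("XBCDEFX", "YBCDEFY"))
```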
- The input data is named and used in this order:
- query_alignment
- patient_hla
- lanl_hla
- The way to refer to a query sequence is by its full FASTA header, not the patient_id extracted from it nor its position (index) in the alignment.
- Error logging. Probably not the best design, but it should be good enough: each function that logs errors returns a list with elements 'msg', 'result', and 'error_logs', where 'error_logs' is itself a list, each of whose elements is a data.frame that logs one specific type of error. This design lets users inspect the error logs quite comfortably in Excel. A better design might be to produce traditional logs with a standard logging library and then post-process them into easy-to-analyze formats, but in the short term that is more work.
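The error-logging convention above can be sketched in plain R. The function and column names here are illustrative, not the package's actual ones; note that the query sequence is identified by its full FASTA header, as required by the design notes:

```r
# Append one "epitope not found" error to the error_logs structure.
# error_logs is a list of data.frames, one per error type.
log_epitope_not_found <- function(error_logs, query_header, epitope) {
  new_row <- data.frame(
    query_sequence = query_header,  # full FASTA header, not patient_id or index
    epitope        = epitope,
    stringsAsFactors = FALSE
  )
  error_logs$epitope_not_found <- rbind(error_logs$epitope_not_found, new_row)
  error_logs
}

logs <- list(epitope_not_found = NULL)
logs <- log_epitope_not_found(logs, "HXB2|K03455|patient_7", "SLYNTVATL")
# each data.frame in logs can be written with write.csv() and opened in Excel
```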