SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors) is a small, fast, and accurate TCR representation model that can be used for alignment-free TCR analysis, including TCR-pMHC interaction prediction and TCR clustering (metaclonotype discovery). Our manuscript demonstrates that SCEPTR can be used for few-shot TCR specificity prediction with improved accuracy over previous methods.
SCEPTR is a BERT-like transformer-based neural network implemented in PyTorch. Because the default model achieves best-in-class performance with only 153,108 parameters (typical protein language models have tens or hundreds of millions), SCEPTR runs fast, even on a CPU! And if your computer does have a CUDA-enabled GPU, the sceptr package will automatically detect and use it, giving you blazingly fast performance without the hassle.
sceptr's API exposes three intuitive functions: `calc_vector_representations`, `calc_cdist_matrix`, and `calc_pdist_vector` - and they are all you need to make full use of the SCEPTR models.
What's even better is that they are fully compliant with pyrepseq's `tcr_metric` API, so sceptr will fit snugly into the rest of your repertoire analysis workflow.
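As a rough sketch of how the three outputs relate to one another, the snippet below uses random vectors in place of real SCEPTR embeddings (the 64-dimensional shape here is an assumption, not a guarantee of the model's output size) and SciPy's standard distance utilities, whose conventions the function names mirror:

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

# Stand-ins for SCEPTR representation vectors: one embedding per TCR.
# (The 64-dim size is illustrative only.)
rng = np.random.default_rng(0)
reps_a = rng.normal(size=(5, 64))  # 5 TCRs
reps_b = rng.normal(size=(3, 64))  # 3 TCRs

# A cdist-style result: pairwise distances between two sets of TCRs,
# shaped (len(set_a), len(set_b)).
cdist_matrix = cdist(reps_a, reps_b)
assert cdist_matrix.shape == (5, 3)

# A pdist-style result: condensed pairwise distances within one set,
# of length n * (n - 1) / 2.
pdist_vector = pdist(reps_a)
assert pdist_vector.shape == (5 * 4 // 2,)

# The condensed vector expands to a symmetric square distance matrix.
square = squareform(pdist_vector)
assert np.allclose(square, square.T)
```

The condensed pdist form is worth knowing: for within-repertoire clustering it stores each pair once, halving memory relative to the full square matrix.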
```bash
pip install sceptr
```
If you use SCEPTR in your work, please cite our manuscript:
```bibtex
@article{nagano_contrastive_2025,
    title = {Contrastive learning of {T} cell receptor representations},
    volume = {16},
    issn = {2405-4712, 2405-4720},
    url = {https://www.cell.com/cell-systems/abstract/S2405-4712(24)00369-7},
    doi = {10.1016/j.cels.2024.12.006},
    language = {English},
    number = {1},
    urldate = {2025-01-19},
    journal = {Cell Systems},
    author = {Nagano, Yuta and Pyo, Andrew G. T. and Milighetti, Martina and Henderson, James and Shawe-Taylor, John and Chain, Benny and Tiffeau-Mayer, Andreas},
    month = jan,
    year = {2025},
    pmid = {39778580},
    note = {Publisher: Elsevier},
    keywords = {contrastive learning, protein language models, representation learning, T cell receptor, T cell specificity, TCR, TCR repertoire},
}
```