Simple Contrastive Embedding of the Primary sequence of T cell Receptors

yutanagano/sceptr

Check out the documentation page.


SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors) is a small, fast, and accurate TCR representation model that can be used for alignment-free TCR analysis, including for TCR-pMHC interaction prediction and TCR clustering (metaclonotype discovery). Our manuscript demonstrates that SCEPTR can be used for few-shot TCR specificity prediction with improved accuracy over previous methods.

SCEPTR is a BERT-like transformer-based neural network implemented in PyTorch. The default model provides best-in-class performance with only 153,108 parameters (typical protein language models have tens or hundreds of millions), so SCEPTR runs fast, even on a CPU! And if your computer does have a CUDA-enabled GPU, the sceptr package will automatically detect and use it, giving you blazingly fast performance without the hassle.

sceptr's API exposes three intuitive functions: calc_vector_representations, calc_cdist_matrix, and calc_pdist_vector. These are all you need to make full use of the SCEPTR models. What's even better is that they are fully compliant with pyrepseq's tcr_metric API, so sceptr will fit snugly into the rest of your repertoire analysis workflow.
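
As a rough sketch of how the three functions might be called together: the example below assumes that each function accepts a pandas DataFrame with one TCR per row and the columns TRAV, CDR3A, TRBV, and CDR3B, and that calc_pdist_vector returns a condensed pairwise distance vector in the style of scipy's pdist. Both points are assumptions here, so please check the documentation page for the exact input format. The gene names and CDR3 sequences are illustrative placeholders, and EXAMPLE_TCRS, expected_output_sizes, and run_sceptr are hypothetical helper names.

```python
# Illustrative input only: four made-up TCRs, one per list position.
# The column names are an assumption about the expected input format.
EXAMPLE_TCRS = {
    "TRAV": ["TRAV38-1*01", "TRAV3*01", "TRAV13-2*01", "TRAV38-2/DV8*01"],
    "CDR3A": ["CAHRSAGGGTSYGKLTF", "CAVDNARLMF", "CAERIRKGQVLTGGGNKLTF", "CAYRSAGGGTSYGKLTF"],
    "TRBV": ["TRBV2*01", "TRBV25-1*01", "TRBV9*01", "TRBV2*01"],
    "CDR3B": ["CASSEFQGDNEQFF", "CASSDGSFNEQFF", "CASSVGDLLTGELFF", "CASSPGTGGNEQYF"],
}


def expected_output_sizes(n_anchors, n_comparisons, n_total):
    """Return the sizes we would expect from the distance functions.

    Assumes a cross-distance matrix of shape (n_anchors, n_comparisons)
    and a condensed pairwise vector with n_total * (n_total - 1) / 2 entries.
    """
    cdist_shape = (n_anchors, n_comparisons)
    pdist_length = n_total * (n_total - 1) // 2
    return cdist_shape, pdist_length


def run_sceptr(tcr_columns):
    """Call the three sceptr functions on a small table of TCRs."""
    # Imported lazily so the rest of this sketch works without sceptr installed.
    import pandas as pd
    import sceptr

    tcrs = pd.DataFrame(tcr_columns)

    # One embedding vector per TCR.
    embeddings = sceptr.calc_vector_representations(tcrs)

    # Distances between the first two TCRs (anchors) and the remaining ones.
    cdist = sceptr.calc_cdist_matrix(tcrs.iloc[:2], tcrs.iloc[2:])

    # Distances between every pair of TCRs in the table.
    pdist = sceptr.calc_pdist_vector(tcrs)

    return embeddings, cdist, pdist


if __name__ == "__main__":
    embeddings, cdist, pdist = run_sceptr(EXAMPLE_TCRS)
    print(embeddings.shape, cdist.shape, pdist.shape)
```

If the assumptions above hold, the printed shapes for cdist and pdist should match what expected_output_sizes(2, 2, 4) returns.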

Installation

pip install sceptr

Citing SCEPTR

Please cite our manuscript.

BibTeX

@article{nagano_contrastive_2025,
	title = {Contrastive learning of {T} cell receptor representations},
	volume = {16},
	issn = {2405-4712, 2405-4720},
	url = {https://www.cell.com/cell-systems/abstract/S2405-4712(24)00369-7},
	doi = {10.1016/j.cels.2024.12.006},
	language = {English},
	number = {1},
	urldate = {2025-01-19},
	journal = {Cell Systems},
	author = {Nagano, Yuta and Pyo, Andrew G. T. and Milighetti, Martina and Henderson, James and Shawe-Taylor, John and Chain, Benny and Tiffeau-Mayer, Andreas},
	month = jan,
	year = {2025},
	pmid = {39778580},
	note = {Publisher: Elsevier},
	keywords = {contrastive learning, protein language models, representation learning, T cell receptor, T cell specificity, TCR, TCR repertoire},
}