How to properly select metrics & cutoff parameter #586

MaybeTommorow · 2025-01-23T07:21:53Z

Hi,

Thanks for developing scirpy, it's a powerful tool for analyzing TCR data.
I have a question about choosing the metric parameter when using scirpy.pp.ir_dist, scirpy.tl.ir_query, and scirpy.tl.ir_query_annotate.

Assume we are analyzing T cells and pooling TCR clones based on their amino acid sequences (i.e., sequence = 'aa').
I noticed that your tutorial uses tcrdist to compute the CDR3 neighborhood graph, with a cutoff of 15, which allows 3 Rs mutating into N. However, when matching cells against VDJdb, identity is used.
Can I interpret this as a recommendation to use identity when matching data against public databases, and that tcrdist is preferable for computing similarity among a large pool of cells from the same condition?

Additionally, the default cutoff is 10 when using alignment, but in your tutorial, 15 is set when using tcrdist. Is this also a recommended value?

Thank you for your guidance.


grst · 2025-01-28T18:49:55Z

Hi,

unfortunately, there is no straightforward answer. I am not aware of any benchmark showing that, at a given cutoff, receptors still have an X% chance of binding to the same epitope. So in the end it's just a gradient from "very stringent" (identity) to less stringent (alignment/tcrdist with increasing cutoffs).

This applies to both the definition of clonotype clusters and database search.
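
To make the gradient concrete, here is a minimal toy sketch that does not use scirpy. It links sequences whose distance is at or below a cutoff and counts the resulting clusters. Plain edit distance is used as a simple stand-in and is not scirpy's alignment or tcrdist metric; the helper names and the CDR3-like strings are made up for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def count_clusters(seqs, cutoff):
    """Count connected components when sequences with distance <= cutoff are linked."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent pointers to the cluster root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if levenshtein(seqs[i], seqs[j]) <= cutoff:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(seqs))})


# Hypothetical CDR3-like strings, for illustration only (not real data).
cdr3 = [
    "CASSLGQAYEQYF",
    "CASSLGQAYEQFF",
    "CASSLGRAYEQFF",
    "CASSPDRGNTIYF",
    "CASSPDRGNTLYF",
]

# Cutoff 0 corresponds to identity; larger cutoffs merge more sequences.
clusters_by_cutoff = {cutoff: count_clusters(cdr3, cutoff) for cutoff in (0, 1, 2, 8)}
print(clusters_by_cutoff)
```

With these toy strings, cutoff 0 keeps all five sequences separate, cutoffs 1 and 2 give two clusters, and cutoff 8 merges everything into one. Where to stop along that gradient is a judgment call, since the sketch says nothing about epitope specificity.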
