Official repository for DeepAb: Antibody structure prediction using interpretable deep learning. The code, data, and weights for this work are made available under the Rosetta-DL license as part of the Rosetta-DL bundle.
Optional: Create and activate a python virtual environment
python3 -m venv venv
source venv/bin/activate
Install project dependencies
pip install -r requirements.txt
Note: PyRosetta should be installed following the instructions here.
Download pretrained model weights
wget https://data.graylab.jhu.edu/ensemble_abresnet_v1.tar.gz
tar -xf ensemble_abresnet_v1.tar.gz
After extracting the archive, the pretrained models might need to be moved so that their paths match trained_models/ensemble_abresnet/rs*.pt
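If the archive extracts to a different layout, the move can be scripted. The following is a minimal sketch, not part of the repository: the helper name collect_weights is hypothetical, and the directory the archive extracts into is an assumption you need to supply yourself.

```python
import shutil
from pathlib import Path


def collect_weights(extracted_dir, target_dir="trained_models/ensemble_abresnet"):
    """Move every rs*.pt file found under extracted_dir into target_dir.

    Hypothetical helper: extracted_dir is wherever tar placed the files.
    """
    source = Path(extracted_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    # rglob searches recursively, so the archive's internal layout does not matter.
    # sorted() materializes the list before any file is moved.
    for weight_file in sorted(source.rglob("rs*.pt")):
        destination = target / weight_file.name
        if weight_file.resolve() == destination.resolve():
            continue  # already in the expected location
        shutil.move(str(weight_file), str(destination))
        moved.append(destination)
    return moved
```

Calling collect_weights(".") from the repository root after extraction would gather any rs*.pt files below the current directory into trained_models/ensemble_abresnet/.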
Additional options for all scripts are available by running with --help.
Note: This project is tested with Python 3.7.9
Generate an antibody structure prediction from an Fv sequence with five decoys:
python predict.py data/sample_files/4h0h.fasta --decoys 5 --renumber
Generate a structure for a single heavy or light chain:
python predict.py data/sample_files/4h0h.fasta --decoys 5 --single_chain
Note: The FASTA file should contain a single entry labeled "H" (even if the sequence is a light chain).
Expected output
After the script completes, the final prediction will be saved as pred.deepab.pdb. The numbered decoy structures will be stored in the decoys/ directory.
Annotate an Fv structure with H3 attention:
python annotate_attention.py data/sample_files/4h0h.truncated.pdb --renumber --cdr_loop h3
Note: CDR loop residues are determined using Chothia definitions, so the input structure should either be numbered beforehand or be renumbered by passing --renumber.
Expected output
After the script completes, the annotated PDB will overwrite the input file (unless --out_file is specified). Annotations will be stored as B-factor information and can be visualized in PyMOL or similar software.
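The B-factor annotations can also be read programmatically. The sketch below relies only on the standard fixed-column PDB layout (chain ID in column 22, residue number and insertion code in columns 23-27, B-factor in columns 61-66). The function name read_residue_bfactors is hypothetical, and the sketch assumes every atom of a residue carries the same annotation value, which the script's output may or may not guarantee.

```python
def read_residue_bfactors(pdb_lines):
    """Return {(chain_id, residue_id): b_factor} from the ATOM records.

    residue_id is the residue number plus any insertion code, e.g. "100A".
    """
    scores = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]
        residue_id = line[22:27].strip()  # residue number plus insertion code
        b_factor = float(line[60:66])
        # Assumption: all atoms in a residue share one value; keep the first seen.
        scores.setdefault((chain_id, residue_id), b_factor)
    return scores
```

For example, passing an open file handle for the annotated PDB yields a dictionary that can be sorted by value to list the most strongly annotated residues.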
Calculate ΔCCE for a list of designed sequences:
python score_design.py data/sample_files/wt.fasta data/sample_files/h_mut_seqs.fasta data/sample_files/l_mut_seqs.fasta design_out.csv
Expected output
After the script completes, the designs and scores will be written to a CSV file with each row containing the design ID, heavy chain sequence, light chain sequence, and ΔCCE value.
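The resulting CSV can be ranked with the standard library. This is a minimal sketch under stated assumptions: the four columns appear in the order described above, and rows whose last field is not numeric (such as a header row, if one is written) are skipped. The function name rank_designs is hypothetical. Sorting is ascending by ΔCCE; which direction counts as favorable should be confirmed against the paper.

```python
import csv


def rank_designs(csv_lines):
    """Parse (design ID, heavy, light, ΔCCE) rows and sort them by ΔCCE, ascending."""
    rows = []
    for record in csv.reader(csv_lines):
        if len(record) != 4:
            continue  # not a design row
        design_id, heavy_seq, light_seq, raw_score = record
        try:
            score = float(raw_score)
        except ValueError:
            continue  # non-numeric score, e.g. a header row
        rows.append((design_id, heavy_seq, light_seq, score))
    return sorted(rows, key=lambda row: row[3])
```

A typical call would open design_out.csv with open("design_out.csv", newline="") and pass the file handle to rank_designs.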
[1] JA Ruffolo, J Sulam, and JJ Gray. "Antibody structure prediction using interpretable deep learning." Patterns (2021).