Releases: BioinfoMachineLearning/PoseBench
v0.6.0
Additions:
- Added new baseline methods (AlphaFold 3, Chai-1 with multiple sequence alignments (MSAs))
- Added new binding site-focused implementation of complex_alignment.py based on PyMOL's align command, which in many cases yields 3x better docking evaluation scores for baseline methods
- Added new script for analyzing baseline methods' protein conformational changes w.r.t. input (e.g., AlphaFold) protein structures and the corresponding reference (crystal) protein structures
- Added the new centroid RMSD and PLIF-EMD/WM metrics (n.b., see new arXiv preprint for more details)
- Added a failure mode analysis notebook (n.b., see new arXiv preprint for more details)
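As a rough illustration of the centroid RMSD metric mentioned above, the sketch below computes the root-mean-square distance between the centroids of predicted and reference ligands. It assumes the predicted and reference complexes are already aligned in the same coordinate frame, and the function names (centroid, centroid_rmsd) are hypothetical rather than taken from PoseBench; see the preprint for the metric's actual definition.

```python
import math


def centroid(coords):
    """Return the mean (x, y, z) position of a list of 3D points."""
    n = len(coords)
    return tuple(sum(point[i] for point in coords) / n for i in range(3))


def centroid_rmsd(pred_ligands, ref_ligands):
    """RMSD between predicted and reference ligand centroids.

    Each argument is a list of ligands; each ligand is a list of (x, y, z)
    atom coordinates. Assumes both sets share one coordinate frame
    (i.e., the complexes were aligned beforehand).
    """
    if len(pred_ligands) != len(ref_ligands) or not pred_ligands:
        raise ValueError("need the same non-zero number of ligands on both sides")
    squared_distances = []
    for pred, ref in zip(pred_ligands, ref_ligands):
        c_pred, c_ref = centroid(pred), centroid(ref)
        squared_distances.append(sum((a - b) ** 2 for a, b in zip(c_pred, c_ref)))
    return math.sqrt(sum(squared_distances) / len(squared_distances))
```

For a single ligand this reduces to the Euclidean distance between the two centroids, so it measures whether a method placed the ligand in the right pocket independently of the ligand's internal conformation.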
Changes:
- Introducing DockGen-E, a new version of the DockGen benchmark dataset featuring enhanced biomolecular context for docking and co-folding predictions - namely, now all DockGen complexes represent the first (biologically relevant) bioassembly of the corresponding PDB structure
- For the single-ligand datasets (i.e., Astex Diverse, PoseBusters Benchmark, and DockGen), now providing each baseline method with primary and cofactor ligand SMILES strings for prediction, to enhance the biomolecular context of these methods' predicted structures - as a result, for these single-ligand datasets, now the predicted ligand most similar to the primary ligand (in terms of both Tanimoto and structural similarity) is selected for scoring (which adds an additional layer of challenges for baseline methods)
- Updated Chai-1's inference code to commit 44375d5d4ea44c0b5b7204519e63f40b063e4a7c, and ran it also with standardized (paired) MSAs
- Replaced all AlphaFold 3 server predictions of each dataset's protein structures with predictions from AlphaFold 3's local inference code
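To make the primary-ligand selection step for the single-ligand datasets more concrete, here is a minimal sketch of picking the predicted ligand most similar to the primary ligand by Tanimoto similarity. It is only an approximation of the idea: fingerprints are modeled as plain Python sets of on-bit indices (a real pipeline would derive them with a cheminformatics toolkit), the structural-similarity term mentioned above is omitted, and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(fp_a & fp_b) / len(union)


def select_primary_ligand_index(primary_fp, predicted_fps):
    """Return the index of the predicted ligand most similar to the primary ligand.

    Ties are resolved in favor of the earliest ligand in the list.
    """
    if not predicted_fps:
        raise ValueError("no predicted ligands to choose from")
    scores = [tanimoto(primary_fp, fp) for fp in predicted_fps]
    return max(range(len(scores)), key=scores.__getitem__)
```

Only the selected ligand would then be scored against the reference, so a method that predicts a cofactor well but misplaces the primary ligand is not rewarded for it.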
Deprecations:
- Pocket-only benchmarking has been deprecated
Results:
- With all the above changes in place, simplified, re-ran, and re-analyzed all baseline methods for each benchmark dataset, and updated the baseline predictions and datasets (now containing standardized MSAs) hosted on Zenodo
- NOTE: The updated arXiv preprint should be publicly available by 02/12/2025
Full Changelog: v0.5.0...v0.6.0
v0.5.0
What's Changed
Additions:
- Adds results with AlphaFold 3 predicted structures (now the default), which yield a 5-10% performance improvement over ESMFold structures on average
- Adds results for the new Chai-1 model from Chai Discovery
- Adds a new inference sweep pipeline for HPC clusters to allow users to quickly run an exhaustive sweep of all baseline methods, datasets, and tasks, e.g., using generated batch scripts and a SLURM scheduler
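The general idea behind such a sweep can be sketched as follows: generate one SLURM batch script per (method, dataset) combination and submit each with sbatch. This is not PoseBench's actual pipeline. The function names and the command template are hypothetical placeholders, tasks are left out for brevity, and only the standard #SBATCH --job-name and --output directives are used.

```python
from itertools import product


def make_batch_script(method, dataset, command_template):
    """Render one SLURM batch script for a single method/dataset pair."""
    job_name = f"{method}_{dataset}"
    command = command_template.format(method=method, dataset=dataset)
    return "\n".join(
        [
            "#!/bin/bash",
            f"#SBATCH --job-name={job_name}",
            f"#SBATCH --output={job_name}_%j.out",  # %j expands to the job ID
            "",
            command,
            "",
        ]
    )


def make_sweep(methods, datasets, command_template):
    """Map script file names to script contents for every combination."""
    return {
        f"{method}_{dataset}.sh": make_batch_script(method, dataset, command_template)
        for method, dataset in product(methods, datasets)
    }


if __name__ == "__main__":
    # Hypothetical command line; substitute the project's real entry point.
    template = "python run_inference.py --method {method} --dataset {dataset}"
    for name, script in make_sweep(["chai", "diffdock"], ["astex", "dockgen"], template).items():
        print(f"--- {name} ---")
        print(script)
```

Writing each script to disk and submitting it with sbatch lets the scheduler run the combinations in parallel as resources become available.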
Updates:
- Updates all Zenodo links to point to the latest version of the project's Zenodo record, which now includes the above-mentioned AlphaFold 3 predicted structures and baseline method results using them
- Updates documentation project-wide according to the additions listed above
Fixes:
- Fixes some CI testing issues
New Contributors
- @amorehead made their first contribution in #7
Full Changelog: v0.4.0...v0.5.0
v0.4.0
Full Changelog: v.0.3.0...v0.4.0
Updates:
- Renamed src root directory to posebench to support pip packaging
- Updated dataset documentation in README.md
Additions:
- Added and documented pip installation option
- Added mmCIF to PDB file conversion script
- Added apo-to-holo predicted protein structure accuracy assessment and plotting script
- Added support to notebooks/dockgen_inference_results_plotting.ipynb for analyzing the protein-ligand interactions within the PDBBind 2020 dataset's experimental structures
v0.3.0
v0.3.0 Release Notes
Additions:
- Added a notebook for plotting expanded DockGen benchmark results
- Added support for scoring relaxed-protein predictions
Corrections:
- Fixed runtime error for relaxed-protein energy minimization
- Fixed runtime error when benchmarking compute resources for RoseTTAFold-All-Atom predictions
v0.2.0
v0.2.0 Release Notes
Additions:
- Added P2Rank as a new binding site prediction method available to use with AutoDock-Vina
- Added OpenJDK to the PoseBench Conda environment to enable P2Rank inference
- Added a script to benchmark the required compute resources for each baseline method
Updates:
- Updated citation
Corrections:
- Corrected directory navigation instructions (i.e., cd references) in README.md to reflect the directory structure of each Zenodo archive file
- Corrected Biopython, NumPy, and ProDy versions in the DiffDock Conda environment to avoid GCC compilation errors