Prediction of potential inhibitors for the SARS-CoV-2 Helicase (Nsp13) by virtual screening and MD simulations
This repository contains code and results for a virtual screening campaign of the European Chemical Biology Database (ECBD), followed by MD simulations, to predict potential NSP13 helicase inhibitors. The results of this study may serve as a starting point for antiviral drug development targeting NSP13.
- Nsp13Structures: provided NSP13 structures, saved as separate chains
- pocket_detection: output of the Fpocket and P2Rank binding site detection tools and a comparison of their results, final pockets (Final detexted binding pockets.pdf), scripts for visualization in PyMOL
- substances: CSV files of the full and filtered ECBD
- docking: docking results obtained with AutoDock Vina
- md-simulations: molecular dynamics results. Due to their large size, trajectories and other files are only available on request.
- src: Python scripts to run the analyses by following the instructions described below
- env.yml: the Python environment file
Structures used for the study: 7NIO and 7NN0
First, P2Rank and Fpocket were used to predict the binding sites for both NSP13 structures. The overlay_pockets.py script can then be run to calculate the overlap between the results of the two tools. cluster_pockets.py produces plots that assist manual merging of the overlaid results.
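As an illustration of how the overlap between pockets predicted by two tools could be quantified, the sketch below scores every Fpocket/P2Rank pocket pair by the Jaccard index of their residue sets. This is not taken from overlay_pockets.py: the function names, the 0.3 threshold, the dictionary layout and the residue numbers are all hypothetical.

```python
# Minimal sketch, NOT the repository's overlay_pockets.py.
# Assumption: each pocket is represented as a set of residue numbers.

def jaccard(a, b):
    """Jaccard index of two residue sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlay(fpocket, p2rank, threshold=0.3):
    """Return (fpocket_name, p2rank_name, score) for pairs at or above threshold."""
    pairs = []
    for f_name, f_res in fpocket.items():
        for p_name, p_res in p2rank.items():
            score = jaccard(f_res, p_res)
            if score >= threshold:
                pairs.append((f_name, p_name, round(score, 2)))
    # Best-overlapping pairs first
    return sorted(pairs, key=lambda pair: -pair[2])


# Made-up residue numbers, for illustration only
fpocket_pockets = {"F1": {10, 11, 12, 13}, "F2": {50, 51}}
p2rank_pockets = {"P1": {11, 12, 13, 14}, "P2": {90, 91}}

matches = overlay(fpocket_pockets, p2rank_pockets)
print(matches)
```

Pairs that survive the threshold are candidates for manual merging; the threshold itself is a free parameter that would need tuning on real output.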
ECBD was used as the starting library and was checked for toxicity and pharmacokinetic properties with the ADMETlab web tool. The substances_filter.py script then filters the ADMETlab output, keeping compounds accepted by at least 3 out of 4 rules (Lipinski Rule, Pfizer Rule, GSK Rule and Golden Triangle Rule).
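The "at least 3 out of 4 rules" criterion can be sketched as follows. This is not the code of substances_filter.py: the column names and the "Accepted"/"Rejected" labels are assumptions about the ADMETlab export and should be checked against the actual CSV, and the sample rows are invented for illustration.

```python
import csv
import io

# Assumed column names for the four rules; verify against the real ADMETlab export.
RULE_COLUMNS = ("Lipinski", "Pfizer", "GSK", "GoldenTriangle")


def passes_rules(row, min_accepted=3):
    """True if the compound is accepted by at least `min_accepted` rules."""
    accepted = sum(row[col].strip().lower() == "accepted" for col in RULE_COLUMNS)
    return accepted >= min_accepted


def filter_substances(rows, min_accepted=3):
    """Keep only the rows that pass the rule-count criterion."""
    return [row for row in rows if passes_rules(row, min_accepted)]


# Invented sample data; the rule outcomes are not real ADMETlab predictions.
SAMPLE = """smiles,Lipinski,Pfizer,GSK,GoldenTriangle
CCO,Accepted,Accepted,Accepted,Accepted
c1ccccc1,Accepted,Rejected,Accepted,Accepted
CCCCCCCCCCCC,Rejected,Rejected,Accepted,Accepted
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = filter_substances(rows)
print([row["smiles"] for row in kept])
```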
The docking.py script takes a list of ligands as SMILES strings and the .pdbqt file of the protein and runs the docking with AutoDock Vina. The script also has a refinement option, which corresponds to a docking run with higher exhaustiveness. Docking directly from SMILES strings is possible thanks to the Meeko Python package. docking_postprocess.py takes the docking output and filters and sorts the results. For easier post-processing, the pipeline defined by the Snakemake file can be used.
The MD_simulation_workflow.ipynb Jupyter notebook describes the whole protein and ligand preparation process step by step; the necessary .mdp files can be found in the MD folder. This process was based on the GROMACS tutorial by Justin A. Lemkul, Ph.D.
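To give an idea of what filtering and sorting docking output can look like, the sketch below reads the "REMARK VINA RESULT:" lines that AutoDock Vina writes into its output PDBQT files, takes the best (lowest) affinity per ligand, and ranks the ligands below a cutoff. It is not the code of docking_postprocess.py: the function names, the -7.0 kcal/mol cutoff and the ligand data are hypothetical.

```python
# Minimal sketch, NOT the repository's docking_postprocess.py.

def best_affinity(pdbqt_text):
    """Lowest affinity (kcal/mol) among the poses in a Vina output, or None."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK VINA RESULT: <affinity> <rmsd l.b.> <rmsd u.b.>
            scores.append(float(line.split()[3]))
    return min(scores) if scores else None


def rank_ligands(outputs, cutoff=-7.0):
    """Keep ligands with affinity <= cutoff, sorted from strongest to weakest."""
    scored = {name: best_affinity(text) for name, text in outputs.items()}
    hits = [(name, s) for name, s in scored.items() if s is not None and s <= cutoff]
    return sorted(hits, key=lambda item: item[1])


# Invented output fragments, for illustration only
outputs = {
    "lig_a": "MODEL 1\nREMARK VINA RESULT:    -8.2      0.000      0.000\n"
             "MODEL 2\nREMARK VINA RESULT:    -7.9      1.532      2.871\n",
    "lig_b": "MODEL 1\nREMARK VINA RESULT:    -6.1      0.000      0.000\n",
    "lig_c": "MODEL 1\nREMARK VINA RESULT:    -7.4      0.000      0.000\n",
}

ranking = rank_ligands(outputs)
print(ranking)
```

In practice the dictionary would be filled by reading the per-ligand output files; the cutoff is a free parameter.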
A list of variants was downloaded. The get_mutations.py script was then used to find the mutations in the protein sequence and visualize them in PyMOL.
Docking_results_comparison.ipynb contains the analysis of our docking results and a comparison with the results from Prague Team 2.
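One simple way to locate substitutions is a position-by-position comparison of a reference and a variant sequence, as sketched below. This is not the code of get_mutations.py: it assumes both sequences are already aligned and of equal length (no insertions or deletions), the function names are hypothetical, and the sequences are toy strings rather than real NSP13 data. The helper at the end only builds a PyMOL selection string; it does not call PyMOL.

```python
# Minimal sketch, NOT the repository's get_mutations.py.
# Assumption: sequences are pre-aligned and have the same length.

def find_substitutions(reference, variant):
    """Return substitutions as (position, ref_residue, var_residue), 1-based."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def pymol_selection(substitutions):
    """Build a PyMOL selection expression covering the mutated positions."""
    return "resi " + "+".join(str(pos) for pos, _, _ in substitutions)


# Toy sequences, for illustration only
reference = "MKTAYIAKQR"
variant = "MKSAYLAKQR"

subs = find_substitutions(reference, variant)
labels = [f"{ref}{pos}{var}" for pos, ref, var in subs]
print(labels)
print(pymol_selection(subs))
```

The resulting selection string could be passed to PyMOL's select or color commands to highlight the mutated residues on the structure.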