Merge pull request #229 from forlilab/rtd-wip-lig

Update docs
forlilab · Nov 11, 2024 · 09e768a · 09e768a
2 parents ba31e10 + 635f338
commit 09e768a
Show file tree

Hide file tree

Showing 25 changed files with 917 additions and 524 deletions.
diff --git a/README.md b/README.md
@@ -1,286 +1,67 @@
-# Meeko: preparation of small molecules for AutoDock
+# Meeko: interface for AutoDock
 
 [![API stability](https://img.shields.io/badge/stable%20API-no-orange)](https://shields.io/)
 [![PyPI version fury.io](https://img.shields.io/badge/version-0.6.0-green.svg)](https://pypi.python.org/pypi/meeko/)
 [![Documentation Status](https://readthedocs.org/projects/meeko/badge/?version=readthedocs)](https://meeko.readthedocs.io/en/readthedocs/?badge=readthedocs)
 
-Meeko reads an RDKit molecule object and writes a PDBQT string (or file)
-for [AutoDock-Vina](https://github.com/ccsb-scripps/AutoDock-Vina)
-and [AutoDock-GPU](https://github.com/ccsb-scripps/AutoDock-GPU).
-It converts the docking output to RDKit molecules and
-SD files, without loss of bond orders.
+Meeko prepares the input for AutoDock and processes its output.
+It is developed alongside AutoDock-GPU and AutoDock-Vina.
+Meeko parameterizes both small organic molecules (ligands) and proteins
+and nucleic acids (receptors).
 
 Meeko is developed by the [Forli lab](https://forlilab.org/) at the
 [Center for Computational Structural Biology (CCSB)](https://ccsb.scripps.edu)
 at [Scripps Research](https://www.scripps.edu/).
 
 
-## Usage notes
+## Documentation
 
-Meeko does not calculate 3D coordinates or assign protonation states.
-Input molecules must have explicit hydrogens.
+The docs are hosted on [meeko.readthedocs.io](meeko.readthedocs.io)
 
-Sampling of macrocycle conformers ([paper](https://doi.org/10.1017/qrd.2022.18))
-is enabled by default.
 
-SDF format strongly preferred over Mol2.
-See
-[this discussion](https://github.com/rdkit/rdkit/discussions/3647), and
-[this one](https://sourceforge.net/p/rdkit/mailman/message/37668451/),
-[also this](rdkit/rdkit#4061),
-[and this](https://sourceforge.net/p/rdkit/mailman/message/37374678/).
-and RDKit issues
-[1755](https://github.com/rdkit/rdkit/issues/1755) and
-[917](https://github.com/rdkit/rdkit/issues/917). So, what could go wrong?
-For example, reading Mol2 files from ZINC
-[led to incorrect net charge of some molecules.](https://github.com/forlilab/Meeko/issues/63)
+## Reporting bugs
 
+Please check if a similar bug has been reported and, if not, [open an issue](https://github.com/forlilab/Meeko/issues).
 
-## v0.6.0-alpha.3
 
-This release aims to distribute an enhanced `mk_prepare_receptor.py`.
-Some features are still being developed, hence the `-alpha` suffix in the version.
-Reactive docking is not working in v0.6.0-alpha, but should be restored soon.
-Documentation is also work in progress.
+## Installation
 
+Visit the docs for a more complete description. One option is conda or mamba:
 
-## API changes in v0.5
-
-Class `MoleculePreparation` no longer has method `write_pdbqt_string()`.
-Instead, `MoleculePreparation.prepare()` returns a list of `MoleculeSetup` objects
-that must be passed, individually, to `PDBQTWriterLegacy.write_string()`.
-```python
-from meeko import MoleculePreparation
-from meeko import PDBQTWriterLegacy
-
-preparator = MoleculePreparation()
-mol_setups = preparator.prepare(rdkit_molecule_3D_with_Hs)
-for setup in mol_setups:
-    pdbqt_string, is_ok, error_msg = PDBQTWriterLegacy.write_string(setup)
-    if is_ok:
-        print(pdbqt_string, end="")
-```
-
-Argument `keep_nonpolar_hydrogens` is replaced by `merge_these_atom_types`, both in the Python
-interface and for script `mk_prepare_ligand.py`.
-The default is `merge_these_atom_types=("H",)`, which
-merges hydrogens typed `"H"`, keeping the current default behavior.
-To keep all hydrogens, set `merge_these_atom_types` to an empty
-list when initializing `MoleculePreparation`, or pass no atom types
-to `--merge_these_atom_types` from the command line:
-```sh
-mk_prepare_ligand.py -i molecule.sdf --merge_these_atom_types
-``` 
-
-## Dependencies
-
-* Python (>=3.5)
-* Numpy
-* Scipy
-* RDKit (>=2023.09)
-* ProDy (optionally, for covalent docking)
-
-Conda or Miniconda can install the dependencies:
 ```bash
-conda install -c conda-forge numpy scipy rdkit
-pip install prody # optional. pip recommended at http://prody.csb.pitt.edu/downloads/
-```
-
-
-## Python tutorial
-
-
-#### 2. RDKit molecule from docking results
-
-```python
-from meeko import PDBQTMolecule
-from meeko import RDKitMolCreate
-
-fn = "autodock-gpu_results.dlg"
-pdbqt_mol = PDBQTMolecule.from_file(fn, is_dlg=True, skip_typing=True)
-rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
-```
-The length of `rdkitmol_list` is one if there are no sidechains and only one
-ligand was docked.
-If multiple ligands and/or sidechains are docked simultaneously, each will be
-an individual RDKit molecule in `rdkitmol_list`.
-Sidechains are truncated at the C-alpha.
-Note that docking multiple
-ligands simultaneously is only available in Vina, and it differs from docking
-multiple ligands one after the other. Each failed creation of an RDKit molecule
-for a ligand or sidechain results in a `None` in `rdkitmol_list`.
-
-For Vina's output PDBQT files, omit `is_dlg=True`.
-```python
-pdbqt_mol = PDBQTMolecule.from_file("vina_results.pdbqt", skip_typing=True)
-rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
-```
-
-When using Vina from Python, the output string can be passed directly.
-See [the docs](https://autodock-vina.readthedocs.io/en/latest/docking_python.html)
-for context on the `v` object.
-```python
-vina_output_string = v.poses()
-pdbqt_mol = PDBQTMolecule(vina_output_string, is_dlg=True, skip_typing=True)
-rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
+micromamba install meeko
 ```
 
-#### 3. Initializing MoleculePreparation from a dictionary (or JSON)
-
-This is useful for saving and loading configuration files with json.
-```python
-import json
-from meeko import MoleculePreparation
-
-mk_config = {"hydrate": True} # any arguments of MoleculePreparation.__init__
-print(json.dumps(mk_config), file=open('mk_config.json', 'w'))
-with open('mk_config.json') as f:
-    mk_config = json.load(f)
-preparator = MoleculePreparation.from_config(mk_config)
-```
-The command line tool `mk_prepare_ligand.py` can read the json files using
-option `-c` or `--config`.
-
-
-## Possibly useful configurations of MoleculePreparation
-
-Here we create an instance of MoleculePreparation that attaches pseudo
-waters to the ligand ([see paper on hydrated docking](https://pubs.acs.org/doi/abs/10.1021/jm2005145)),
-keeps macrocycles rigid (generally a bad idea),
-and keeps conjugated bonds and amide bonds rigid. 
-By default, most amides are kept rigid, except for tertiary amides with
-different substituents on the nitrogen.
+or from PyPI:
 
-```python
-preparator = MoleculePreparation(
-    hydrate=True,
-    rigid_macrocycles=True,
-    rigidify_bonds_smarts = ["C=CC=C", "[CX3](=O)[NX3]"],
-    rigidify_bonds_indices = [(1, 2), (0, 2)],
-)
-```
-
-The same can be done with the command line script. Note that indices of the
-atoms in the SMARTS are 0-based for the Python API but
-1-based for the command line script:
-```console
-mk_prepare_ligand.py\
-    --hydrate\
-    --rigid_macrocycles\
-    --rigidify_bonds_smarts "C=CC=C"\
-    --rigidify_bonds_indices 2 3\
-    --rigidify_bonds_smarts "[CX3](=O)[NX3]"\
-    --rigidify_bonds_indices 1 3
+```bash
+pip install meeko
 ```
 
-## Docking covalent ligands as flexible sidechains
+## Usage
 
-The input ligand must be the product of the reaction and contain the
-atoms of the flexible sidechain up to (and including) the C-alpha.
-For example, if the ligand is an acrylamide (smiles: `C=CC(=O)N`) reacting
-with a cysteine (sidechain smiles: `CCS`), then the input ligand for
-Meeko must correspond to smiles `CCSCCC(=O)N`.
+Meeko exposes a Python API to enable scripting. Here we share very minimal examples
+using the command line scripts just to give context.
+Please visit the [meeko.readthedocs.io](meeko.readthedocs.io) for more information.
 
-Meeko will align the ligand atoms that match the C-alpha and C-beta of
-the protein sidechain. Options `--tether_smarts` and `--tether_smarts_indices`
-define these atoms. For a cysteine, `--tether_smarts "SCC"` and
-`--tether_smarts_indices 3 2` would work, although it is safer to define
-a more spefic SMARTS to avoid matching the ligand more than once. The first
-index (3 in this example) defines the C-alpha, and the second index defines
-the C-beta. 
-
-For the example in this repository, which is based on PDB entry 7aeh,
-the following options prepare the covalently bound ligand for tethered docking:
-```console
-cd example/covalent_docking
-
-mk_prepare_ligand.py\
-    -i ligand_including_cys_sidechain.sdf\
-    --receptor protein.pdb\
-    --rec_residue ":CYS:6"\
-    --tether_smarts "NC(=O)C(O)(C)SCC"\
-    --tether_smarts_indices 9 8\
-    -o prepared.pdbqt
+Parameterizing a ligand and writing a PDBQT file:
+```bash
+mk_prepare_ligand.py -i molecule.sdf -o molecule.pdbqt
 ```
 
-Prody is required for preparing ligands as flexible sidechains.
-Often, `pip install prody` suffices to install Prody in the currently
-active virtual environment (e.g., conda). For more detailed installation
-instructions visit http://prody.csb.pitt.edu/downloads/.
+Parameterizing a receptor with a flexible sidechain and writing a PDBQT file
+as well as a JSON file that stores the entire receptor datastructure. In this
+example, the `-o` option sets the output base name, `-j` triggers writing the
+.json file, `-p` triggers writting the .pdbqt file, and `-f` makes residue
+42 in chain A flexible.
 
-## Reactive Docking
-
-### 1. Prepare protein with waterkit
-Follow `wk_prepare_receptor.py` instructions and run with `--pdb`.
-The goal of this step is to perform essential fixes to the protein
-(such as missing atoms), to add hydrogens, and to follow the Amber
-naming scheme for atoms and residues, e.g., `HIE` or `HID`
-instead of `HIS`.
-
-### 2. Prepare protein pdbqt
-Here, `wk.pdb` was written by waterkit. The example below will center a gridbox of specified size on the given reactive residue.
-
-```console
-   $ mk_prepare_receptor.py\
-    --pdb wk.pdb\
-    -o receptor.pdbqt\
-    --flexres " :ARG:348"\
-    --reactive_flexres " :SER:308"
-    --reactive_flexres " :SER:308"\
-    --box_center_on_reactive_res\
-    --box_size 40 40 40  # x y z (angstroms)
-```
-A manual box center can be specified with `--box_center`.
-Reactive parameters can also be modified:
-```sh
-  --r_eq_12 R_EQ_12     r_eq for reactive atoms (1-2 interaction)
-  --eps_12 EPS_12       epsilon for reactive atoms (1-2 interaction)
-  --r_eq_13_scaling R_EQ_13_SCALING
-                        r_eq scaling for 1-3 interaction across reactive atoms
-  --r_eq_14_scaling R_EQ_14_SCALING
-                        r_eq scaling for 1-4 interaction across reactive atoms
+```bash
+mk_prepare_receptor.py -i nucleic_acid.cif -o my_receptor -j -p -f A:42
 ```
 
-Receptor preparation can't handle heteroatoms for the moment.
-Also nucleic acids, ions, and post-translational modifications (e.g.
-phosphorilation) are not supported. Only the 20 standard amino acids
-can be parsed, and it is required to have Amber atom names and
-hydrogens. No atoms can be missing.
-
-### 3. Run autogrid
+Finally, converting docking results to SDF for the ligand, and PDB for the
+receptor with updated sidechain positions:
 
-Make affinity maps for the `_rigid.pdbqt` part of the receptor.
-Make affinity maps for the `_rigid.pdbqt` part of the receptor. `mk_prepare_receptor.py` will prepare the GPF for you.
-
-### 4. Write ligand PDBQT
-mk_prepare_ligand.py -i sufex1.sdf --reactive_smarts "S(=O)(=O)F" --reactive_smarts_idx 1 -o sufex1.pdbqt\
-
-### 5. Configure AD-GPU for reactive docking
-
-For reactive docking there are two options that need to be passed to AutoDock-GPU:
-For reactive docking there are an additional option that needs to be passed to AutoDock-GPU:
-    ```console
-    --import_dpf
-    ```
-
-The `--derivtype` option, if needed, was written by `mk_prepare_receptor.py` to a file suffixed with `.derivtype`.
-
-The filename to be passed to `--import_dpf` was written by `mk_prepare_receptor.py`
-and it is suffixed with `reactive_config`.
-```sh
-ADGPU -I *.reactive_config -L sufex1.pdbqt -N sufex1_docked_ -F *_flex.pdbqt -C 1
+```bash
+mk_export.py vina_results.pdbqt -j my_receptor.json -s lig_docked.sdf -p rec_docked.pdb
 ```
-
-## Polymer docs temporary placeholder
-
-Current init is a builder method designed to parse PDB files and topology
-objects. It is not designed to monomerize a polymer that already exists
-as a single RDKit molecule, or to build a single polymer molecule from the
-individual residues (monomers). To do that, one would have to react residues
-with each other. Padding could be adapted.
-
-
-How bonds between residues are handled and blunt residues, and deleting
-bonds "interactively".
-
-Document residue\_chem\_params.
diff --git a/docs/source/cli_export_result.rst b/docs/source/cli_export_result.rst
diff --git a/docs/source/cli_lig_prep.rst b/docs/source/cli_lig_prep.rst