diff --git a/.gitignore b/.gitignore
index 44e91bf2..75415f85 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,4 @@
+# MacOS system files
*.DS_Store
# Byte-compiled / optimized / DLL files
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
new file mode 100644
index 00000000..e952a46c
--- /dev/null
+++ b/.readthedocs.yaml
@@ -0,0 +1,35 @@
+# Read the Docs configuration file for Sphinx projects
+# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
+
+# Required
+version: 2
+
+# Set the OS, Python version and other tools you might need
+build:
+ os: ubuntu-22.04
+ tools:
+ python: "3.12"
+ # You can also specify other tool versions:
+ # nodejs: "20"
+ # rust: "1.70"
+ # golang: "1.20"
+
+# Build documentation in the "docs/" directory with Sphinx
+sphinx:
+ configuration: docs/source/conf.py
+ # You can configure Sphinx to use a different builder, for instance use the dirhtml builder for simpler URLs
+ # builder: "dirhtml"
+ # Fail on all warnings to avoid broken references
+ # fail_on_warning: true
+
+# Optionally build your docs in additional formats such as PDF and ePub
+# formats:
+# - pdf
+# - epub
+
+# Optional but recommended, declare the Python requirements required
+# to build your documentation
+# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
+python:
+ install:
+ - requirements: docs/requirements.txt
diff --git a/README.md b/README.md
index 818872f9..871b5b0b 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,7 @@
[![API stability](https://img.shields.io/badge/stable%20API-no-orange)](https://shields.io/)
[![PyPI version fury.io](https://img.shields.io/badge/version-0.6.0-green.svg)](https://pypi.python.org/pypi/meeko/)
+[![Documentation Status](https://readthedocs.org/projects/meeko/badge/?version=readthedocs)](https://meeko.readthedocs.io/en/readthedocs/?badge=readthedocs)
Meeko reads an RDKit molecule object and writes a PDBQT string (or file)
for [AutoDock-Vina](https://github.com/ccsb-scripps/AutoDock-Vina)
@@ -85,85 +86,9 @@ conda install -c conda-forge numpy scipy rdkit
pip install prody # optional. pip recommended at http://prody.csb.pitt.edu/downloads/
```
-## Installation (from PyPI)
-```bash
-$ pip install meeko
-```
-If using conda, `pip` installs the package in the active environment.
-
-## Installation (from source)
-You'll get the develop branch, which may be ahead of the latest release.
-```bash
-$ git clone https://github.com/forlilab/Meeko
-$ cd Meeko
-$ pip install .
-```
-
-Optionally include `--editable`. Changes in the original package location
-take effect immediately without the need to run `pip install .` again.
-```bash
-$ pip install --editable .
-```
-
-
-## Examples using the command line scripts
-
-#### 1. make PDBQT files
-AutoDock-GPU and Vina read molecules in the PDBQT format. These can be prepared
-by Meeko from SD files, or from Mol2 files, but SDF is strongly preferred.
-```console
-mk_prepare_ligand.py -i molecule.sdf -o molecule.pdbqt
-mk_prepare_ligand.py -i multi_mol.sdf --multimol_outdir folder_for_pdbqt_files
-```
-
-#### 2. convert docking results to SDF
-AutoDock-GPU and Vina write docking results in the PDBQT format. The DLG output
-from AutoDock-GPU contains docked poses in PDBQT blocks.
-Meeko generates RDKit molecules from PDBQT files (or strings) using the SMILES
-string in the REMARK lines. The REMARK lines also have the mapping of atom indices
-between SMILES and PDBQT. SD files with docked coordinates are written
-from RDKit molecules.
-
-```console
-mk_export.py molecule.pdbqt -o molecule.sdf
-mk_export.py vina_results.pdbqt -o vina_results.sdf
-mk_export.py autodock-gpu_results.dlg -o autodock-gpu_results.sdf
-```
-
-Making RDKit molecules from SMILES is safer than guessing bond orders
-from the coordinates, specially because the PDBQT lacks hydrogens bonded
-to carbon. As an example, consider the following conversion, in which
-OpenBabel adds an extra double bond, not because it has a bad algorithm,
-but because this is a nearly impossible task.
-```console
-$ obabel -:"C1C=CCO1" -o pdbqt --gen3d | obabel -i pdbqt -o smi
-[C]1=[C][C]=[C]O1
-```
## Python tutorial
-#### 1. making PDBQT strings for Vina or for AutoDock-GPU
-
-```python
-from meeko import MoleculePreparation
-from meeko import PDBQTWriterLegacy
-from rdkit import Chem
-
-input_molecule_file = "example/BACE_macrocycle/BACE_4.sdf"
-
-# there is one molecule in this SD file, this loop iterates just once
-for mol in Chem.SDMolSupplier(input_molecule_file, removeHs=False):
- preparator = MoleculePreparation()
- mol_setups = preparator.prepare(mol)
- for setup in mol_setups:
- setup.show() # optional
- pdbqt_string = PDBQTWriterLegacy.write_string(setup)
-```
-At this point, `pdbqt_string` can be written to a file for
-docking with AutoDock-GPU or Vina, or passed directly to Vina within Python
-using `set_ligand_from_string(pdbqt_string)`. For context, see
-[the docs on using Vina from Python](https://autodock-vina.readthedocs.io/en/latest/docking_python.html).
-
#### 2. RDKit molecule from docking results
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 00000000..d0c3cbf1
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS ?=
+SPHINXBUILD ?= sphinx-build
+SOURCEDIR = source
+BUILDDIR = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+ @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+ @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/make.bat b/docs/make.bat
new file mode 100644
index 00000000..16d190d4
--- /dev/null
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+ set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=source
+set BUILDDIR=build
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+ echo.
+ echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+ echo.installed, then set the SPHINXBUILD environment variable to point
+ echo.to the full path of the 'sphinx-build' executable. Alternatively you
+ echo.may add the Sphinx directory to PATH.
+ echo.
+ echo.If you don't have Sphinx installed, grab it from
+ echo.https://www.sphinx-doc.org/
+ exit /b 1
+)
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+:end
+popd
diff --git a/docs/requirements.txt b/docs/requirements.txt
new file mode 100644
index 00000000..31c781b4
--- /dev/null
+++ b/docs/requirements.txt
@@ -0,0 +1 @@
+sphinx-book-theme
diff --git a/docs/source/cli_export_result.rst b/docs/source/cli_export_result.rst
new file mode 100644
index 00000000..641fbd21
--- /dev/null
+++ b/docs/source/cli_export_result.rst
@@ -0,0 +1,43 @@
+mk_export.py
+============
+
+Convert docking results to SDF
+------------------------------
+
+AutoDock-GPU and Vina write docking results in the PDBQT format. The DLG output
+from AutoDock-GPU contains docked poses in PDBQT blocks, plus additional information.
+Meeko generates RDKit molecules from PDBQT using the SMILES
+string in the REMARK lines. The REMARK lines also have the mapping of atom indices
+between SMILES and PDBQT. SD files with docked coordinates are written
+from RDKit molecules.
+
+.. code-block:: bash
+
+ mk_export.py molecule.pdbqt -o molecule.sdf
+ mk_export.py vina_results.pdbqt -o vina_results.sdf
+ mk_export.py autodock-gpu_results.dlg -o autodock-gpu_results.sdf
+
+Why this matters
+----------------
+
+Making RDKit molecules from SMILES is safer than guessing bond orders
+from the coordinates, specially because the PDBQT lacks hydrogens bonded
+to carbon. As an example, consider the following conversion, in which
+OpenBabel adds an extra double bond, not because it has a bad algorithm,
+but because this is a nearly impossible task.
+
+.. code-block:: bash
+
+ obabel -:"C1C=CCO1" -o pdbqt --gen3d | obabel -i pdbqt -o smi
+ [C]1=[C][C]=[C]O1
+
+
+Caveats
+-------
+
+If docking does not use explicit Hs, which it often does not, the
+exported positions of hydrogens are calculated from RDKit. This can
+be annoying if a careful forcefield minimization is employed before
+docking, as probably rigorous Hs positions will be replaced by the
+RDKit geometry rules, which are empirical and much simpler than most
+force fields.
diff --git a/docs/source/cli_lig_prep.rst b/docs/source/cli_lig_prep.rst
new file mode 100644
index 00000000..91fe3224
--- /dev/null
+++ b/docs/source/cli_lig_prep.rst
@@ -0,0 +1,16 @@
+mk_prepare_ligand.py
+====================
+
+Command line tool to prepare small organic molecules.
+
+Write PDBQT files
+-----------------
+
+AutoDock-GPU and Vina read molecules in the PDBQT format. These can be prepared
+by Meeko from SD files, or from Mol2 files, but SDF is strongly preferred.
+
+.. code-block:: bash
+
+ mk_prepare_ligand.py -i molecule.sdf -o molecule.pdbqt
+ mk_prepare_ligand.py -i multi_mol.sdf --multimol_outdir folder_for_pdbqt_files
+
diff --git a/docs/source/cli_rec_prep.rst b/docs/source/cli_rec_prep.rst
new file mode 100644
index 00000000..2fadd173
--- /dev/null
+++ b/docs/source/cli_rec_prep.rst
@@ -0,0 +1,104 @@
+The input structure is matched against templates to
+guarantee chemical correctness and identify problems with the input structures.
+This allows the user to identify and fix problems, resulting in a molecular
+model that is correct with respect to heavy atoms, protonation state,
+connectivity, bond orders, and formal charges.
+
+The matching algorithm uses the connectivity and elements, but not bond orders
+or atom names. Hydrogens are optional. This makes it compatible with input
+files from various sources.
+
+Templates are matched on a per residue basis. Each residue is represented
+as an instance of a PolymerResidue object, which contains:
+ - an RDKit molecule that represents the actual state
+ - a padded RDKit molecule containing a few atoms from the adjacent residues
+ - parameters such as partial charges
+
+The positions are set by the input, and the connectivity and formal charges
+are defined by the templates. Heavy atoms must match exactly. If heavy atoms
+are missing or in excess, the templates will fail to match.
+
+Missing hydrogens are added by RDKit, but are not subjected to minimization
+with a force field. Thus, their bond lengths are not super accurate.
+
+Different states of the same residue are stored as different templates,
+for example different protonation states of HIS, N-term, LYN/LYS, etc.
+Residue name is primary key unless user overrides.
+
+Currently not supported: capped residues from charmm-gui.
+
+mk_prepare_receptor
+===================
+
+Basic usage
+-----------
+
+.. code-block:: bash
+
+ mk_prepare_receptor -i examples/system.pdb --write_pdbqt prepared.pdbqt
+
+
+
+
+Protonation states
+------------------
+
+
+Adding templates
+----------------
+
+Write flags
+-----------
+
+The option flags starting with ``--write`` in ``mk_prepare_receptor`` can
+be used both with an argument to specify the outpuf filename:
+
+.. code-block:: bash
+
+ --write_pdbqt myenzyme.pdbqt --write_json myenzyme.json
+
+and without the filename argument as long as a default basename is provided:
+
+.. code-block:: bash
+
+ --output_basename myenzyme --write_pdbqt --write_json
+
+It is also possible to combine the two types of usage:
+
+.. code-block:: bash
+
+ --output_basename myenzyme
+ --write_pdbqt
+ --write_json
+ --write_vina_box box_for_myenzyme.txt
+
+in which case the specified filenames have priority over the default basename.
+
+.. _templates:
+
+Templates
+---------
+
+The templates contain SMILES strings that are used to create the RDKit
+molecules that constitute every residue in the processed model. In this way,
+the chemistry of the processed model is fully defined by the templates,
+and the only thing that is preserved from the input are the atom positions
+and the connectivity between residues.
+
+The SMILES strings contain all atoms that exist in the final model,
+and none that do not exist. This also applies to hydrogens,
+meaning that the SMILES are expected to have real hydrogens. Note that
+real hydrogens are different from explicit hydrogens. Real hydrogens will be
+represented as an actual atom in an RDKit molecule, while explicit hydrogens
+are a just property of heavy atoms. In the SMILES, real hydrogens are defined
+with square brackets "[H]" and explicit hydrogens without, e.g. "[nH]" to set
+the number of explicit hydrogens on an aromatic nitrogen to one.
+
+Residues that are part of a polymer, which is often all of them, will have
+bonds to adjacent residues. The heavy atoms involved in the bonds will miss
+a real hydrogen and have an implicit (or explicit) one instead. As an
+example, consider modeling an alkyl chain as a polymer, in which the monomer
+is a single carbon atom. Our template SMILES would be "[H]C[H]". The RDKit
+molecule will have three atoms and the carbon will have two implicit hydrogens.
+The implicit hydrogens correspond to bonds to adjacent residues in the
+processed polymer.
diff --git a/docs/source/colab_examples.rst b/docs/source/colab_examples.rst
new file mode 100644
index 00000000..b92554da
--- /dev/null
+++ b/docs/source/colab_examples.rst
@@ -0,0 +1,74 @@
+.. _colab_examples:
+
+Colab Examples
+============
+
+`Google Colaboratory `_ (Colab) is a cloud-based platform that allows users to write and execute Python codes through a browser. Regardless of the user's operating system, Colab provides Linux computing backends and some free GPU access.
+
+The following Colab examples are created to provide **an install-free experience** & **some generalizable workflows** of AutoDock Vina via Google Colab notebooks, which work in a similar manner to `Jupyter Notebooks `_, in the pre-configured environment with `Meeko `_ for receptor and ligand preparation, and other modules - `RDKit `_, `Scrubber `_, `ProDy `_, `reduce2 `_ (formerly `reduce `_), and `py3Dmol `_ - for conformer generation, manipulation & pre-processing of protein structures and visualization.
+
+**Subscription is NOT required to run these Colab examples.** Additionally, the input files for the docking calculations are either directly pulled from open databases or generated from user inputs. With that, one can easily customize the notebooks and reuse the workflow for similar calculations on different biochemical systems.
+
+Overview
+------------------------
+
+**General Workflow of Docking Calculations in Examples**
+
+.. image:: images/docking_workflow.png
+ :alt: docking workflow
+ :width: 100%
+ :align: center
+
+*Major Python packages used*
+
+* **RDKit** `https://rdkit.org/ `_
+* **Scrubber** `https://github.com/forlilab/scrubber `_
+* **Meeko** `https://github.com/forlilab/Meeko `_
+* **ProDy** `http://www.bahargroup.org/prody/ `_
+* **cctbx-base** (for reduce2) `https://github.com/cctbx/cctbx_project `_
+* **py3Dmol** `https://3dmol.org/ `_
+
+*Data*
+
+* **Phenix-project/geostd** (for reduce2) `https://github.com/phenix-project/geostd/ `_
+
+[Vina] Basic Docking
+------------------------
+
+`Run on Colab! `_
+
+The **basic docking example** is a rewrite based on the original basic docking example in the `Vina documentation `_. In this example, a small molecule ligand (Imatinib, PDB token `STI `_) is docked back to a hollow protein structure of mouse c-Abl (PDB token `1IEP `_) to reproduce the complex structure. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
+
+
+[Vina] Flexible Docking
+------
+
+`Run on Colab! `_
+
+The **flexible docking example** is a rewrite based on the original flexible docking example in the `Vina documentation `_. In this example, a variant of Imatinib (PDB token `PRC `_) is docked back to a hollow protein structure of mouse c-Abl (PDB token `1FPU `_) to reproduce the complex structure. Additionally, Thr315 is set to be a flexible residue. A docked pose that closely resembles the original position of the ligand and **a flipped Thr315** are expected among the top-ranked poses.
+
+
+[Vina] Basic Docking with an RNA Receptor
+---------------
+
+`Run on Colab! `_
+
+The basic docking example is developed after the **implementation of chemical templates for common nucleotides** in Meeko, which enables the preparation of RNA/DNA receptors. In this example, a small molecule inhibitor (Ribocil B, PDB token 51B) is docked back to a hollow protein structure of a bacteria FMN riboswitch (PDB token 5C45) to reproduce the complex structure.
+
+
+[Vina] Basic Docking with Cofactors
+---------------
+
+`Run on Colab! `_
+
+The basic docking example is developed to showcase the usage of **import additional chemical templates** into Meeko. In this example, a small molecule antibiotic (Kanamycin A, PDB token KAN) is docked back to a hollow protein structure of a bacteria aminoglycoside kinase APH(2)-Ia (PDB token 5IQB), together with two metal cofactor Magnesium (Mg2+) ions and the substrate phosphoaminophosphonic acid-guanylate ester (GMPPNP, PDB token GNP) to reproduce the complex structure.
+
+
+[AutoDock-GPU] Reactive Docking
+---------------
+
+`Run on Colab! `_
+
+The reactive docking example is based on reactive docking method that has been developed for high-throughput virtual screenings of reactive species. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is targeting at the catalytic histidine residue of a hollow protein structure of bacteria RNA 3' cyclase (PDB token 3KGD) to generate the near-attack conformation for the formation of the phosphoamide bond. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
+
+
diff --git a/docs/source/conf.py b/docs/source/conf.py
new file mode 100644
index 00000000..af5c3d13
--- /dev/null
+++ b/docs/source/conf.py
@@ -0,0 +1,46 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+import os
+import sys
+sys.path.insert(0, os.path.abspath('../../meeko/'))
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+project = 'meeko'
+copyright = '2024, Forli Lab at Scripps Research'
+author = 'The Meeko authors'
+#release = '0.6.0'
+
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+extensions = [
+ 'sphinx.ext.autosectionlabel',
+ 'sphinx.ext.autodoc',
+ 'sphinx.ext.napoleon',
+ 'sphinx.ext.intersphinx',
+]
+
+html_logo = "images/raccoon.png"
+
+
+templates_path = ['_templates']
+exclude_patterns = []
+
+pygments_style = 'sphinx'
+
+
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+html_theme = 'sphinx_book_theme'
+
+html_theme_options = {
+ 'show_toc_level': 2,
+ 'repository_url': 'https://github.com/forlilab/meeko',
+ 'use_repository_button': True, # add a "link to repository" button
+ 'navigation_with_keys': False,
+}
diff --git a/docs/source/images/annotated_HIE_AMP.png b/docs/source/images/annotated_HIE_AMP.png
new file mode 100644
index 00000000..c7218380
Binary files /dev/null and b/docs/source/images/annotated_HIE_AMP.png differ
diff --git a/docs/source/images/docking_workflow.png b/docs/source/images/docking_workflow.png
new file mode 100644
index 00000000..003a3a59
Binary files /dev/null and b/docs/source/images/docking_workflow.png differ
diff --git a/docs/source/images/embedded_CRO.png b/docs/source/images/embedded_CRO.png
new file mode 100644
index 00000000..430eb76c
Binary files /dev/null and b/docs/source/images/embedded_CRO.png differ
diff --git a/docs/source/images/highlighted_AMP.png b/docs/source/images/highlighted_AMP.png
new file mode 100644
index 00000000..b9a36221
Binary files /dev/null and b/docs/source/images/highlighted_AMP.png differ
diff --git a/docs/source/images/raccoon.png b/docs/source/images/raccoon.png
new file mode 100644
index 00000000..902853c2
Binary files /dev/null and b/docs/source/images/raccoon.png differ
diff --git a/docs/source/images/starting_CRO.png b/docs/source/images/starting_CRO.png
new file mode 100644
index 00000000..993440a9
Binary files /dev/null and b/docs/source/images/starting_CRO.png differ
diff --git a/docs/source/index.rst b/docs/source/index.rst
new file mode 100644
index 00000000..2027b60f
--- /dev/null
+++ b/docs/source/index.rst
@@ -0,0 +1,78 @@
+Meeko: interface for AutoDock
+=============================
+
+Parameterization of molecules
+-----------------------------
+
+Meeko assigns parameters to small organic molecules, referred to as ligands,
+and to proteins and nucleic acids, referred to as receptors.
+Parameterization includes assigning atom types, partial charges, setting
+bonds as rotatable, and making receptor sidechains flexible.
+
+Prepare input and convert output
+--------------------------------
+
+Meeko writes the input PDBQT files (or strings in Python) for AutoDock-Vina
+and AutoDock-GPU, and exports docking results in SDF format for ligands and
+in PDB format for receptor.
+
+Python API
+----------
+
+Meeko is written in Python and exposes functions and classes that operate on
+RDKit molecules for the ligands, leveraging RDKit's popularity to facilitate
+integration with external software. Command line scripts are also available.
+
+AutoDock ecosystem
+------------------
+
+To run a docking, more packages are required besides Meeko:
+
+ * AutoDock-Vina
+ * AutoDock-GPU
+ * Ringtail
+
+
+
+.. toctree::
+ :maxdepth: 3
+ :hidden:
+ :caption: Getting started
+
+ installation
+ colab_examples
+ tutorials
+
+.. toctree::
+ :maxdepth: 2
+ :hidden:
+ :caption: Ligand preparation
+
+ Overview
+ cli_lig_prep
+ In Python
+
+.. toctree::
+ :maxdepth: 2
+ :hidden:
+ :caption: Receptor preparation
+
+ cli_rec_prep
+
+.. toctree::
+ :maxdepth: 2
+ :hidden:
+ :caption: Exporting results
+
+ cli_export_result
+ In Python
+
+.. toctree::
+ :maxdepth: 2
+ :hidden:
+ :caption: Building residue templates
+
+ In Python
+
+
+
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
new file mode 100644
index 00000000..28e44151
--- /dev/null
+++ b/docs/source/installation.rst
@@ -0,0 +1,62 @@
+Installation
+============
+
+We recommend using micromamba to manage Python environments and install Meeko.
+Other similar package managers also work, like mamba, conda, or miniconda.
+We prefer micromamba because it uses conda-forge as its default channel.
+If you use other package managers, please use the ``-c conda-forge`` option.
+
+To get micromamba, visit https://mamba.readthedocs.io/en/latest/
+
+
+From conda-forge
+----------------
+
+.. code-block:: bash
+
+ micromamba install meeko
+
+
+From PyPI
+------------------------
+
+.. code-block:: bash
+
+ pip install meeko
+
+If using micromamba or a similar package manager, ``pip`` installs the package
+in the active environment.
+
+
+From source
+-----------
+
+Here, we will checkout the ``develop`` branch, as it is likely more recent than the
+default ``release`` branch. Accessing features that are not in a release yet is one
+of the reasons to use the develop branch, which requires installing from source.
+
+.. code-block:: bash
+
+ git clone https://github.com/forlilab/Meeko.git
+ cd Meeko
+ git checkout develop
+ pip install .
+
+Alternatively, it is possible to install with ``pip install -e .``. Then, changes to
+the source files take immediate effect without requiring further ``pip install .``.
+This is useful for developers. Changes to the command line scripts may still require
+a re-installation.
+
+
+Support for Python 3.12
+-----------------------
+
+Meeko runs on Python 3.12 as long as Prody is not installed. To run on 3.12,
+install all dependencies except Prody and install Meeko from source.
+
+Meeko uses Prody to parse PDB and mmCIF files. Without prody, PDB files
+can be parsed with the command line option ``--read_pdb`` and with the Python
+method ``Polymer.from_pdb_string()``. However, without ProDy it
+won't be possible to read mmCIF files or use tethered docking. Prody developers
+are working to support Python 3.12, so it is possible that Prody will work
+on Python 3.12 soon.
diff --git a/docs/source/lig_overview.rst b/docs/source/lig_overview.rst
new file mode 100644
index 00000000..1fbd47ad
--- /dev/null
+++ b/docs/source/lig_overview.rst
@@ -0,0 +1,17 @@
+Overview of ligand preparation
+==============================
+
+Meeko takes as input an RDKit molecule that has 3D positions and
+all hydrogens as real atoms, and creates an object called MoleculeSetup
+that stores all the parameters, such as atom types, partial charges, or
+rotatable bonds. Adding hydrogens and 3D positions is not performed by Meeko.
+
+The MoleculePreparation class stores all the configuration required to
+parameterize a ligand, and containes the methods that take an RDKit molecule
+as input, and return a parameterized MoleculeSetup. The MoleculePreparation
+offers several option to control how molecules are parameterized, and many of
+which are exposed as command line options in ``mk_prepare_ligand.py``.
+
+PDBQT files, or strings in Python, are produced using a parameterized instance
+of MoleculeSetup. The methods for that live in the PDBQTWriterLegacy class.
+PDBQT strings can be passed directly to Vina using its Python API.
diff --git a/docs/source/py_build_temp.rst b/docs/source/py_build_temp.rst
new file mode 100644
index 00000000..079151ee
--- /dev/null
+++ b/docs/source/py_build_temp.rst
@@ -0,0 +1,281 @@
+.. _py_build_temp:
+
+Building residue templates in Python
+===================================
+
+The interpretation of the valence (bonds) and formal charge of atoms is an essential step when parsing a PDB/CIF file, and the accuracy of residue mapping is crucial to the creation of a macrobiomolecule system. In Meeko, the input residue names are used as keys and the chemical templates are retrieved accordingly based on :ref:`templates `.
+
+For the command line script for receptor preparatoin, ``mk_prepare_receptor.py``, there are three major ways of obtaining such templates:
+
+**(1) Loading from the default JSON file:** ``Meeko/meeko/data/residue_chem_templates.json``
+
+This is the default residue template set curated by us, including:
+
+(a) the standard residues of proteins, RNAs and DNAs,
+
+(b) the modified residues from the following Amber24 OFF library files:
+
+.. code-block:: bash
+
+ lib
+ ├── amino19.lib
+ ├── amino19ipq_0.9.lib
+ ├── aminoct19ipq_0.9.lib
+ ├── aminont19ipq_0.9.lib
+ ├── phosaa19SB.lib
+ ├── mod_amino19.lib
+ ├── RNA.lib
+ ├── terminalphos.LJbb-RNA.lib
+ ├── DNA.OL15.lib
+ ├── parmBSC1.lib
+ └── all_modrna08.lib
+
+(c) residues or ligands in CCD (Chemical Component Dictionary) that have conflicting names with the above residues.
+
+**(2) Loading by ``--add_template`` from an additional JSON file:** (example) ``Meeko/meeko/data/NAKB_templates.json``
+
+This is an optional add-on template set generated by us, based on the curated set of modified nucleotides by Nucleic Acid Knowledgebase (NAKB).
+
+**(3) Fetching from PDB by CCD name on the run**
+
+When an unknown residue is encountered, ``mk_prepare_receptor.py`` attempts to resolve its chemical identity by fetching a definition CIF file from PDB (Protein Data Bank) and generates chemical templates of all possible embedding forms of it when there are inter-residue bonds. Currently, this is an automated yet relatively simple process that only supports noncovalent ligands and residues with unmodified backbones.
+
+Here, we present a quick guide of building potentially complicated residue templates on your own using the ``meeko.chemtempgen`` submodule. In this example, we will be working with residue ``CRO``, a naturally occuring fluorophore in green fluorescent proteins formed by condensation of three consecutive residues Ser-Tyr-Gly.
+
+Example usage
+-------------
+
+Before we start, we will import the required modules and optionally, suppress excess rdkit loggings that may occur during the editing of molecular structures. Then we will create a ``ChemicalComponent`` from a definition CIF file, which will be obtained by ``fetch_from_pdb`` (Internet connection is required).
+
+.. code-block:: python
+
+ from meeko.chemtempgen import *
+ from rdkit import Chem
+ from rdkit.Chem import Draw
+ from rdkit import RDLogger
+ from PIL import Image
+ import io
+ import copy
+ import logging
+ import sys
+
+ rdkit_logger = RDLogger.logger()
+ rdkit_logger.setLevel(RDLogger.CRITICAL)
+
+ # Create a chemical component from the definition CIF file
+ basename = "CRO"
+ CRO_from_cif = ChemicalComponent.from_cif(fetch_from_pdb(basename), basename)
+
+The created ``ChemicalComponent`` object, ``CRO_from_cif``, has a corresponding RDKit molecule where the atom names from the definition CIF file are stored under the ``"atom_id"`` property per atom. For a quick check, you may draw the RDKit molecule with noted atom names:
+
+.. code-block:: python
+
+ def draw_cc_mol(cc_mol: Chem.Mol):
+ # Label atoms by atom name
+ for atom in cc_mol.GetAtoms():
+ atom.SetProp("atomNote", atom.GetProp("atom_id"))
+
+ # Draw the molecule
+ drawer = Draw.MolDraw2DCairo(600, 600)
+ drawer.DrawMolecule(cc_mol)
+ drawer.FinishDrawing()
+
+ # Get the image as PNG
+ png_data = drawer.GetDrawingText()
+ img = Image.open(io.BytesIO(png_data))
+ img.show()
+
+ draw_cc_mol(CRO_from_cif.rdkit_mol)
+
+.. image:: images/starting_CRO.png
+ :alt: starting CRO
+ :width: 60%
+ :align: center
+
+As we may see from the picture above, in order to forge ``CRO`` into a linking embedded fragment in a protein, some atoms need to be removed. In this example, we will simply do so by specifying the atom names. ``make_embedded`` calls function ``embed`` on the duplicated object ``cc``, which takes ``embed_allowed_smarts`` as the editable zone and removes atoms matching the names in ``leaving_names``. Here, the ``embed_allowed_smarts`` is chosen to be the SMARTS of altered backbone in residue ``CRO``. Note that by default, ``embed`` removes associated hydrogens for convenience. Therefore, in this case, ``leaving_names = {"H2", "OXT"}`` removes atoms ``H2``, ``OXT`` as well as the bonded hydrogen, ``HXT``. The same task could be alternatively done by the equivalent SMARTS pattern.
+
+.. code-block:: python
+
+ cc = copy.deepcopy(CRO_from_cif)
+
+ embed_allowed_smarts = "[NX2][CX4][CX3][NX3][CX4][CX3](=O)[OX2]"
+ cc = cc.make_embedded(allowed_smarts = embed_allowed_smarts, leaving_names = {"H2", "OXT"})
+
+ draw_cc_mol(cc.rdkit_mol)
+
+.. image:: images/embedded_CRO.png
+ :alt: embedded CRO
+ :width: 60%
+ :align: center
+
+Looking at the structure of the edited picture, we will see that the unneccessary atoms have gone and the hydrogens at the broken (blunt) ends become implict, which is exactly needed to generate the Smiles string for the chemical template. Function ``make_pretty_smiles`` makes the Smiles string with all Hs explicit for the template's RDKit molecule. Last but not least, we will determin the ``link_labels`` which specifies how ``CRO`` should be connected to other residues. Here, we will use the pattern from a built-in recipe, ``AA_recipe.pattern_to_label_mapping_standard``, which also applies to all other standard amino acid residues: ``{'[NX3h1]': 'N-term', '[CX3h1]': 'C-term'}``. Opionally, we can run a ``ResidueTemplate_check`` to see potential problems with the generated template.
+
+.. code-block:: python
+
+ cc = (
+ cc
+ .make_pretty_smiles()
+ .make_link_labels_from_patterns(pattern_to_label_mapping = AA_recipe.pattern_to_label_mapping_standard)
+ )
+ cc.ResidueTemplate_check()
+ export_chem_templates_to_json([cc])
+
+``export_chem_templates_to_json`` returns a JSON string of the residue template, with the corresponding content printed to console:
+
+.. code-block:: bash
+
+ ******************** New Template Built ********************
+ {
+ "ambiguous": {
+ "CRO": ["CRO"]
+ },
+ "residue_templates": {
+ "CRO": {
+ "smiles": "[H]NC([H])(C1=NC(=C([H])C2=C([H])C([H])=C(O[H])C([H])=C2[H])C(=O)N1C([H])([H])C=O)C([H])(O[H])C([H])([H])[H]",
+ "atom_name": ["H", "N1", "CA1", "HA1", "C1", "N2", "CA2", "CB2", "HB2", "CG2", "CD1", "HD1", "CE1", "HE1", "CZ", "OH", "HOH", "CE2", "HE2", "CD2", "HD2", "C2", "O2", "N3", "CA3", "HA31", "HA32", "C3", "O3", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13"],
+ "link_labels": {"1": "N-term", "27": "C-term"}
+ }
+ }
+ }
+ ************************************************************
+
+You may now wonder: What if the residue locates at the C- or N-terminal of the protein? Although this is not common for ``CRO``, we will go with it for demonstration purposes.
+
+To make the N-terminal embedding variant of ``CRO``:
+
+.. code-block:: python
+
+ # Duplicate and start over from the original chemical component
+ cc_N = copy.deepcopy(CRO_from_cif)
+
+ cc_N = (
+ cc_N
+ # Remove atom OXT
+ .make_embedded(allowed_smarts = embed_allowed_smarts, leaving_names = {"OXT"})
+ # Cap (protonate) atom N
+ .make_capped(allowed_smarts = embed_allowed_smarts, capping_names = {"N1"}, protonate = True)
+ # (Re)generate Smiles with all Hs explicit
+ .make_pretty_smiles()
+ # Find linker atoms
+ .make_link_labels_from_patterns(pattern_to_label_mapping = AA_recipe.pattern_to_label_mapping_standard)
+ )
+
+ cc_N.ResidueTemplate_check()
+ # In case there are already residue templates with the same parent (original) residue name
+ cc_N.resname += "_N"
+ export_chem_templates_to_json([cc_N])
+
+In the chained procedure above, we have removed ``OXT`` and protonated ``N1``, which is done by ``make_capped`` that adds hydrogen(s) to matching atom(s) with specified ``capping_names`` within the region of ``allowed_smarts``. The expected outout from ``export_chem_templates_to_json`` is:
+
+.. code-block:: bash
+
+ Atom # 0 (N1) in mol doesn't have implicit Hs -> continue with next atom...
+ Molecule doesn't contain wanted_smarts: [NX3h1] -> continue with next pattern...
+ Molecule doesn't contain pattern: [NX3h1] -> linker label for N-term will not be made.
+ ******************** New Template Built ********************
+ {
+ "ambiguous": {
+ "CRO": ["CRO_N"]
+ },
+ "residue_templates": {
+ "CRO": {
+ "smiles": "[H]OC1=C([H])C([H])=C(C([H])=C2N=C(C([H])(N([H])[H])C([H])(O[H])C([H])([H])[H])N(C([H])([H])C=O)C2=O)C([H])=C1[H]",
+ "atom_name": ["HOH", "OH", "CZ", "CE1", "HE1", "CD1", "HD1", "CG2", "CB2", "HB2", "CA2", "N2", "C1", "CA1", "HA1", "N1", "H", "H2", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13", "N3", "CA3", "HA31", "HA32", "C3", "O3", "C2", "O2", "CD2", "HD2", "CE2", "HE2"],
+ "link_labels": {"30": "C-term"}
+ }
+ }
+ }
+ ************************************************************
+
+To make the C-terminal embedding variant of ``CRO``:
+
+.. code-block:: python
+
+ # Duplicate and start over from the original chemical component
+ cc_C = copy.deepcopy(CRO_from_cif)
+
+ cc_C = (
+ cc_C
+ # Deprotonate the carboxylate group
+ .make_canonical(acidic_proton_loc = {'[H][O][C](=O)': 0})
+ # Remove atom H2
+ .make_embedded(allowed_smarts = embed_allowed_smarts, leaving_names = {"H2"})
+ # (Re)generate Smiles with all Hs explicit
+ .make_pretty_smiles()
+ # Find linker atoms
+ .make_link_labels_from_patterns(pattern_to_label_mapping = AA_recipe.pattern_to_label_mapping_standard)
+ )
+
+ cc_C.ResidueTemplate_check()
+ # In case there are already residue templates with the same parent (original) residue name
+ cc_C.resname += "_C"
+ export_chem_templates_to_json([cc_C])
+
+In the chained procedure above, we have deprotonated the carboxylate group(s) and removed ``H2``. The deprotonation is done by ``make_canonical`` that deprotonates all protons specified by ``acidic_proton_loc``, which includes a SMARTS pattern and the index of the proton. ``chemtempgen.py`` also includes a constant ``acidic_proton_loc_canonical``, which is potentially useful as a universal protocol to deprotonate the acidic protons to get the canonical protonation state at near physiological pH.
+
+.. code-block:: python
+
+ # Constants for deprotonate
+ acidic_proton_loc_canonical = {
+ # any carboxylic acid, sulfuric/sulfonic acid/ester, phosphoric/phosphinic acid/ester
+ '[H][O]['+atom+'](=O)': 0 for atom in ('CX3', 'SX4', 'SX3', 'PX4', 'PX3')
+ } | {
+ # any thio carboxylic/sulfuric acid
+ '[H][O]['+atom+'](=S)': 0 for atom in ('CX3', 'SX4')
+ } | {
+ '[H][SX2][a]': 0, # thiophenol
+ }
+
+The expected output is:
+
+.. code-block:: bash
+
+ Molecule doesn't contain wanted_smarts: [CX3h1] -> continue with next pattern...
+ Molecule doesn't contain pattern: [CX3h1] -> linker label for C-term will not be made.
+ ******************** New Template Built ********************
+ {
+ "ambiguous": {
+ "CRO": ["CRO_C"]
+ },
+ "residue_templates": {
+ "CRO_C": {
+ "smiles": "[H]NC([H])(C1=NC(=C([H])C2=C([H])C([H])=C(O[H])C([H])=C2[H])C(=O)N1C([H])([H])C(=O)[O-])C([H])(O[H])C([H])([H])[H]",
+ "atom_name": ["H", "N1", "CA1", "HA1", "C1", "N2", "CA2", "CB2", "HB2", "CG2", "CD1", "HD1", "CE1", "HE1", "CZ", "OH", "HOH", "CE2", "HE2", "CD2", "HD2", "C2", "O2", "N3", "CA3", "HA31", "HA32", "C3", "O3", "OXT", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13"],
+ "link_labels": {"1": "N-term"}
+ }
+ }
+ }
+ ************************************************************
+
+If you have generated ``cc``, ``cc_N``, and ``cc_C``, you may write them all into one JSON file:
+
+.. code-block:: python
+
+ export_chem_templates_to_json([cc, cc_N, cc_C], json_fname = "CRO_templates.json")
+
+And below is the content of ``CRO_templates.json``, which can be loaded by ``mk_prepare_receptor --add_templates CRO_templates.json`` during receptor preparation:
+
+.. code-block:: bash
+
+ {
+ "ambiguous": {
+ "CRO": ["CRO", "CRO_N", "CRO_C"]
+ },
+ "residue_templates": {
+ "CRO": {
+ "smiles": "[H]NC([H])(C1=NC(=C([H])C2=C([H])C([H])=C(O[H])C([H])=C2[H])C(=O)N1C([H])([H])C=O)C([H])(O[H])C([H])([H])[H]",
+ "atom_name": ["H", "N1", "CA1", "HA1", "C1", "N2", "CA2", "CB2", "HB2", "CG2", "CD1", "HD1", "CE1", "HE1", "CZ", "OH", "HOH", "CE2", "HE2", "CD2", "HD2", "C2", "O2", "N3", "CA3", "HA31", "HA32", "C3", "O3", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13"],
+ "link_labels": {"1": "N-term", "27": "C-term"}
+ },
+ "CRO_N": {
+ "smiles": "[H]OC1=C([H])C([H])=C(C([H])=C2N=C(C([H])(N([H])[H])C([H])(O[H])C([H])([H])[H])N(C([H])([H])C=O)C2=O)C([H])=C1[H]",
+ "atom_name": ["HOH", "OH", "CZ", "CE1", "HE1", "CD1", "HD1", "CG2", "CB2", "HB2", "CA2", "N2", "C1", "CA1", "HA1", "N1", "H", "H2", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13", "N3", "CA3", "HA31", "HA32", "C3", "O3", "C2", "O2", "CD2", "HD2", "CE2", "HE2"],
+ "link_labels": {"30": "C-term"}
+ },
+ "CRO_C": {
+ "smiles": "[H]NC([H])(C1=NC(=C([H])C2=C([H])C([H])=C(O[H])C([H])=C2[H])C(=O)N1C([H])([H])C(=O)[O-])C([H])(O[H])C([H])([H])[H]",
+ "atom_name": ["H", "N1", "CA1", "HA1", "C1", "N2", "CA2", "CB2", "HB2", "CG2", "CD1", "HD1", "CE1", "HE1", "CZ", "OH", "HOH", "CE2", "HE2", "CD2", "HD2", "C2", "O2", "N3", "CA3", "HA31", "HA32", "C3", "O3", "OXT", "CB1", "HB1", "OG1", "HOG1", "CG1", "HG11", "HG12", "HG13"],
+ "link_labels": {"1": "N-term"}
+ }
+ }
+ }
\ No newline at end of file
diff --git a/docs/source/py_export_result.rst b/docs/source/py_export_result.rst
new file mode 100644
index 00000000..f6e55da4
--- /dev/null
+++ b/docs/source/py_export_result.rst
@@ -0,0 +1,37 @@
+Exporting docking results in Python
+===================================
+
+.. code-block:: python
+
+ from meeko import PDBQTMolecule
+ from meeko import RDKitMolCreate
+
+ fn = "autodock-gpu_results.dlg"
+ pdbqt_mol = PDBQTMolecule.from_file(fn, is_dlg=True, skip_typing=True)
+ rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
+
+The length of ``rdkitmol_list`` is one if there are no sidechains and only one
+ligand was docked.
+If multiple ligands and/or sidechains are docked simultaneously, each will be
+an individual RDKit molecule in ``rdkitmol_list``.
+Sidechains are truncated at the C-alpha.
+Note that docking multiple
+ligands simultaneously is only available in Vina, and it differs from docking
+multiple ligands one after the other. Each failed creation of an RDKit molecule
+for a ligand or sidechain results in a ``None`` in ``rdkitmol_list``.
+
+For Vina's output PDBQT files, omit ``is_dlg=True``.
+
+.. code-block:: python
+
+ pdbqt_mol = PDBQTMolecule.from_file("vina_results.pdbqt", skip_typing=True)
+ rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
+
+When using Vina from Python, the output string can be passed directly.
+See `the docs `_ for context on the ``Vina`` object.
+
+.. code-block:: python
+
+ vina_output_string = v.poses()
+ pdbqt_mol = PDBQTMolecule(vina_output_string, is_dlg=True, skip_typing=True)
+ rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)
diff --git a/docs/source/py_lig_prep.rst b/docs/source/py_lig_prep.rst
new file mode 100644
index 00000000..16c3c367
--- /dev/null
+++ b/docs/source/py_lig_prep.rst
@@ -0,0 +1,32 @@
+Ligand preparation in Python
+============================
+
+
+Creating a PDBQT string from an RDKit molecule
+----------------------------------------------
+.. code-block:: python
+
+ from meeko import MoleculePreparation
+ from meeko import PDBQTWriterLegacy
+ from rdkit import Chem
+
+ input_molecule_file = "example/BACE_macrocycle/BACE_4.sdf"
+
+ # there is one molecule in this SD file, this loop iterates just once
+ for mol in Chem.SDMolSupplier(input_molecule_file, removeHs=False):
+ preparator = MoleculePreparation()
+ mol_setups = preparator.prepare(mol)
+ for setup in mol_setups:
+ setup.show() # optional
+ pdbqt_string = PDBQTWriterLegacy.write_string(setup)
+
+At this point, ``pdbqt_string`` can be written to a file for
+docking with AutoDock-GPU or Vina, or passed directly to Vina within Python
+using ``set_ligand_from_string (pdbqt_string)``. For context, see `the docs on using Vina from Python `_.
+
+One advantage of this approach is that input PDBQT files are not written to the filesystem.
+The PDBQT format is lossy, because it lacks bond orders and non-polar hydrogens,
+making it a poor choice to store molecular information.
+
+Another advantage is to write custom workflows entirely from Python without external
+calls to ``mk_prepare_ligand.py``.
diff --git a/docs/source/tutorial1.rst b/docs/source/tutorial1.rst
new file mode 100644
index 00000000..50acc810
--- /dev/null
+++ b/docs/source/tutorial1.rst
@@ -0,0 +1,347 @@
+.. _tutorial1:
+
+Basic Docking
+-------------------------------------
+
+This tutorial provides practice examples and a step-by-step guide for the two basic procedures, **Ligand Preparation** and **Receptor Preparation**, with Meeko for molecular docking and virtual screening with `AutoDock Vina `_ and `AutoDock-GPU `_. It is based on, but not a full version of the tutorial materials in `Forlilab tutorials `_.
+
+.. contents::
+ :local:
+ :depth: 2
+
+Prerequisites and Environment Setup
+===================================
+
+Create a new virtual environment (recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: bash
+
+ micromamba create -c conda-forge -n meeko_tutorial_py39 python=3.9 -y
+ micromamba activate meeko_tutorial_py39
+
+In this tutorial, we will use ``micromamba`` as the example package manager. Visit `this official guide `_ for a quick install and setup of micromamba. There are many equivalent ways to manage Python packages, such as ``conda`` and ``mamba``. You can easily adapt the commands to your preferred tool, as the syntax is largely compatible across these package managers.
+
+Install the required Python packages through ``conda-forge``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: bash
+
+ micromamba install -c conda-forge numpy scipy rdkit gemmi vina -y
+
+Install the additional packages and data from GitHub repositories
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- (Python package) Meeko
+
+.. code-block:: bash
+
+ git clone --single-branch --branch develop https://github.com/forlilab/Meeko.git
+ cd Meeko; pip install --use-pep517 -e .; cd ..
+
+- (Python package) scrubber
+
+.. code-block:: bash
+
+ git clone --single-branch --branch develop https://github.com/forlilab/scrubber.git
+ cd scrubber; pip install --use-pep517 -e .; cd ..
+
+- (Example files for this tutorial) Forlilab Tutorials
+
+.. code-block:: bash
+
+ git clone https://github.com/forlilab/tutorials.git
+
+Ligand Preparation
+==================
+
+Ligand Preparation is the process that generates ligand input files for docking calculation and virtual screening. At present, AutoDock Vina and AutoDock-GPU need the ligand input files in the PDBQT format. In this example, we will use ``mk_prepare_ligand.py``, a command-line script in Meeko, to prepare such ligand PDBQT files.
+
+Prepare a Single Ligand from a Smiles String
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+`Imatinib `_ is a small-molecule drug. You can find the SMILES string for Imatinib from various reliable chemical databases and resources, including but not limited to `PubChem `_ and `DrugBank `_.
+
+``scrub.py`` is a command-line script in Scrubber that generates 3D conformers of protomers and tautomers for given small molecules at a specified (range of) pH. Given a pH range of 5 to 9, the output protomers will include those which make up no less than 1% of the total population at pH = 7. Based on the reference pKa values, the amine nitrogens and the pyridine nitrogen will be considered for acid/base enumeration. With the ``meeko_tutorial_py39`` micromamba environment active, run ``scrub.py`` to generate 3D conformers of Imatinib from the SMILES string.
+
+.. code-block:: bash
+
+ smiles_string="CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5"
+ scrub.py $smiles_string -o imatinib.sdf --skip_tautomers --ph_low 5 --ph_high 9
+
+The output file ``imatinib.sdf`` will contain two protomers of Imatinib, one with a neutral pyridine group and the other with a (+1) pyridinium group. All of the aliphatic amininium nitrogens will be protonated.
+
+.. code-block:: bash
+
+ Scrub completed.
+ Summary of what happened:
+ Input molecules supplied: 1
+ mols processed: 1, skipped by rdkit: 0, failed: 0
+ nr isomers (tautomers and acid/base conjugates): 2 (avg. 2.000 per mol)
+ nr conformers: 2 (avg. 1.000 per isomer, 2.000 per mol)
+
+In case there are multiple molecules in the SDF file, ``mk_prepare_ligand.py`` needs to know the prefix of filenames (by ``--multimol_prefix``) or alternatively where to output (by ``--multimol_outdir``) the multiple PDBQT files. Here, we will give the PDBQT files a prefix ``imatinib_protomer`` in the names. The output PDBQT files will be ``imatinib_protomer-1.pdbqt`` and ``imatinib_protomer-2.pdbqt``.
+
+.. code-block:: bash
+
+ mk_prepare_ligand.py -i imatinib.sdf --multimol_prefix imatinib_protomer
+
+
+Prepare Ligands in Batch from a ``.smi`` File
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In preparation for virtual screening, it is possible to prepare ligands in batch from a ``.smi`` File. There is one such example file at ``tutorials/imatinib/step-4/mols.smi`` from `Forlilab tutorials `_. Follow the example commands to process ``mols.smi``:
+
+.. code-block:: bash
+
+ smi_file="tutorials/imatinib/step-4/mols.smi"
+ scrub.py $smi_file -o mols.sdf
+
+At the end of the execution, the expected standard output will tell you the total number of isomers written to the multi-molecule SDF file ``mols.sdf``. This will help you estimate the expected file size and system requirements beforehand.
+
+.. code-block:: bash
+
+ Scrub completed.
+ Summary of what happened:
+ Input molecules supplied: 491
+ mols processed: 491, skipped by rdkit: 0, failed: 0
+ nr isomers (tautomers and acid/base conjugates): 741 (avg. 1.509 per mol)
+ nr conformers: 741 (avg. 1.000 per isomer, 1.509 per mol)
+
+For ``mols.sdf``, we will run ``mk_prepare_ligand.py`` with ``--multimol_prefix mols_pdbqt``, a directory to be created to hold the ligand PDBQT files. If you expect a large number of isomers (potentially millions), consider writing to a temporary directory or scratch space to manage storage efficiently.
+
+.. code-block:: bash
+
+ mk_prepare_ligand.py -i mols.sdf --multimol_outdir mols_pdbqt
+
+Receptor Preparation
+====================
+
+Receptor Preparation is the process that generates receptor input files for docking calculation and virtual screening. It typically begins with a PDB file of a biomacromolecule system, with or without coordinates of explicit hydrogens. At present, AutoDock Vina and AutoDock-GPU may require different types of files as receptor inputs. ``mk_prepare_receptor.py`` is the command-line script in Meeko that is designed to handle the different situations.
+
+For AutoDock-Vina
+~~~~~~~~~~~~~~~~~
+
+Docking with AutoDock-Vina requires the following receptor input files:
+
+- Receptor PDBQT file
+- (Optional) a TXT file that contains the box specifications, which can be re-used as the config file for Vina
+
+Starting from a provided PDB file at ``tutorials/imatinib/step-3/1iep_protein.pdb`` from `Forlilab tutorials `_, the generation of a Receptor PDBQT file is very straightforward:
+
+.. code-block:: bash
+
+ pdb_file="tutorials/imatinib/step-3/1iep_protein.pdb"
+ mk_prepare_receptor.py --read_pdb $pdb_file -o rec_1iep -p
+
+Here, we use ``-o`` to set the basename of the output files to ``rec_1iep`` with request ``-p``. The execution will generate only the receptor PDBQT file, ``rec_1iep.pdbqt``.
+
+Note that ``--read_pdb``, which uses the PDB parser in RDKit, is not the only way for ``mk_prepare_receptor.py`` to parse a receptor PDB file. The alternate is ``-i`` (short for ``--read_with_prody``) and it requires ProDy as an additional dependency. If you wish to use the ProDy parser, run ``pip install prody`` to install ProDy.
+
+To generate the TXT file that has the box dimension, we must find a way to define the wanted docking box. In this example, we will use a provided PDB file of ligand Imatinib at ``tutorials/imatinib/step-3/xray-imatinib.pdb`` that has been aligned to the expected binding site of the provided receptor PDB file.
+
+.. code-block:: bash
+
+ pdb_file="tutorials/imatinib/step-3/1iep_protein.pdb"
+ lig_file="tutorials/imatinib/step-3/xray-imatinib.pdb"
+ mk_prepare_receptor.py --read_pdb $pdb_file -o rec_1iep -p -v \
+ --box_enveloping $lig_file --padding 5
+
+Here, we add the ``-v`` to request the Vina-style box files to be generated along with the receptor PDBQT files. To define the box, we are using the combination of ``--box_enveloping`` and ``--padding``, which is to sete the center of the box by the given object, and the size of the box by a constant padding in each dimension around the given object. Note that this is not the only way to define the box. Read the help message printed from ``mk_prepare_receptor.py -h`` to learn about other combinations.
+
+At the end of the execution with ``-p -v``, the expected standard output will be:
+
+.. code-block:: bash
+
+ Files written:
+ rec_1iep.pdbqt <-- static (i.e., rigid) receptor input file
+ rec_1iep.box.txt <-- Vina-style box dimension file
+ rec_1iep.box.pdb <-- PDB file to visualize the grid box
+
+.. _receptor_preparation_for_vina_with_adf4sf:
+
+For AutoDock-Vina (and with AutoDock4 Scoring Function)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use the AutoDock4 Scoring Function in AutoDock-Vina, an additional step needs to be taken to compute the grid maps prior to the docking calculation. At present, this is only possible with AutoGrid, and therefore needs a Grid Parameter File (GPF) for it. Using ``mk_prepare_receptor.py`` option ``-g``, such GPF file can be generated in the same step as the receptor PDBQT file as well as the box dimension files. Here's an example:
+
+.. code-block:: bash
+
+ pdb_file="tutorials/imatinib/step-3/1iep_protein.pdb"
+ lig_file="tutorials/imatinib/step-3/xray-imatinib.pdb"
+ mk_prepare_receptor.py --read_pdb $pdb_file -o rec_1iep -p -v -g \
+ --box_enveloping $lig_file --padding 5
+
+At the end of the execution with ``-p -v -g``, the expected standard output is now:
+
+.. code-block:: bash
+
+ Files written:
+ rec_1iep.pdbqt <-- static (i.e., rigid) receptor input file
+ boron-silicon-atom_par.dat <-- atomic parameters for B and Si (for autogrid)
+ rec_1iep.gpf <-- autogrid input file
+ rec_1iep.box.txt <-- Vina-style box dimension file
+ rec_1iep.box.pdb <-- PDB file to visualize the grid box
+
+To compute the grid maps, the GPF file (``rec_1iep.gpf``) will be the input command file for AutoGrid. The receptor PDBQT file (``rec_1iep.pdbqt``) and the additional parameter file (``boron-silicon-atom_par.dat``) need to be in the same directory from which AutoGrid is run.
+
+For AutoDock-GPU
+~~~~~~~~~~~~~~~~
+
+At present, AutoDock-GPU also needs the pre-computed grid maps from AutoGrid. Therefore, Receptor Preparation for docking calculations with AutoDock-GPU is similar to preparation in the previous section :ref:`receptor_preparation_for_vina_with_adf4sf`. But in this case, we can drop the ``-v`` option as the Vina-style box definition TXT file is no longer needed for AutoGrid-GPU.
+
+Below is the sample command:
+
+.. code-block:: bash
+
+ pdb_file="tutorials/imatinib/step-3/1iep_protein.pdb"
+ lig_file="tutorials/imatinib/step-3/xray-imatinib.pdb"
+ mk_prepare_receptor.py --read_pdb $pdb_file -o rec_1iep -p -g \
+ --box_enveloping $lig_file --padding 5
+
+And the expected standard output will be:
+
+.. code-block:: bash
+
+ Files written:
+ rec_1iep.pdbqt <-- static (i.e., rigid) receptor input file
+ boron-silicon-atom_par.dat <-- atomic parameters for B and Si (for autogrid)
+ rec_1iep.gpf <-- autogrid input file
+ rec_1iep.box.pdb <-- PDB file to visualize the grid box
+
+Save a Receptor JSON File for Docking with Flexible and/or Reactive Residues
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Docking with flexible and/or reactive residues may require more files than basic docking, and ``mk_prepare_receptor.py`` is able to prepare those simultaneously when creating the receptor PDBQT file. The detailed procedure for Reactive Docking can be found in :ref:`tutorial2`. Here, we will use a different PDB file at ``tutorials/imatinib/step-3/2hzn_protein.pdb`` to showcase a simple docking preparation with flexible sidechains:
+
+.. code-block:: bash
+
+ pdb_file="tutorials/imatinib/step-3/2hzn_protein.pdb"
+ lig_file="tutorials/imatinib/step-3/xray-imatinib.pdb"
+ mk_prepare_receptor.py --read_pdb $pdb_file -o rec_2hzn -p -v -g -j \
+ --box_enveloping $lig_file --padding 5 \
+ -f A:286,359 --allow_bad_res
+
+Note that several additional arguments are introduced for this particular receptor structure and for flexible docking. First and for most, ``-f A:286,359`` specifies that we are making two residues flexible, which are Glu286 and Phe359 in chain A of the receptor PDB file ``2hzn_protein.pdb``. Moreover, we add the ``--allow_bad_res`` so that partially resolved residues in the input PDB file can be ignored. Finally, we make the request ``-j`` to not only write the typical input files for docking calculations, but also a receptor JSON file. This receptor JSON file may be used in future steps in order to export the full receptor structure with updated sidechain conformations from the docking output.
+
+With that, the standard output and the list of generated files from ``mk_prepare_receptor.py`` will be:
+
+.. code-block:: bash
+
+ - Template matching failed for: ['A:238', 'A:262', 'A:263', 'A:264', 'A:281', 'A:356', 'A:462', 'A:466', 'A:502'] Ignored due to allow_bad_res.
+
+ Flexible residues:
+ chain resnum is_reactive reactive_atom
+ A 359 False
+ A 286 False
+ reactive_flexres=set()
+
+ Files written:
+ rec_2hzn.json <-- parameterized receptor
+ rec_2hzn_flex.pdbqt <-- flexible receptor input file
+ rec_2hzn_rigid.pdbqt <-- static (i.e., rigid) receptor input file
+ boron-silicon-atom_par.dat <-- atomic parameters for B and Si (for autogrid)
+ rec_2hzn_rigid.gpf <-- autogrid input file
+ rec_2hzn.box.txt <-- Vina-style box dimension file
+ rec_2hzn.box.pdb <-- PDB file to visualize the grid box
+
+Export Poses from Docking
+=========================
+
+From AutoDock-Vina
+~~~~~~~~~~~~~~~~~~
+
+With AutoDock-Vina, The required files (generated from the previous steps) and the command to run a basic docking calculation of a single ligand is as follows:
+
+.. code-block:: bash
+
+ lig_pdbqt="imatinib_protomer-1.pdbqt"
+ rec_pdbqt="rec_1iep.pdbqt"
+ config_txt="rec_1iep.box.txt"
+ ./vina --ligand $lig_pdbqt --receptor $rec_pdbqt --config $config_txt
+
+Without giving Vina a custom output name, the default output PDBQT file will be named ``imatinib_protomer-1_out.pdbqt``. Using the Smiles and mapping information stored in the REMARKS section of the PDBQT file, ``mk_export.py`` is able to reconstruct the all-atom structures of the docked ligand and export the poses to a SDF file, ``imatinib_protomer-1_vina_out.sdf``, which includes the reconstructed coordinates of all hydrogen atoms:
+
+.. code-block:: bash
+
+ docked_pdbqt="imatinib_protomer-1_out.pdbqt"
+ mk_export.py $docked_pdbqt -s imatinib_protomer-1_vina_out.sdf
+
+From AutoDock-GPU
+~~~~~~~~~~~~~~~~~
+
+With AutoDock-GPU, the required files (generated from the previous steps) and the command to run a basic docking calculation of a single ligand is as follows:
+
+.. code-block:: bash
+
+ lig_name="imatinib_protomer-1"
+ lig_pdbqt="${lig_name}.pdbqt"
+ rec_prefix="rec_1iep"
+ rec_map_fld="${rec_prefix}.maps.fld"
+ ./adgpu --lfile $lig_pdbqt --ffile $rec_map_fld --resnam $lig_name
+
+With that, the output DLG file will be named ``imatinib_protomer-1.dlg``. Similarly, ``mk_export.py`` is able to reconstruct the atomistic structures of the docked ligand and export the poses to a SDF file as follows:
+
+.. code-block:: bash
+
+ docked_dlg="imatinib_protomer-1.dlg"
+ mk_export.py $docked_dlg -s imatinib_protomer-1_adgpu_out.sdf
+
+Note that by default, only the cluster leads will be exported to the SDF file. To export all generated poses in the DLG file, add the ``--all_dlg_poses`` option when exporting the poses.
+
+From Flexible Receptor Docking (with AutoDock-Vina)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+With AutoDock-Vina, the required files (generated from the previous steps) and the command to run a flexible docking calculation of a single ligand is as follows:
+
+.. code-block:: bash
+
+ lig_name="imatinib_protomer-1"
+ lig_pdbqt="${lig_name}.pdbqt"
+ rec_prefix="rec_2hzn"
+ flexres_pdbqt="${rec_prefix}_flex.pdbqt"
+ rec_pdbqt="${rec_prefix}_rigid.pdbqt"
+ config_txt="${rec_prefix}.box.txt"
+ ./vina --ligand $lig_pdbqt --flex $flexres_pdbqt --receptor $rec_pdbqt --config $config_txt --out ${lig_name}_flexres.pdbqt
+
+With that, the output PDBQT file will be named ``imatinib_protomer-1_flexres.pdbqt``. If given the receptor JSON file (``rec_2hzn.json``) generated when the other receptor files were created, ``mk_export.py`` is able to reconstruct the atomistic structures of the full receptor and export the updated models to a multi-model PDB file (``imatinib_protomer-1_flexres_vina_out.pdb``) with the following command:
+
+.. code-block:: bash
+
+ rec_json="rec_2hzn.json"
+ docked_pdbqt="imatinib_protomer-1_flexres.pdbqt"
+ mk_export.py $docked_pdbqt -j $rec_json -p imatinib_protomer-1_flexres_vina_out.pdb
+
+From Flexible Receptor Docking (with AD-GPU)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+With AutoDock-GPU, the required files (generated from the previous steps) and the command to run a flexible docking calculation of a single ligand is as follows:
+
+.. code-block:: bash
+
+ lig_name="imatinib_protomer-1"
+ lig_pdbqt="${lig_name}.pdbqt"
+ rec_prefix="rec_2hzn"
+ flexres_pdbqt="${rec_prefix}_flex.pdbqt"
+ rec_map_fld="${rec_prefix}_rigid.maps.fld"
+ ./adgpu --lfile $lig_pdbqt --flexres $flexres_pdbqt --ffile $rec_map_fld --resnam ${lig_name}_flexres
+
+With that, the output DLG file will be named ``imatinib_protomer-1_flexres.dlg``. Again, if given the receptor JSON file (``rec_2hzn.json``) generated when the other receptor files were created, ``mk_export.py`` is able to export the updated models to a PDB file (``imatinib_protomer-1_flexres_adgpu_out.pdb``):
+
+.. code-block:: bash
+
+ rec_json="rec_2hzn.json"
+ docked_dlg="imatinib_protomer-1_flexres.dlg"
+ mk_export.py $docked_dlg -j $rec_json -p imatinib_protomer-1_flexres_adgpu_out.pdb
+
+At present, all docking poses will be exported, whether they are cluster leads or not.
+
+Processing the Screening (Batch Docking) Results
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To process results from Screening (Batch Docking), please use the `Ringtail `_ package for SQL-based data management, streamlined analysis and filtering. The documentation of Ringtail can be found `here `_.
+
+What's Next?
+^^^^^^^^^^^^
+
+Now that you've completed this tutorial, you're ready to move on to :ref:`tutorial2` and :ref:`tutorial3` where we dive deeper into more advanced docking methods: reactive docking and tethered docking.
diff --git a/docs/source/tutorial2.rst b/docs/source/tutorial2.rst
new file mode 100644
index 00000000..b65e6e7a
--- /dev/null
+++ b/docs/source/tutorial2.rst
@@ -0,0 +1,310 @@
+.. _tutorial2:
+
+=========================
+Reactive Docking
+=========================
+
+This is a reactive docking example that uses the AutoDock-GPU executable to generate the near-attack conformation of a small molecule and a protein receptor.
+
+Follow the instructions to set up the environment and run this command-line example on your own device (Linux, MacOS or WSL). To run this example in a Colab notebook, see :ref:`colab_examples`.
+
+.. contents::
+ :local:
+ :depth: 2
+
+Introduction
+============
+
+The reactive docking example is based on reactive docking method that has been developed for high-throughput virtual screenings of reactive species. This method is currently only implemented in AutoDock-GPU. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is targeting at the catalytic histidine residue of a hollow protein structure of bacteria RNA 3' cyclase (PDB token 3KGD) to generate the near-attack conformation for the formation of the phosphoamide bond. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
+
+This tutorial is intended to showcase the Meeko usage in the preparation of receptor and ligand for reactive docking.
+
+.. _env_setup_like_colab:
+
+Prerequisites and Environment Setup
+===================================
+
+Create a new virtual environment (recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: bash
+
+ micromamba create -c conda-forge -n meeko_tutorial python=3.10 -y
+ micromamba activate meeko_tutorial
+
+In this tutorial, we will use ``micromamba`` as the example package manager. Visit `this official guide `_ for a quick install and setup of micromamba. There are many equivalent ways to manage Python packages, such as ``conda`` and ``mamba``. You can easily adapt the commands to your preferred tool, as the syntax is largely compatible across these package managers.
+
+Install the required Python packages through ``conda-forge``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: bash
+
+ micromamba install -c conda-forge cctbx-base numpy scipy rdkit gemmi -y
+
+Install the additional packages and data from GitHub repositories
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- (Python package) Meeko
+
+.. code-block:: bash
+
+ git clone --single-branch --branch develop https://github.com/forlilab/Meeko.git
+ cd Meeko; pip install --use-pep517 -e .; cd ..
+
+- (Python package) scrubber
+
+.. code-block:: bash
+
+ git clone --single-branch --branch develop https://github.com/forlilab/scrubber.git
+ cd scrubber; pip install --use-pep517 -e .; cd ..
+
+- (Python package) ProDy
+
+.. code-block:: bash
+
+ pip install prody
+
+- (Required data package for reduce2) Phenix Project geostd (restraint) Library
+
+.. code-block:: bash
+
+ git clone https://github.com/phenix-project/geostd.git
+
+
+Ligand Preparation
+=================
+
+In this step, the ligand molecule is prepared from a Smiles string. A protonated 3D conformer of ligand is generated by ``scrub.py``, and the conversion to a tangible ligand PDBQT file is done by ``mk_prepare_ligand.py``.
+
+.. image:: images/highlighted_AMP.png
+ :alt: highlighted AMP
+ :width: 60%
+ :align: center
+
+The ligand of this example is AMP (adenosine monophosphate). We will use its isomeric Smiles string as the input, and manually write the phosphate group in the -2 charge state. ``scrub.py`` will generate an SDF file, ``AMP.sdf``, containing a 3D conformer of AMP (2-) with all explicit hydrogens.
+
+.. code-block:: bash
+
+ ligand_smiles="c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)COP(=O)([O-])[O-])O)O)N"
+ scrub.py $ligand_smiles -o AMP.sdf --skip_tautomer --skip_acidbase
+
+To prepare AMP (2-) as an reactive ligand, we specify the reactive phosphoryl atom by Smarts string ``COP(=O)([O-])[O-]`` and the index number ``3``. With ``AMP.sdf`` as the input file, the command-line script ``mk_prepare_ligand.py`` looks for matches of the Smarts string ``reactive_smarts`` in the input molecule structure, making the ith atom in the match a reactive atom based on the 1-based index number ``reactive_smarts_idx``.
+
+.. code-block:: bash
+
+ reactive_smarts="COP(=O)([O-])[O-]"
+ reactive_smarts_idx=3
+ mk_prepare_ligand.py -i AMP.sdf -o AMP.pdbqt \
+ --reactive_smarts $reactive_smarts \
+ --reactive_smarts_idx $reactive_smarts_idx
+
+The generated ligand PDBQT file, ``AMP.pdbqt``, will contain special AutoDock atom types for the reactive docking. The reactive atom types encode the atom type as well as the adjacency to the reactive atom. In this example: ``P1`` denotes the reactive phosphorus atom (with order number = 1). ``O5`` denotes the neighbor ``OA`` atoms (with order number = 2). Because the original atom type (``OA``) contains 2 letters, an additional increment of +3 is applied to the number suffix. And finally ``C3`` denotes the further ``C`` type atom (aliphatic carbon, with order number = 3).
+
+.. code-block:: bash
+
+ REMARK SMILES Nc1ncnc2c1ncn2[C@@H]1O[C@H](COP(=O)([O-])[O-])[C@@H](O)[C@H]1O
+ REMARK SMILES IDX 11 1 22 2 20 3 13 4 12 5 10 6 4 7 3 8 5 9 2 10 6 11 7 12
+ REMARK SMILES IDX 8 13 9 14 1 15 23 18 21 20 14 22 15 23 16 24 17 25 18 26
+ REMARK SMILES IDX 19 27
+ REMARK H PARENT 1 16 1 17 23 19 21 21
+ ROOT
+ ATOM 1 C UNL 1 0.091 -0.756 0.545 1.00 0.00 0.253 C
+ ATOM 2 C UNL 1 0.369 -1.495 -0.773 1.00 0.00 0.195 C
+ ATOM 3 C UNL 1 1.476 -0.670 -1.387 1.00 0.00 0.179 C
+ ATOM 4 C UNL 1 2.147 -0.047 -0.164 1.00 0.00 0.178 C
+ ATOM 5 O UNL 1 1.163 0.124 0.832 1.00 0.00 -0.347 OA
+ ENDROOT
+ BRANCH 1 6
+ ATOM 6 N UNL 1 -1.156 0.007 0.449 1.00 0.00 -0.285 N
+ ATOM 7 C UNL 1 -4.095 -1.786 1.544 1.00 0.00 0.226 A
+ ATOM 8 N UNL 1 -5.018 -0.824 1.216 1.00 0.00 -0.217 NA
+ ATOM 9 N UNL 1 -2.764 -1.577 1.316 1.00 0.00 -0.216 NA
+ ATOM 10 C UNL 1 -4.639 0.363 0.654 1.00 0.00 0.155 A
+ ATOM 11 C UNL 1 -2.392 -0.395 0.766 1.00 0.00 0.167 A
+ ATOM 12 C UNL 1 -3.282 0.566 0.430 1.00 0.00 0.150 A
+ ATOM 13 N UNL 1 -2.654 1.617 -0.131 1.00 0.00 -0.231 NA
+ ATOM 14 C UNL 1 -1.348 1.232 -0.108 1.00 0.00 0.204 A
+ BRANCH 10 15
+ ATOM 15 N UNL 1 -5.614 1.348 0.310 1.00 0.00 -0.382 N
+ ATOM 16 H UNL 1 -5.332 2.257 -0.120 1.00 0.00 0.158 HD
+ ATOM 17 H UNL 1 -6.627 1.168 0.488 1.00 0.00 0.158 HD
+ ENDBRANCH 10 15
+ ENDBRANCH 1 6
+ BRANCH 2 18
+ ATOM 18 O UNL 1 0.753 -2.832 -0.545 1.00 0.00 -0.386 OA
+ ATOM 19 H UNL 1 1.495 -2.835 0.115 1.00 0.00 0.211 HD
+ ENDBRANCH 2 18
+ BRANCH 3 20
+ ATOM 20 O UNL 1 2.354 -1.419 -2.197 1.00 0.00 -0.387 OA
+ ATOM 21 H UNL 1 2.901 -2.009 -1.617 1.00 0.00 0.211 HD
+ ENDBRANCH 3 20
+ BRANCH 4 22
+ ATOM 22 C UNL 1 2.798 1.302 -0.496 1.00 0.00 0.201 C3
+ BRANCH 22 23
+ ATOM 23 O UNL 1 3.411 1.842 0.657 1.00 0.00 -0.348 O5
+ BRANCH 23 24
+ ATOM 24 P UNL 1 5.100 1.600 0.586 1.00 0.00 0.060 P1
+ ATOM 25 O UNL 1 5.699 2.493 -0.477 1.00 0.00 -0.326 O5
+ ATOM 26 O UNL 1 5.775 1.996 2.085 1.00 0.00 -0.790 O5
+ ATOM 27 O UNL 1 5.459 -0.015 0.231 1.00 0.00 -0.790 O5
+ ENDBRANCH 23 24
+ ENDBRANCH 22 23
+ ENDBRANCH 4 22
+ TORSDOF 7
+
+Receptor Preparation
+===================
+
+The preparation of a rigid receptor consists of two steps. The receptor structure is first sourced from a PDB file and sent to ``reduce2.py`` for hydrogen addition and optimization, and then, the conversion to a tangible receptor PDBQT file is done by ``mk_prepare_receptor.py``.
+
+In this example, we begin from retrieving the PDB structure by token ``3kgd`` from RCSB PDB.
+
+.. code-block:: bash
+
+ pdb_token="3kgd"
+ curl "http://files.rcsb.org/view/${pdb_token}.pdb" -o "${pdb_token}.pdb"
+
+Next, we will run a Python script to write ProDy selection ``chain A and not water and not hetero and not resname AMP`` to a PDB file ``3kgd_receptor_atoms.pdb``.
+
+.. code-block:: python
+
+ python - < "${pdb_token}_receptor.pdb"
+
+In this example, we use ``reduce2.py`` to add hydrogen atoms to the receptor structure and optimize the positions. There are various other tools (``H++``, ``APBS``, etc.) of choices for this task. It should also be noted that ``mk_prepare_receptor.py`` does not neccessarily need the presence of all hydrogens in the input receptor structure – The missing hydrogens will be added through RDKit functions during the receptor preparation with ProDy selection ``chain A and not water and not hetero and not resname AMP`` to a PDB file ``3kgd_receptor_atoms.pdb``.
+
+.. code-block:: bash
+
+ # setting up reduce2 for the first time in the environment
+ reduce2="$(python -c "import site; print(site.getsitepackages()[0])")/mmtbx/command_line/reduce2.py"
+ chmod +x $reduce2
+ geostd="$(realpath geostd)"
+ export MMTBX_CCP4_MONOMER_LIB=$geostd
+
+ # running reduce2 on the example receptor PDB
+ reduce_opts="approach=add add_flip_movers=True"
+ python $reduce2 "${pdb_token}_receptor.pdb" $reduce_opts
+
+After running the last command above, ``reduce2.py`` will conclude a normal execution with a log file ``3kgd_receptorH.txt`` and a protonated receptor structure file ``3kgd_receptorH.pdb`` – The PDB file can then be fed to ``mk_prepare_receptor.py`` to generate the receptor PDBQT file. But before that, we could (optionally) save the original position of residue AMP and use it to define the grid box for docking. To do this, we will use ProDy selection ``chain A and resname AMP`` to write a PDB file ``LIG.pdb``
+
+.. code-block:: python
+
+ python - < 2510 atoms and 1 coordinate set(s) were parsed in 0.01s.
+
+ Flexible residues:
+ chain resnum is_reactive reactive_atom
+ A 309 True NE2
+ reactive_flexres={'A:309'}
+
+ For reactive docking, pass the configuration file to AutoDock-GPU:
+ autodock_gpu -C 1 --import_dpf 3kgd_receptorH.reactive_config --flexres 3kgd_receptorH_flex.pdbqt -L
+
+
+ Files written:
+ 3kgd_receptorH_flex.pdbqt <-- flexible receptor input file
+ 3kgd_receptorH_rigid.pdbqt <-- static (i.e., rigid) receptor input file
+ boron-silicon-atom_par.dat <-- atomic parameters for B and Si (for autogrid)
+ 3kgd_receptorH_rigid.gpf <-- autogrid input file
+ 3kgd_receptorH.box.pdb <-- PDB file to visualize the grid box
+ 3kgd_receptorH.reactive_config <-- reactive parameters for AutoDock-GPU
+
+The expected ``3kgd_receptorH_flex.pdbqt`` contains the reactive flexible residue, His309. Note that for the receptor residues, the reactive atom types may include a number prefix as an identifier to distinguish among possible multiple reactive residues.
+
+.. code-block:: bash
+
+ BEGIN_RES HIS A 309
+ REMARK INDEX MAP 3 1 15 2 18 3 19 4 20 5 21 6 22 7 25 8
+ ROOT
+ ATOM 1 CA HIS A 309 -1.221 -40.602 -5.650 1.00 0.00 0.177 C
+ ENDROOT
+ BRANCH 1 2
+ ATOM 2 CB HIS A 309 -2.472 -39.882 -5.156 1.00 0.00 0.093 C
+ BRANCH 2 3
+ ATOM 3 CG HIS A 309 -3.505 -40.770 -4.538 1.00 0.00 0.061 1A3
+ ATOM 4 ND1 HIS A 309 -3.678 -42.083 -4.910 1.00 0.00 -0.242 1N6
+ ATOM 5 CD2 HIS A 309 -4.442 -40.512 -3.593 1.00 0.00 0.107 1A2
+ ATOM 6 CE1 HIS A 309 -4.660 -42.611 -4.192 1.00 0.00 0.196 1A2
+ ATOM 7 NE2 HIS A 309 -5.152 -41.670 -3.401 1.00 0.00 -0.350 1N1
+ ATOM 8 HE2 HIS A 309 -5.940 -41.788 -2.748 1.00 0.00 0.167 1H5
+ ENDBRANCH 2 3
+ ENDBRANCH 1 2
+ END_RES HIS A 309
+
+Docking Calculation
+===================
+
+The reactive docking method is only implemented in AutoDock-GPU, which also requires grid map computation with AutoGrid4 before the docking calculation.
+
+The previously generated GPF file (``3kgd_receptorH_rigid.gpf``), together with the PDBQT file of the rigid part of the receptor (``3kgd_receptorH_rigid.pdbqt``), will be used to compute the grid maps:
+
+.. code-block:: bash
+
+ ./autogrid4 -p 3kgd_receptorH_rigid.gpf
+
+And to run the docking calculation, the ligand PDBQT file (``AMP.pdbqt``), the flexible residue PDBQT file (``3kgd_receptorH_flex.pdbqt``), the special docking parameter file (DPF) for reactive docking (``3kgd_receptorH.reactive_config``), and the map files will be needed. With the following command for docking calculation, the output file will have basename ``AMP``.
+
+.. code-block:: bash
+
+ ./adgpu --lfile AMP.pdbqt --flexres 3kgd_receptorH_flex.pdbqt \
+ --ffile 3kgd_receptorH_rigid.maps.fld --import_dpf 3kgd_receptorH.reactive_config \
+ --resnam AMP
+
+.. _T4_executables:
+
+If you're running these calculations on Google T4 backends, here are the pre-compiled executables of autogrid4 and adgpu specifically for T4:
+
+- autodock-gpu v1.5.3
+`autodock_gpu_128wi `_
+`adgpu_analysis `_
+
+- autogrid v4.2.6
+`autogrid4 `_
+
+Export the Docking Poses
+========================
+
+``mk_export.py`` is a command-line script in Meeko to export docking poses from PDBQT or DLG formats. For this example, if we want to export the ligand docking poses to a (possibly multi-model) SDF file with fully explicit hydrogens:
+
+.. code-block:: bash
+
+ mk_export.py AMP.dlg -s 3kgd_AMP_adgpu_out.sdf
diff --git a/docs/source/tutorial3.rst b/docs/source/tutorial3.rst
new file mode 100644
index 00000000..ce4c602e
--- /dev/null
+++ b/docs/source/tutorial3.rst
@@ -0,0 +1,200 @@
+.. _tutorial3:
+
+=========================
+Tethered Docking
+=========================
+
+This is a tethered (two-point attached covalent) docking example that uses the AutoDock-GPU executable to reproduce a covalent complex of a small molecule and a protein receptor.
+
+Follow the instructions to set up the environment and run this command-line example on your own device (Linux, MacOS or WSL). To run this example in a Colab notebook, see :ref:`colab_examples`.
+
+.. contents::
+ :local:
+ :depth: 2
+
+Introduction
+============
+
+The covalent docking example is based on the two-point attractor and flexible sidechain method. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is attached onto the catalytic histidine residue of a hollow protein structure of bacteria RNA 3' cyclase (PDB token 3KGD) to reproduce the covalent intermediate complex structure. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
+
+This tutorial is intended to showcase the Meeko usage in the preparation of receptor and ligand for tethered docking.
+
+Prerequisites and Environment Setup
+===================================
+
+Follow the instructions in the previous tutorial for environment setup: :ref:`env_setup_like_colab`.
+
+Ligand Conformer Generation
+===========================
+
+In this step, the ligand molecule is prepared from a Smiles string. A protonated 3D conformer of the covalent residue-ligand conjugate will be generated by ``scrub.py``.
+
+For tethered docking, the ligand needs to a covalent conjugate of ligand and a residue that include the following components:
+
+- the part of ligand present in the covalent conjuage, flexible in docking.
+- the part of residue sidechain present in the covalent conjugate, flexible in docking.
+- the residue's C-alpha and C-beta, which work as the two-point attractors and their positions remain unchanged in docking. The other backbone atoms are not required.
+
+.. image:: images/annotated_HIE_AMP.png
+ :alt: highlighted AMP
+ :width: 60%
+ :align: center
+
+The ligand of this example will be the covalent conjugate ``HIE_AMP``, where AMP is attached to the Nε atom of a histidine residue via forms a phosphoamide bond. We will use its isomeric Smiles string as the input, and manually write the phosphoester group in the -1 charge state. ``scrub.py`` will generate an SDF file, ``HIE_AMP.sdf``, containing a 3D conformer of HIE-AMP (1-) with all explicit hydrogens.
+
+.. code-block:: bash
+
+ ligand_smiles="c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)COP(=O)([O-])N1C=C(CC)N=C1)O)O)N"
+ scrub.py $ligand_smiles -o HIE_AMP.sdf --skip_tautomer --skip_acidbase
+
+To prepare HIE-AMP (1-) as an covalent flexible residue, we will hold this SDF file for further mapping with the specific catalytic residue in receptor structure. In fact, the SDF file can be re-used for different Histidine residues in different receptor structures.
+
+Receptor Preparation
+===================
+
+The preparation of a rigid receptor consists of two steps. The receptor structure is first sourced from a PDB file and sent to ``reduce2.py`` for hydrogen addition and optimization, and then, the conversion to a tangible receptor PDBQT file is done by ``mk_prepare_receptor.py``.
+
+The first step (hydrogen addition and optimization) of this example will be the same as the reactive docking tutorial :ref:`tutorial2`. You may skip the steps and proceed to the next ``mk_prepare_receptor.py`` code block if you already have ``3kgd_receptorH.pdb`` or an equivalent (a protonated receptor PDB file). If not, we will begin from retrieving the PDB structure by token ``3kgd`` from RCSB PDB.
+
+.. code-block:: bash
+
+ pdb_token="3kgd"
+ curl "http://files.rcsb.org/view/${pdb_token}.pdb" -o "${pdb_token}.pdb"
+
+Next, we will run a Python script to write ProDy selection ``chain A and not water and not hetero and not resname AMP`` to a PDB file ``3kgd_receptor_atoms.pdb``.
+
+.. code-block:: python
+
+ python - < "${pdb_token}_receptor.pdb"
+
+In this example, we use ``reduce2.py`` to add hydrogen atoms to the receptor structure and optimize the positions. There are various other tools (``H++``, ``APBS``, etc.) of choices for this task. It should also be noted that ``mk_prepare_receptor.py`` does not neccessarily need the presence of all hydrogens in the input receptor structure – The missing hydrogens will be added through RDKit functions during the receptor preparation with ProDy selection ``chain A and not water and not hetero and not resname AMP`` to a PDB file ``3kgd_receptor_atoms.pdb``.
+
+.. code-block:: bash
+
+ # setting up reduce2 for the first time in the environment
+ reduce2="$(python -c "import site; print(site.getsitepackages()[0])")/mmtbx/command_line/reduce2.py"
+ chmod +x $reduce2
+ geostd="$(realpath geostd)"
+ export MMTBX_CCP4_MONOMER_LIB=$geostd
+
+ # running reduce2 on the example receptor PDB
+ reduce_opts="approach=add add_flip_movers=True"
+ python $reduce2 "${pdb_token}_receptor.pdb" $reduce_opts
+
+After running the last command above, ``reduce2.py`` will conclude a normal execution with a log file ``3kgd_receptorH.txt`` and a protonated receptor structure file ``3kgd_receptorH.pdb`` – The PDB file can then be fed to ``mk_prepare_receptor.py`` to generate the receptor PDBQT file. But before that, we could (optionally) save the original position of residue AMP and use it to define the grid box for docking. To do this, we will use ProDy selection ``chain A and resname AMP`` to write a PDB file ``LIG.pdb``
+
+.. code-block:: python
+
+ python - < PDB file is found in working directory (3kgd.pdb).
+ @> 11804 atoms and 1 coordinate set(s) were parsed in 0.14s.
+ @> 5062 atoms and 1 coordinate set(s) were parsed in 0.05s.
+
+ Flexible residues:
+ chain resnum is_reactive reactive_atom
+ A 309 False
+ reactive_flexres=set()
+
+ Files written:
+ 3kgd_receptorH_flex.pdbqt <-- flexible receptor input file
+ 3kgd_receptorH_rigid.pdbqt <-- static (i.e., rigid) receptor input file
+ boron-silicon-atom_par.dat <-- atomic parameters for B and Si (for autogrid)
+ 3kgd_receptorH_rigid.gpf <-- autogrid input file
+ 3kgd_receptorH.box.pdb <-- PDB file to visualize the grid box
+
+Covalent Ligand Preparation
+========================
+
+In this step, we will use mk_prepare_ligand.py to generate the PDBQT file for the covalent ligand. Along with the previously generated 3D conformer of the covalent ligand (``HIE_AMP.sdf``), which may be at an arbitrary position, here a reference protein PDB file (``3kgd_receptor.pdb``) will be used to source the positions of the attractor atoms, Cα and Cβ, to keep them unchanged in docking. The reference PDB file does not have to be the full receptor, but it must contain the target residue that matches exactly with ``rec_residue``. Additionally, a SMARTS pattern ``tether_smarts`` is required. Together with the 1-based ``tether_smarts_indices``, they are used to locate the attractor atoms that correspond to Cα and Cβ of a hisitidine (His) residue:
+
+.. code-block:: bash
+
+ rec_residue="A:HIS:309"
+ tether_smarts="n1cc(CC)nc1"
+ tether_smarts_indices="5 4"
+ mk_prepare_ligand.py -i HIE_AMP.sdf --receptor 3kgd_receptor.pdb --rec_residue $rec_residue \
+ --tether_smarts "${tether_smarts}" --tether_smarts_indices $tether_smarts_indices \
+ -o HIE_AMP.pdbqt
+
+Docking Calculation
+===================
+
+The tethered docking method needs an empty ligand file, which is allowed in AutoDock-GPU that requires grid map computation with AutoGrid4 before the docking calculation.
+
+The previously generated GPF file (``3kgd_receptorH_rigid.gpf``), together with the PDBQT file of the rigid part of the receptor (``3kgd_receptorH_rigid.pdbqt``), will be used to compute the grid maps:
+
+.. code-block:: bash
+
+ ./autogrid4 -p 3kgd_receptorH_rigid.gpf
+
+And to run the docking calculation, the ligand PDBQT file (``HIE_AMP.pdbqt``) and the map files will be needed. Please note that althogh the commands and the filenames look the same, the GPF from the reactive docking tutorial :ref:`tutorial2` has additional parameters for reactive docking, and therefore the maps cannot be reused in this example. Also, we will be passing ``HIE_AMP.pdbqt`` as a flexible residue file instead of a ligand file to AD-GPU. In this docking calculation, the ligand will be ``empty`` which is indeed an empty file. With the following command for docking calculation, the output file will have basename ``HIE_AMP``.
+
+.. code-block:: bash
+
+ touch empty
+ ./adgpu --lfile empty --flexres HIE_AMP.pdbqt \
+ --ffile 3kgd_receptorH_rigid.maps.fld \
+ --resnam HIE_AMP
+
+If you're running these calculations on Google T4 backends, here are the pre-compiled executables of autogrid4 and adgpu specifically for T4: :ref:`T4_executables`
+
+Export the Docking Poses
+========================
+
+``mk_export.py`` is a command-line script in Meeko to export docking poses from PDBQT or DLG formats. For this example, if we want to export the docking poses to a (possibly multi-model) SDF file with fully explicit hydrogens, we can use the ``-k`` option to keep the covalent ligand which is treated as a flexible residue:
+
+.. code-block:: bash
+
+ mk_export.py HIE_AMP.dlg -s 3kgd_HIE_AMP_adgpu_out.sdf -k
+
+It is also possible to export the docking poses to a multi-model PDB file with updated conformatons of His309 and the covalently bonded AMP. To do this, we need a receptor JSON file that could be generated with ``mk_prepare_receptor.py`` option ``-j`` during the receptor preparation.
+
+.. code-block:: bash
+
+ # rerun as needed to generate the receptor json file
+ flexres="A:309"
+ mk_prepare_receptor.py -i "${pdb_token}_receptorH.pdb" -o "${pdb_token}_receptorH" -j \
+ --default_altloc A -f $flexres \
+ --box_enveloping "LIG.pdb" --padding 8.0
+
+ mk_export.py HIE_AMP.dlg -s 3kgd_HIE_AMP_adgpu_out.sdf -k 3kgd_receptorH.json -p 3kgd_HIE_AMP_adgpu_out.pdb
\ No newline at end of file
diff --git a/docs/source/tutorials.rst b/docs/source/tutorials.rst
new file mode 100644
index 00000000..4e55a0db
--- /dev/null
+++ b/docs/source/tutorials.rst
@@ -0,0 +1,13 @@
+Tutorials
+=========
+
+These tutorials cover the use of meeko in integration with other packages,
+including vina, autodock-gpu, ringtail, and molscrub.
+
+.. toctree::
+ :maxdepth: 2
+ :caption: Tutorials
+
+ tutorial1
+ tutorial2
+ tutorial3