Merge pull request #170 from OpenFreeEnergy/cli_tutorial_proofreading

CLI tutorial proofreading
OpenFreeEnergy · Oct 18, 2024 · b73a698
2 parents 34fad3e + 50d3b52
Showing 1 changed file with 63 additions and 56 deletions.
diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md
@@ -1,63 +1,66 @@
 # Relative Free Energies with the OpenFE CLI
 
-This tutorial will show how to use the OpenFE command line interface to get
-free energies -- with no Python at all! This will work for simple setups, but you
-may need to use the Python interface for more complicated setups.
+This tutorial will show how to use the OpenFE CLI (Command Line Interface) to calculate
+free energies - with no Python at all! This CLI works for simple setups, but you
+may need to use the Python API for more complicated setups.
 
 The entire process of running the campaign of simulations is split into 3
 stages, each of which corresponds to a CLI command:
 
-1. Setting up the necessary files to describe each of the individual
-   simulations to run
+1. Setting up the files necessary to run each of the simulations
 2. Running the simulations
-3. Gathering the results of separate simulations into a single table
+3. Gathering the results of the simulations into a single table
 
-To work through this tutorial, start out with a fresh directory. You can download the tutorial materials (including this file) using the command:
+To work through this tutorial, start out with a fresh directory. You can download the tutorial materials (including these instructions) using the command:
 
 ```bash
 openfe fetch rbfe-tutorial
 ```
 
-Then when you run `ls`, you should see that your directory has this file,
-`cli_tutorial.md`, a notebook called `python_tutorial.ipynb`, and files with
-the molecules we'll use in this tutorial: `tyk2_ligands.sdf` and
-`tyk2_protein.pdb`.
+Then when you run `ls`, you should see that your directory has:
+
+- `cli_tutorial.md`: the file containing these instructions.
+- `python_tutorial.ipynb`: a notebook detailing how to do this analysis using the Python API, instead of the CLI shown here.
+- `tyk2_ligands.sdf` and `tyk2_protein.pdb`: files containing the molecules we'll use in this tutorial.
 
 ## Setting up the campaign
 
-The CLI makes setting up the simulation very easy -- it's just a single CLI
+The CLI makes setting up the simulation very easy - it's just a single CLI
 command. There are separate commands for relative binding free energy (RBFE)
 and relative hydration free energy setups (RHFE).
 
 For RBFE campaigns, the relevant command is `openfe plan-rbfe-network`. For
 RHFE, the command is `openfe plan-rhfe-network`. They work mostly the same,
 except that the RHFE planner does not take a protein. In this tutorial, we'll
-do an RBFE calculation. The only difference for RBFE is in the setup stage --
+do an RBFE calculation. The only difference for RHFE is in the setup stage -
 running the simulations and gathering the results are the same.
 
-To run the command, we do the following:
-  * Read all the ligands from the SDF by giving
-    the option `-M tyk2_ligands.sdf`. You can also use `-M` with a directory, and
-    it will load all molecules found in any SDF or MOL2 file in that directory.
-  * Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`.
-  * Instruct `openfe` to output files into a directory called `network_setup`
-    with the `-o network_setup` option.
+With the single command:
 
 ```bash
 openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup
 ```
 
+we do the following:
+
+- Read all the ligands from the SDF by giving
+    the option `-M tyk2_ligands.sdf`. You can also use `-M` with a directory, and
+    it will load all molecules found in any SDF or MOL2 file in that directory.
+- Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`.
+- Instruct `openfe` to output files into a directory called `network_setup`
+    with the `-o network_setup` option.
+
 Planning the campaign may take some time, as it tries to find the best
 network from all possible transformations. This will create a directory called
-`network_setup`, which is structured like this:
+`network_setup/`, which is structured like this:
 
 <!-- top lines from `tree network_setup` -->
 
 ```text
 network_setup
 ├── ligand_network.graphml
 ├── network_setup.json
-└── transformations
+└── transformations/
     ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json
     ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json
     ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_47_complex.json
@@ -79,16 +82,18 @@ This opens an interactive viewer. You can move the ligand names around to get a
 better view of the structure, and if you click on the edge, you will see the
 mapping for that edge.
 
-The files that describe each individual simulation we will run are located in the
-`transformations` subdirectory. Each JSON file represents a single alchemical
-leg to run, and contains all the necessary information to run that leg. A
-single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files):
-one for the ligand in solvent, and one for the ligand complexed with the
-protein. The results from these two simulations can then be combined to obtained a single $\Delta\Delta G$ relative binding free energy value. Filenames indicate ligand names as taken from the SDF; for example,
-the file `easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg
+The files that describe each individual simulation we will run are located within
+`network_setup/transformations/`. Each JSON file represents a single alchemical
+leg to run and contains all the necessary information to run that leg.
+Filenames indicate ligand names as taken from the SDF; for example, the file
+`easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg
 associated with the transformation of the ligand `lig_ejm_31` into `lig_ejm_42`
 while in complex with the protein.
 
+A single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files):
+one for the ligand in solvent, and one for the ligand complexed with the
+protein. The results from these two simulations can then be combined to obtain a single $\Delta\Delta G$ relative binding free energy value.
+
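The combination of the two legs described above can be written out explicitly. As a sketch (the subscripts and the $A \to B$ notation are introduced here for illustration and are not part of the tutorial), for the transformation of ligand $A$ into ligand $B$:

$$
\Delta\Delta G_{\mathrm{bind}}(A \to B) = \Delta G_{\mathrm{complex}}(A \to B) - \Delta G_{\mathrm{solvent}}(A \to B)
$$

Here $\Delta G_{\mathrm{complex}}$ is the free energy change from the complex leg and $\Delta G_{\mathrm{solvent}}$ is the one from the solvent leg. Under this sign convention, a negative $\Delta\Delta G_{\mathrm{bind}}$ means that $B$ binds more favorably than $A$.
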
 Note that this specific setup makes a number of choices for you. All of
 these choices can be customized in the Python API. Here are the specifics on
 how these simulations are set up:
@@ -103,13 +108,15 @@ how these simulation are set up:
 
 ## Customize your campaign setup
 
-OpenFE contains many different options and methods for setting up a simulation campaign. 
-The options can be easily accessed and modified by providing a settings 
+OpenFE contains many different options and methods for setting up a simulation campaign.
+The options can be easily accessed and modified by providing a settings
 file in the `.yaml` format.
-Let's assume you want to exchange the LOMAP atom mapper with the Kartograf 
+Let's assume you want to exchange the LOMAP atom mapper with the Kartograf
 atom mapper and the Minimal Spanning Tree
 Network Planner with the Maximal Network Planner, then you could do the following:
+
 1. Provide a file like `settings.yaml` with the desired changes:
+
 ```yaml
 mapper:
   method: kartograf
@@ -119,6 +126,7 @@ network:
 ```
 
 2. Plan your RBFE network with an additional `-s` flag for passing the settings:
+
 ```bash
 openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -s settings.yaml
 ```
@@ -142,9 +150,9 @@ Using Options:
         Networker: functools.partial(<function generate_maximal_network at 0x7fea18371260>)
 ```
 
-That concludes the straightforward process of tailoring your OpenFE setup to your specifications. 
-Additionally, we've provided a snippet for generating YAML files with 
-various of the current options for your convenience. 
+That concludes the straightforward process of tailoring your OpenFE setup to your specifications.
+Additionally, we've provided a snippet for generating YAML files with
+several of the current options for your convenience.
 
 Option Examples:
 
@@ -160,7 +168,6 @@ network:
   # method: generate_minimal_redundant_network
 ```
 
-
 **Customize away!**
 
 ## Running the simulations
@@ -279,31 +286,31 @@ gather` command from within the working directory used above:
 openfe gather results/ --report dg -o final_results.tsv
 ```
 
-This will write out a tab-separated table of results where the results 
+This will write out a tab-separated table of results where the results
 reported are controlled by the `--report` option:
 
-  * `dg` (default) reports the ligand and the results are the maximum
-    likelihood estimate of its absolute free, and the associated 
-    uncertainty from DDG replica averages and standard deviations.
-  * `ddg` reports pairs of `ligand_i` and `ligand_j`, the calculated
-    relative free energy `DDG(i->j) = DG(j) - DG(i)` and its uncertainty.
-  * `raw` reports the raw results, giving the leg (`vacuum`, `solvent`, or
-    `complex`), `ligand_i`, `ligand_j`, the raw `DG(i->j)` associated with it.
-
+- `dg` (default) reports each ligand together with the maximum
+ likelihood estimate of its absolute free energy and the associated
+ uncertainty from DDG replica averages and standard deviations.
+- `ddg` reports pairs of `ligand_i` and `ligand_j`, the calculated
+ relative free energy `DDG(i->j) = DG(j) - DG(i)` and its uncertainty.
+- `raw` reports the raw results, giving the leg (`vacuum`, `solvent`, or
+ `complex`), `ligand_i`, `ligand_j`, and the raw `DG(i->j)` associated with it.
 
-The resulting file looks something like this:
+The resulting file (`final_results.tsv`) will look something like this:
 
 <!-- take top lines from `cat final_results.tsv` -->
 
 ```text
-lig_ejm_31	-0.21	0.06
-lig_ejm_42	0.63	0.08
-lig_ejm_46	-0.80	0.07
-lig_ejm_47	-0.1	0.2
-lig_ejm_48	0.6	0.3
-lig_ejm_50	1.0	0.1
-lig_ejm_43	1.9	0.1
-lig_jmc_23	-0.94	0.09
-lig_jmc_27	-0.91	0.09
-lig_jmc_28	-1.2	0.1
+ligand	DG(MLE) (kcal/mol)	uncertainty (kcal/mol)
+lig_ejm_31	-0.09	0.05
+lig_ejm_42	0.7	0.1
+lig_ejm_46	-0.98	0.05
+lig_ejm_47	-0.1	0.1
+lig_ejm_48	0.53	0.09
+lig_ejm_50	0.91	0.06
+lig_ejm_43	2.0	0.2
+lig_jmc_23	-0.68	0.09
+lig_jmc_27	-1.1	0.1
+lig_jmc_28	-1.25	0.08
 ```
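
The `dg` report shown above is a plain tab-separated file, so it can be read with standard tools. The following is a minimal Python sketch, not part of the tutorial or of the OpenFE API: the helper names `read_dg_table` and `relative_dg` are hypothetical, and the sample rows are copied from the example output above so that the block runs without a simulation. It ranks ligands by their estimated free energy and applies the relation `DDG(i->j) = DG(j) - DG(i)` quoted in the `--report` description. A difference computed this way from the `dg` estimates need not match the per-edge values that `--report ddg` writes.

```python
import csv
import io

# Sample rows copied from the example output above. In practice, replace
# io.StringIO(SAMPLE_TSV) with open("final_results.tsv", newline="").
SAMPLE_TSV = (
    "ligand\tDG(MLE) (kcal/mol)\tuncertainty (kcal/mol)\n"
    "lig_ejm_31\t-0.09\t0.05\n"
    "lig_ejm_42\t0.7\t0.1\n"
    "lig_ejm_46\t-0.98\t0.05\n"
)


def read_dg_table(handle):
    """Return {ligand: (dg, uncertainty)} from a tab-separated `dg` report."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    # Ignore blank or malformed rows rather than failing on them.
    return {
        row[0]: (float(row[1]), float(row[2]))
        for row in reader
        if len(row) == 3
    }


def relative_dg(table, ligand_i, ligand_j):
    """Apply DDG(i->j) = DG(j) - DG(i) to two entries of the table."""
    return table[ligand_j][0] - table[ligand_i][0]


table = read_dg_table(io.StringIO(SAMPLE_TSV))

# Sort ligand names from most negative to most positive DG.
ranked = sorted(table, key=lambda name: table[name][0])
ddg_31_to_42 = relative_dg(table, "lig_ejm_31", "lig_ejm_42")

print(ranked)
print(f"{ddg_31_to_42:.2f} kcal/mol")
```

Using the standard-library `csv` module keeps the sketch dependency-free. The header row is skipped by position, which assumes the three-column layout shown in the example output.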