diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index 4e87e81..4bfdff3 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -1,55 +1,58 @@ # Relative Free Energies with the OpenFE CLI -This tutorial will show how to use the OpenFE command line interface to get -free energies -- with no Python at all! This will work for simple setups, but you -may need to use the Python interface for more complicated setups. +This tutorial will show how to use the OpenFE CLI (Command Line Interface) to calculate +free energies - with no Python at all! This CLI works for simple setups, but you +may need to use the Python API for more complicated setups. The entire process of running the campaign of simulations is split into 3 stages, each of which corresponds to a CLI command: -1. Setting up the necessary files to describe each of the individual - simulations to run +1. Setting up the files necessary to run each of the simulations 2. Running the simulations -3. Gathering the results of separate simulations into a single table +3. Gathering the results of the simulations into a single table -To work through this tutorial, start out with a fresh directory. You can download the tutorial materials (including this file) using the command: +To work through this tutorial, start out with a fresh directory. You can download the tutorial materials (including these instructions) using the command: ```bash openfe fetch rbfe-tutorial ``` -Then when you run `ls`, you should see that your directory has this file, -`cli_tutorial.md`, a notebook called `python_tutorial.ipynb`, and files with -the molecules we'll use in this tutorial: `tyk2_ligands.sdf` and -`tyk2_protein.pdb`. +Then when you run `ls`, you should see that your directory has: + +- `cli_tutorial.md`: the file containing these instructions +- `python_tutorial.ipynb`: a notebook detailing how to do this analysis using the Python API, instead of the CLI shown here. 
+- `tyk2_ligands.sdf` and `tyk2_protein.pdb` : files containing the molecules we'll use in this tutorial. ## Setting up the campaign -The CLI makes setting up the simulation very easy -- it's just a single CLI +The CLI makes setting up the simulation very easy - it's just a single CLI command. There are separate commands for relative binding free energy (RBFE) and relative hydration free energy setups (RHFE). For RBFE campaigns, the relevant command is `openfe plan-rbfe-network`. For RHFE, the command is `openfe plan-rhfe-network`. They work mostly the same, except that the RHFE planner does not take a protein. In this tutorial, we'll -do an RBFE calculation. The only difference for RBFE is in the setup stage -- +do an RBFE calculation. The only difference for RHFE is in the setup stage - running the simulations and gathering the results are the same. -To run the command, we do the following: - * Read all the ligands from the SDF by giving - the option `-M tyk2_ligands.sdf`. You can also use `-M` with a directory, and - it will load all molecules found in any SDF or MOL2 file in that directory. - * Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`. - * Instruct `openfe` to output files into a directory called `network_setup` - with the `-o network_setup` option. +With the single command: ```bash openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup ``` +we do the following: + +- Read all the ligands from the SDF by giving + the option `-M tyk2_ligands.sdf`. You can also use `-M` with a directory, and + it will load all molecules found in any SDF or MOL2 file in that directory. +- Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`. +- Instruct `openfe` to output files into a directory called `network_setup` + with the `-o network_setup` option. + Planning the campaign may take some time, as it tries to find the best network from all possible transformations. 
This will create a directory called -`network_setup`, which is structured like this: +`network_setup/`, which is structured like this: @@ -57,7 +60,7 @@ network from all possible transformations. This will create a directory called network_setup ├── ligand_network.graphml ├── network_setup.json -└── transformations +└── transformations/ ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json ├── easy_rbfe_lig_ejm_31_complex_lig_ejm_47_complex.json @@ -79,16 +82,18 @@ This opens an interactive viewer. You can move the ligand names around to get a better view of the structure, and if you click on the edge, you will see the mapping for that edge. -The files that describe each individual simulation we will run are located in the -`transformations` subdirectory. Each JSON file represents a single alchemical -leg to run, and contains all the necessary information to run that leg. A -single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files): -one for the ligand in solvent, and one for the ligand complexed with the -protein. The results from these two simulations can then be combined to obtained a single $\Delta\Delta G$ relative binding free energy value. Filenames indicate ligand names as taken from the SDF; for example, -the file `easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg +The files that describe each individual simulation we will run are located within +`network_setup/transformations/`. Each JSON file represents a single alchemical +leg to run and contains all the necessary information to run that leg. +Filenames indicate ligand names as taken from the SDF; for example, the file +`easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg associated with the tranformation of the ligand `lig_ejm_31` into `lig_ejm_42` while in complex with the protein. 
+A single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files): +one for the ligand in solvent, and one for the ligand complexed with the +protein. The results from these two simulations can then be combined to obtained a single $\Delta\Delta G$ relative binding free energy value. + Note that this specific setup makes a number of choices for you. All of these choices can be customized in the Python API. Here are the specifics on how these simulation are set up: @@ -103,13 +108,15 @@ how these simulation are set up: ## Customize your campaign setup -OpenFE contains many different options and methods for setting up a simulation campaign. -The options can be easily accessed and modified by providing a settings +OpenFE contains many different options and methods for setting up a simulation campaign. +The options can be easily accessed and modified by providing a settings file in the `.yaml` format. -Let's assume you want to exchange the LOMAP atom mapper with the Kartograf +Let's assume you want to exchange the LOMAP atom mapper with the Kartograf atom mapper and the Minimal Spanning Tree Network Planner with the Maximal Network Planner, then you could do the following: + 1. provide a file like `settings.yaml` with the desired changes: + ```yaml mapper: method: kartograf @@ -119,6 +126,7 @@ network: ``` 2. Plan your rbfe network with an additional `-s` flag for passing the settings: + ```bash openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -s settings.yaml ``` @@ -142,9 +150,9 @@ Using Options: Networker: functools.partial() ``` -That concludes the straightforward process of tailoring your OpenFE setup to your specifications. -Additionally, we've provided a snippet for generating YAML files with -various of the current options for your convenience. +That concludes the straightforward process of tailoring your OpenFE setup to your specifications. 
+Additionally, we've provided a snippet for generating YAML files with +various of the current options for your convenience. Option Examples: @@ -160,7 +168,6 @@ network: # method: generate_minimal_redundant_network ``` - **Customize away!** ## Running the simulations @@ -279,31 +286,31 @@ gather` command from within the working directory used above: openfe gather results/ --report dg -o final_results.tsv ``` -This will write out a tab-separated table of results where the results +This will write out a tab-separated table of results where the results reported are controlled by the `--report` option: - * `dg` (default) reports the ligand and the results are the maximum - likelihood estimate of its absolute free, and the associated - uncertainty from DDG replica averages and standard deviations. - * `ddg` reports pairs of `ligand_i` and `ligand_j`, the calculated - relative free energy `DDG(i->j) = DG(j) - DG(i)` and its uncertainty. - * `raw` reports the raw results, giving the leg (`vacuum`, `solvent`, or - `complex`), `ligand_i`, `ligand_j`, the raw `DG(i->j)` associated with it. - +- `dg` (default) reports the ligand and the results are the maximum + likelihood estimate of its absolute free, and the associated + uncertainty from DDG replica averages and standard deviations. +- `ddg` reports pairs of `ligand_i` and `ligand_j`, the calculated + relative free energy `DDG(i->j) = DG(j) - DG(i)` and its uncertainty. +- `raw` reports the raw results, giving the leg (`vacuum`, `solvent`, or + `complex`), `ligand_i`, `ligand_j`, the raw `DG(i->j)` associated with it. 
-The resulting file looks something like this: +The resulting file (`final_results.tsv`) will look something like this: ```text -lig_ejm_31 -0.21 0.06 -lig_ejm_42 0.63 0.08 -lig_ejm_46 -0.80 0.07 -lig_ejm_47 -0.1 0.2 -lig_ejm_48 0.6 0.3 -lig_ejm_50 1.0 0.1 -lig_ejm_43 1.9 0.1 -lig_jmc_23 -0.94 0.09 -lig_jmc_27 -0.91 0.09 -lig_jmc_28 -1.2 0.1 +ligand DG(MLE) (kcal/mol) uncertainty (kcal/mol) +lig_ejm_31 -0.09 0.05 +lig_ejm_42 0.7 0.1 +lig_ejm_46 -0.98 0.05 +lig_ejm_47 -0.1 0.1 +lig_ejm_48 0.53 0.09 +lig_ejm_50 0.91 0.06 +lig_ejm_43 2.0 0.2 +lig_jmc_23 -0.68 0.09 +lig_jmc_27 -1.1 0.1 +lig_jmc_28 -1.25 0.08 ```
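
Once `final_results.tsv` exists, a common next step is to rank the ligands by their estimated free energy. The following is a minimal sketch and not part of the OpenFE CLI: `rank_ligands` is a hypothetical helper name, and it assumes the file is tab-separated with a single header row and the `DG(MLE)` value in the second column, as in the example table above.

```bash
# Hypothetical helper (not an openfe command): rank ligands by DG(MLE).
# Assumes a tab-separated file with one header row and DG in column 2.
rank_ligands() {
    # Drop the header row, then sort numerically on the second column.
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    tail -n +2 "$1" | LC_ALL=C sort -t "$(printf '\t')" -k2,2n
}
```

Calling `rank_ligands final_results.tsv` prints the rows from most negative to most positive `DG(MLE)`. If your file uses a different column layout, adjust the `-k2,2n` sort key accordingly.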