Commit

Merge pull request #9 from MolecularAI/3.1.0_docs
Update docs for 3.1.0
lewismervin1 authored Jul 2, 2024
2 parents 330f907 + c059a9e commit 4298d58
Showing 84 changed files with 135 additions and 11,591 deletions.
Binary file modified docs/sphinx-builddir/doctrees/README.doctree
Binary file modified docs/sphinx-builddir/doctrees/algorithms.doctree
Binary file modified docs/sphinx-builddir/doctrees/deduplicator.doctree
Binary file modified docs/sphinx-builddir/doctrees/descriptors.doctree
Binary file modified docs/sphinx-builddir/doctrees/environment.pickle
Binary file modified docs/sphinx-builddir/doctrees/index.doctree
Binary file modified docs/sphinx-builddir/doctrees/modules.doctree
@@ -78,7 +78,7 @@
"source": [
"To use QSARtuna from Jupyter Notebook, install it with:\n",
"```\n",
"python -m pip install http://pages.scp.astrazeneca.net/mai/qsartuna/releases/QSARtuna_latest.tar.gz\n",
"python -m pip install https://github.com/MolecularAI/QSARtuna/releases/download/3.1.0/qsartuna-3.1.0.tar.gz\n",
"```"
]
},
Binary file modified docs/sphinx-builddir/doctrees/notebooks/preprocess_data.doctree
Binary file modified docs/sphinx-builddir/doctrees/optunaz.config.doctree
Binary file modified docs/sphinx-builddir/doctrees/optunaz.doctree
Binary file modified docs/sphinx-builddir/doctrees/optunaz.utils.doctree
Binary file modified docs/sphinx-builddir/doctrees/optunaz.utils.enums.doctree
Binary file modified docs/sphinx-builddir/doctrees/splitters.doctree
Binary file modified docs/sphinx-builddir/doctrees/transform.doctree
2 changes: 1 addition & 1 deletion docs/sphinx-builddir/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 9b1af61dfe8db8b73ea495bbe6736b6a
config: 74e083516866086104039ea37495342c
tags: 645f666f9bcd5a90fca523b33c5a78b7
86 changes: 57 additions & 29 deletions docs/sphinx-builddir/html/README.html

Large diffs are not rendered by default.

80 changes: 57 additions & 23 deletions docs/sphinx-builddir/html/_sources/README.md.txt
@@ -1,15 +1,8 @@
# QSARtuna: QSAR using Optimization for Hyperparameter Tuning (formerly Optuna AZ and known publically as QSARtuna)
# QSARtuna 𓆛: QSAR using Optimization for Hyperparameter Tuning (formerly Optuna AZ and QPTUNA)

Build predictive models for CompChem with hyperparameters optimized by [Optuna](https://optuna.org/).

Internal AZ links:
[Docs](https://pages.scp.astrazeneca.net/mai/qsartuna),
[Code](https://github.com/AZU-RDIT/optuna_az/),
[Issues](https://jira.astrazeneca.com/projects/OPTUNA).

External (QSARtuna) links:
[Public](https://github.com/MolecularAI/QSARtuna/blob/master/README.md?_sm_nck=1),
[Public docs](https://molecularai.github.io/QSARtuna/).
Developed with Uncertainty Quantification and model explainability in mind.

## Background

@@ -21,6 +14,12 @@ for the given data.
The search itself
is done using [Optuna](https://optuna.org/).

Developed models employ
the latest state-of-the-art
uncertainty estimation and
explainability Python packages.

Further documentation in the GitHub pages [here](https://molecularai.github.io/QSARtuna/).

### The three-step process

@@ -41,7 +40,7 @@ QSARtuna is structured around three steps:
but it has the big benefit that this final model is trained on all available data.


## JSON-based Command-line interface on AZ SCP
## JSON-based Command-line interface

Let's look at a trivial example of modelling molecular weight
using a training set of 50 molecules.
@@ -55,8 +54,7 @@ It contains four main sections:
* **descriptors** - which molecular descriptors to use.
* **algorithms** - which ML algorithms to use.
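
As a rough illustration of the four-section layout, the Python sketch below checks that a parsed configuration contains each section before it is handed to the optimizer. The top-level key names (`data`, `settings`, `descriptors`, `algorithms`) are assumed from the section names in the list above, and `missing_sections` and `check_config_file` are hypothetical helpers, not part of QSARtuna.

```python
import json

# Assumed top-level keys, taken from the section names listed above.
REQUIRED_SECTIONS = ("data", "settings", "descriptors", "algorithms")


def missing_sections(config: dict) -> list:
    """Return the required sections that are absent from a parsed config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


def check_config_file(path: str) -> list:
    """Load a JSON config from disk and report which sections are missing."""
    with open(path, encoding="utf-8") as handle:
        return missing_sections(json.load(handle))


if __name__ == "__main__":
    # Hypothetical, incomplete config used only to exercise the helper.
    example = {"data": {}, "settings": {}}
    print(missing_sections(example))  # ['descriptors', 'algorithms']
```

A check like this only confirms that the sections exist; it says nothing about whether their contents are valid for QSARtuna.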

Below is the example of such file
(it is also available in the [source code repository](https://github.com/AZU-RDIT/optuna_az/blob/master/examples/optimization/regression.json)):
Below is the example of such a file

```json
{
@@ -120,8 +118,8 @@ Below is the example of such file
```

Data section specifies location of the dataset file.
In this example it specifies a relative path to the `tests/data` folder
in the [source code repository](https://github.com/AZU-RDIT/optuna_az/tree/master/tests/data/DRD2/subset-50).
In this example it specifies a relative path to the `tests/data` folder.


Settings section specifies that:
* we are building a regression model,
@@ -135,9 +133,9 @@ and optimization is free to pair any specified descriptor with any of the algori

When we have our data and our configuration, it is time to start the optimization.

### Running on SCP
### Running via Singularity

QSARtuna is deployed on SCP using [Singularity](https://sylabs.io/guides/3.7/user-guide/index.html) container.
QSARtuna can be deployed using a [Singularity](https://sylabs.io/guides/3.7/user-guide/index.html) container.

To run commands inside the container, Singularity uses the following syntax:
```shell
@@ -177,12 +175,12 @@ We can submit our script to the queue by giving `sbatch` the following script:
# The example we use is in examples/optimization/regression_drd2_50.json.

# The example we chose uses relative paths to data files, change directory.
cd /projects/cc/mai/OptunaAZ-versions/OptunaAZ_latest
cd /{project_folder}/OptunaAZ-versions/OptunaAZ_latest

singularity exec \
/projects/cc/mai/containers/QSARtuna_latest.sif \
/{project_folder}/containers/QSARtuna_latest.sif \
/opt/qsartuna/.venv/bin/qsartuna-optimize \
--config examples/optimization/regression_drd2_50.json \
--config {project_folder}/examples/optimization/regression_drd2_50.json \
--best-buildconfig-outpath ~/qsartuna-target/best.json \
--best-model-outpath ~/qsartuna-target/best.pkl \
--merged-model-outpath ~/qsartuna-target/merged.pkl
@@ -195,7 +193,7 @@ When the script is complete, it will create pickled model files inside your home

When the model is built, run inference:
```shell
singularity exec /projects/cc/mai/containers/QSARtuna_latest.sif \
singularity exec /{project_folder}/containers/QSARtuna_latest.sif \
/opt/qsartuna/.venv/bin/qsartuna-predict \
--model-file target/merged.pkl \
--input-smiles-csv-file tests/data/DRD2/subset-50/test.csv \
@@ -211,7 +209,7 @@ This can be specified by modifying the above command and supplying

E.g:
```shell
singularity exec /projects/cc/mai/containers/QSARtuna_2.5.1.sif \
singularity exec /{project_folder}/containers/QSARtuna_2.5.1.sif \
/opt/qsartuna/.venv/bin/qsartuna-predict \
--model-file 2.5.1_model.pkl \
--input-smiles-csv-file tests/data/DRD2/subset-50/test.csv \
@@ -221,6 +219,42 @@ singularity exec /projects/cc/mai/containers/QSARtuna_2.5.1.sif \

would generate predictions for a model trained with QSARtuna 2.5.1.

### Optional: inspect
To inspect performance of different models tried during optimization,
use [MLFlow Tracking UI](https://www.mlflow.org/docs/latest/tracking.html):
```bash
module load mlflow
mlflow ui
```

Then open the mlflow link in your browser.

![mlflow select experiment](docs/images/mlflow-select-experiment.png)

If you run `mlflow ui` on SCP,
you can forward your mlflow port
with a separate SSH session started on your local ("non-SCP") machine:
```bash
ssh -N -L localhost:5000:localhost:5000 [email protected]
```
("-L" forwards ports, and "-N" tells SSH not to execute any remote commands).
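
Before opening the browser, it can help to confirm that the forwarded port actually answers. The Python sketch below is one way to do that using only the standard library; `port_is_open` is a hypothetical helper name, and port 5000 is taken from the SSH command above.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Port 5000 is reachable; open http://localhost:5000 in a browser.")
    else:
        print("Nothing is listening on localhost:5000; check the SSH tunnel.")
```

A successful connection only shows that something is listening on the port, not that it is the MLFlow UI.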

In the MLFlow Tracking UI, select the experiment on the left;
it is named after the input file path.
Then select all runs/trials in the experiment and choose "Compare".
You will get a comparison page for the selected runs/trials.

![mlflow inspecting trials](docs/images/mlflow-inspecting-trials.png)

The comparison page shows MLFlow Runs (called Trials in Optuna),
as well as their Parameters and Metrics.
At the bottom there are plots.
For the X-axis, select "trial_number".
For the Y-axis, start with "optimization_objective_cvmean_r2".

You can get more details by clicking individual runs.
There you can access run/trial build (training) configuration.
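
The same comparison can be sketched outside the UI. The Python example below ranks trials by the `optimization_objective_cvmean_r2` metric named above, assuming that higher values are better. `rank_trials` and `fetch_trials` are hypothetical helper names, and `fetch_trials` assumes an MLflow version whose `mlflow.search_runs` accepts `experiment_names` and `output_format="list"`; treat it as a starting point rather than the way QSARtuna itself reads results.

```python
def rank_trials(trials, metric="optimization_objective_cvmean_r2", top=5):
    """Sort (run_id, metrics) pairs by one metric, highest value first.

    Trials that did not log the metric are skipped.
    """
    scored = [(run_id, metrics[metric]) for run_id, metrics in trials if metric in metrics]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


def fetch_trials(experiment_name):
    """Read all runs of one experiment from the current MLflow tracking store."""
    import mlflow  # Imported lazily so rank_trials works without MLflow installed.

    runs = mlflow.search_runs(experiment_names=[experiment_name], output_format="list")
    return [(run.info.run_id, run.data.metrics) for run in runs]


if __name__ == "__main__":
    # Hypothetical metric values standing in for fetch_trials("<experiment name>").
    demo = [
        ("trial-a", {"optimization_objective_cvmean_r2": 0.41}),
        ("trial-b", {"optimization_objective_cvmean_r2": 0.63}),
        ("trial-c", {}),  # A trial that logged no objective value.
    ]
    for run_id, score in rank_trials(demo):
        print(run_id, score)
```

Keeping the ranking separate from the MLflow call means the sorting logic can be tried on plain dictionaries first.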


## Run from Python/Jupyter Notebook

@@ -232,7 +266,7 @@ conda create --name my_env_with_qsartuna python=3.10.10 jupyter pip
conda activate my_env_with_qsartuna
module purge # Just in case.
which python # Check. Should output path that contains "my_env_with_qsartuna".
python -m pip install http://pages.scp.astrazeneca.net/mai/qsartuna/releases/QSARtuna_latest.tar.gz
python -m pip install https://github.com/MolecularAI/QSARtuna/files/14742594/qsartuna-3.0.0.1.tar.gz
```

Then you can use QSARtuna inside your Notebook:
@@ -299,4 +333,4 @@ build_best(buildconfig, "target/best.pkl")
##
# Build (Train) and save the model on the merged train+test data.
build_merged(buildconfig, "target/merged.pkl")
```