mguarin0/rl_graph_generation

Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

This repository is the official TensorFlow implementation of "Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation".

Jiaxuan You*, Bowen Liu*, Rex Ying, Vijay Pande, Jure Leskovec, Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Installation

  • Install rdkit. Using Anaconda is recommended; refer to the official website for further details:
conda create -c rdkit -n my-rdkit-env rdkit
  • Install mpi4py, networkx:
conda install -y mpi4py
pip install networkx==1.11
pip install matplotlib==3.5.3
  • Install OpenAI baseline dependencies:
pip install -e rl-baselines
  • Install customized molecule gym environment:
pip install -e gym-molecule
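
A quick way to confirm that the steps above worked is to check that the interpreter can locate each package. The sketch below is illustrative only: the module names baselines and gym_molecule are assumed from the folder layout described in this README, and check_modules is a hypothetical helper that is not part of the repository.

```python
import importlib.util

# Top-level module names to look for. "baselines" and "gym_molecule" are
# assumptions based on the repository's folder layout.
REQUIRED = ["rdkit", "mpi4py", "networkx", "matplotlib", "baselines", "gym_molecule"]


def check_modules(names):
    """Return {module name: True if the interpreter can locate it}."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED).items():
        print(f"{name:15s} {'ok' if found else 'MISSING'}")
```

Run it inside the activated conda environment; any package reported as MISSING points at the install step to repeat.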

Alternative installation instructions

For greater reproducibility, a Dockerfile is provided for building an environment in which this code runs. To use this container environment, download and install Docker. To reproduce the build and run steps more easily, install Make and run the recipes provided in the Makefile.

  • To build the Docker container:
make build
  • To run this Docker container with or without a GPU while mounting the project into the running container:
make run_dev_{gpu}
  • Enter the running container:
docker exec -it <container id> bash
  • Activate conda environment and install project dependencies:
conda activate my-rdkit-env
bash install.sh

Code description

There are 4 important files:

  • run_molecule.py is the main entry point for running the program. The hyperparameters can be tuned there.
  • The molecule environment code is in gym-molecule/gym_molecule/envs/molecule.py.
  • RL-related code is in the rl-baselines/baselines/ppo1 folder: gcn_policy.py is the GCN policy network; pposgd_simple_gcn.py is the PPO algorithm specifically tuned for the GCN policy.

Run

  • single process run
python run_molecule.py
  • multiple processes run
mpirun -np 8 python run_molecule.py 2>/dev/null

2>/dev/null discards everything written to standard error, which hides the warnings printed by the rdkit package.
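
The same redirection can be applied when launching a run from Python. The sketch below is a generic illustration using only the standard library; run_quietly is a hypothetical helper, and the demo command is a stand-in child process rather than run_molecule.py.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd, discarding stderr (like 2>/dev/null) and returning stdout."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # everything written to stderr is dropped
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in child process that writes one line to each stream.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('kept'); print('dropped', file=sys.stderr)",
    ]
    print(run_quietly(demo), end="")
```

Dropping stderr also hides genuine error messages and tracebacks, so leave it visible while debugging.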

We highly recommend using TensorBoard to monitor the training process. To do this, run:

tensorboard --logdir runs

All molecules generated during training are stored in the molecule_gen folder; each run configuration is stored in a separate csv file. Molecules are stored as SMILES strings, along with their scores for the desired properties.
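
These files can be inspected after a run with a few lines of Python. The sketch below is a minimal example under a stated assumption: each row is taken to be a SMILES string followed by numeric score columns, with no header row. The actual column layout is not documented here, so check a generated file and adjust. The helpers load_molecules and top_k are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def load_molecules(handle):
    """Parse rows of the assumed form: SMILES, score_1, score_2, ..."""
    molecules = []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        smiles, *raw_scores = row
        molecules.append((smiles.strip(), [float(s) for s in raw_scores]))
    return molecules


def top_k(molecules, k, score_index=0):
    """Return the k rows with the highest value in the chosen score column."""
    return sorted(molecules, key=lambda m: m[1][score_index], reverse=True)[:k]


if __name__ == "__main__":
    # Made-up rows for illustration only; not output from a real run.
    sample = io.StringIO("CCO,0.5,1.2\nc1ccccc1,0.8,0.3\nCC(=O)O,0.1,2.0\n")
    mols = load_molecules(sample)
    print(top_k(mols, 2))
```

To read a real file, replace the io.StringIO sample with an open file handle, for example open(path, newline="") for a csv file inside molecule_gen.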
