Commit e735431

Merge branch 'main' into tidy/spring-clean
beardyFace committed Dec 3, 2024
2 parents 624119c + 6c2fb64 commit e735431
Showing 9 changed files with 258 additions and 134 deletions.
5 changes: 5 additions & 0 deletions .mypy.ini
@@ -0,0 +1,5 @@
# Global options:

[mypy]
exclude = build
disable_error_code = import-untyped
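
With this configuration, running mypy from the repository root skips the `build` directory and suppresses `import-untyped` errors:

```sh
mypy .
```
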
101 changes: 71 additions & 30 deletions README.md
@@ -15,9 +15,29 @@ Run `pip3 install -r requirements.txt` in the **root directory** of the package
Install the environments dependent on pyboy here https://github.com/UoA-CARES/pyboy_environment

# Usage
This package is a basic example of running the CARES RL algorithms on OpenAI/DMCS.
This package is a basic example of running the CARES RL algorithms on OpenAI/DMCS/pyboy.

`train.py` takes in hyperparameters that allow you to customise the training gym enviromment – see options below - or RL algorithm. Use `python3 train.py -h` for help on what parameters are available for customisation.
## Running Training and Evaluation
The package is called using `run.py`, which takes the specific commands listed below for training and evaluation purposes.

Use `python3 run.py -h` for help on what parameters are available for customisation.

### Train
The `train` command in the `run.py` script initiates the training process for reinforcement learning models within specified gym environments. It can be customised with various hyperparameters to tailor the training environment and the RL algorithm. Run `python run.py train cli -h` to view all available options and start a run directly from the terminal. This flexibility lets users experiment with different settings and optimise their models effectively.

Specific and larger configuration changes can be loaded using `python run.py train config --data_path <PATH_TO_TRAINING_CONFIGS>`, allowing for a more structured and repeatable training setup through configuration files.

```
python run.py train cli -h
python run.py train config --data_path <PATH_TO_TRAINING_CONFIGS>
```
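
As a rough sketch of what such a configuration directory could contain (assuming the config files mirror the `*_config.json` files written to the log directory, described under Data Outputs below):

```text
<PATH_TO_TRAINING_CONFIGS>/
├─ env_config.json
├─ alg_config.json
├─ train_config.json
```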

### Evaluate
The `evaluate` command re-runs the evaluation loops on a trained reinforcement learning model within a specified gym environment. Running `python run.py evaluate --data_path <PATH_TO_TRAINING_DATA>` loads the trained model and the corresponding training data to evaluate how well the model performs on the given task.

```
python run.py evaluate --data_path <PATH_TO_TRAINING_DATA>
```
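
For example, assuming the default log layout described under Data Outputs below, an evaluation run could point at a previous training log directory such as:

```sh
python run.py evaluate --data_path ~/cares_rl_logs/TD3/TD3-HalfCheetah-v4-23_10_11_08:47:22
```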

## Gym Environments
This package contains wrappers for the following gym environments:
@@ -26,7 +46,7 @@ This package contains wrappers for the following gym environments:
The standard Deep Mind Control suite: https://github.com/google-deepmind/dm_control

```
python3 train.py run --gym dmcs --domain ball_in_cup --task catch TD3
python3 run.py train cli --gym dmcs --domain ball_in_cup --task catch TD3
```

<p align="center">
@@ -37,9 +57,9 @@ python3 train.py run --gym dmcs --domain ball_in_cup --task catch TD3
The standard OpenAI Gymnasium: https://github.com/Farama-Foundation/Gymnasium

```
python train.py run --gym openai --task CartPole-v1 DQN
python run.py train cli --gym openai --task CartPole-v1 DQN
python train.py run --gym openai --task HalfCheetah-v4 TD3
python run.py train cli --gym openai --task HalfCheetah-v4 TD3
```

<p align="center">
@@ -50,7 +70,7 @@ python train.py run --gym openai --task HalfCheetah-v4 TD3
Environment running Gameboy games utilising the pyboy wrapper: https://github.com/UoA-CARES/pyboy_environment

```
python3 train.py run --gym pyboy --task mario NaSATD3
python3 run.py train cli --gym pyboy --task mario SACAE
```

<p align="center">
@@ -59,34 +79,55 @@ python3 train.py run --gym pyboy --task mario NaSATD3
</p>

# Data Outputs
All data from a training run is saved into '~/cares_rl_logs'. A folder will be created for each training run named as 'ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS', e.g. 'TD3-HalfCheetah-v4-23_10_11_08:47:22'. This folder will contain the following directories and information saved during the training session:

```
├─ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS/
├─ SEED
| ├─ env_config.py
| ├─ alg_config.py
| ├─ train_config.py
| ├─ data
| | ├─ train.csv
| | ├─ eval.csv
| ├─ figures
| | ├─ eval.png
| | ├─ train.png
| ├─ models
| | ├─ model.pht
| | ├─ CHECKPOINT_N.pht
| | ├─ ...
| ├─ videos
| | ├─ STEP.mp4
All data from a training run is saved into the directory specified by the `CARES_LOG_BASE_DIR` environment variable. If unset, this defaults to `~/cares_rl_logs`.

You may specify a custom log directory format using the `CARES_LOG_PATH_TEMPLATE` environment variable. This path supports variable interpolation, such as the algorithm used, the seed, and the date. It defaults to `"{algorithm}/{algorithm}-{domain_task}-{date}"`.
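
For example (the values below are illustrative), both variables can be overridden before launching a run:

```sh
export CARES_LOG_BASE_DIR="$HOME/experiments/rl_logs"
export CARES_LOG_PATH_TEMPLATE="{algorithm}/{algorithm}-{domain_task}-{date}"
python run.py train cli --gym openai --task CartPole-v1 DQN
```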

This folder will contain the following directories and information saved during the training session:

```text
├─ <log_path>
| ├─ env_config.json
| ├─ alg_config.json
| ├─ train_config.json
| ├─ *_config.json
| ├─ ...
| ├─ SEED_N
| | ├─ data
| | | ├─ train.csv
| | | ├─ eval.csv
| | ├─ figures
| | | ├─ eval.png
| | | ├─ train.png
| | ├─ models
| | | ├─ model.pht
| | | ├─ CHECKPOINT_N.pht
| | | ├─ ...
| | ├─ videos
| | | ├─ STEP.mp4
| | | ├─ ...
| ├─ SEED_N
| | ├─ ...
├─ SEED...
├─ ...
| ├─ ...
```

# Plotting
The plotting utility in https://github.com/UoA-CARES/cares_reinforcement_learning/ will plot the data contained in the training data. An example of how to plot the data from one or multiple training sessions together is shown below. Running 'python3 plotter.py -h' will provide details on the plotting parameters.
The plotting utility in https://github.com/UoA-CARES/cares_reinforcement_learning/ will plot the training data based on the format created by the Record class. An example of how to plot the data from one or multiple training sessions together is shown below.

Running `python3 plotter.py -h` will provide details on the plotting parameters and control arguments. You can customise the font size and text for the title and axis labels; defaults will be taken from the data labels in the CSV files.

```sh
python3 plotter.py -h
```
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS

Plot the results of a single training instance

```sh
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS
```

Plot and compare the results of two or more training instances

```sh
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM_A/ALGORITHM_A-TASK-YY_MM_DD:HH:MM:SS ~/cares_rl_logs/ALGORITHM_B/ALGORITHM_B-TASK-YY_MM_DD:HH:MM:SS
```
1 change: 1 addition & 0 deletions requirements.txt
@@ -8,3 +8,4 @@ pydantic==1.10.13
torch==2.3.1
pyboy==2.2.1
mediapy==1.1.9
natsort==8.4.0
11 changes: 6 additions & 5 deletions scripts/environments/environment_factory.py
@@ -2,9 +2,9 @@

from environments.dmcs.dmcs_environment import DMCSEnvironment
from environments.gym_environment import GymEnvironment
from environments.pyboy.pyboy_environment import PyboyEnvironment
from environments.image_wrapper import ImageWrapper
from environments.openai.openai_environment import OpenAIEnvironment
from environments.pyboy.pyboy_environment import PyboyEnvironment
from util.configurations import GymEnvironmentConfig


@@ -14,14 +14,15 @@ def __init__(self) -> None:

def create_environment(
self, config: GymEnvironmentConfig, image_observation
) -> GymEnvironment:
) -> GymEnvironment | ImageWrapper:
logging.info(f"Training Environment: {config.gym}")

if config.gym == "dmcs":
env = DMCSEnvironment(config)
env: GymEnvironment = DMCSEnvironment(config)
elif config.gym == "openai":
env = OpenAIEnvironment(config)
env: GymEnvironment = OpenAIEnvironment(config)
elif config.gym == "pyboy":
env = PyboyEnvironment(config)
env: GymEnvironment = PyboyEnvironment(config)
else:
raise ValueError(f"Unknown environment: {config.gym}")
return ImageWrapper(config, env) if bool(image_observation) else env
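
A minimal usage sketch of this factory (the factory class name, its import path, and the `GymEnvironmentConfig` fields below are assumptions for illustration; they are not shown in this diff):

```python
# Sketch only: assumes the factory class is named EnvironmentFactory and that
# GymEnvironmentConfig accepts gym/domain/task fields matching the CLI flags above.
from environments.environment_factory import EnvironmentFactory
from util.configurations import GymEnvironmentConfig

config = GymEnvironmentConfig(gym="dmcs", domain="ball_in_cup", task="catch")
factory = EnvironmentFactory()

# Returns a plain GymEnvironment, or an ImageWrapper around it when
# image observations are requested.
env = factory.create_environment(config, image_observation=False)
```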