Commit e735431

Merge branch 'main' into tidy/spring-clean
beardyFace committed Dec 3, 2024
2 parents 624119c + 6c2fb64 commit e735431
Showing 9 changed files with 258 additions and 134 deletions.
5 changes: 5 additions & 0 deletions .mypy.ini
@@ -0,0 +1,5 @@
# Global options:

[mypy]
exclude = build
disable_error_code = import-untyped
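
With this configuration, running mypy from the repository root skips the `build` directory and suppresses `import-untyped` errors:

```sh
mypy .
```
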
101 changes: 71 additions & 30 deletions README.md
@@ -15,9 +15,29 @@ Run `pip3 install -r requirements.txt` in the **root directory** of the package
Install the environments dependent on pyboy here https://github.com/UoA-CARES/pyboy_environment

# Usage
This package is a basic example of running the CARES RL algorithms on OpenAI/DMCS.
This package is a basic example of running the CARES RL algorithms on OpenAI/DMCS/pyboy.

`train.py` takes in hyperparameters that allow you to customise the training gym enviromment – see options below - or RL algorithm. Use `python3 train.py -h` for help on what parameters are available for customisation.
## Running Training and Evaluation
The package is called using `run.py`, which takes the specific commands listed below for training and evaluation purposes.

Use `python3 run.py -h` for help on what parameters are available for customisation.

### Train
The `train` command in the `run.py` script initiates the training process for reinforcement learning models within specified gym environments. It can be customised with various hyperparameters to tailor the training environment and the RL algorithm. Run `python run.py train cli -h` to view all available options and start a run directly from the terminal. This flexibility lets users experiment with different settings and optimise their models effectively.

Specific and larger configuration changes can be loaded using `python run.py train config --data_path <PATH_TO_TRAINING_CONFIGS>`, allowing for a more structured and repeatable training setup through configuration files.

```
python run.py train cli -h
python run.py train config --data_path <PATH_TO_TRAINING_CONFIGS>
```
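
As a rough sketch of what such a configuration directory could contain (assuming the config files mirror the `*_config.json` files written to the log directory, described under Data Outputs below):

```text
<PATH_TO_TRAINING_CONFIGS>/
├─ env_config.json
├─ alg_config.json
├─ train_config.json
```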

### Evaluate
The `evaluate` command re-runs the evaluation loops on a trained reinforcement learning model within a specified gym environment. Running `python run.py evaluate --data_path <PATH_TO_TRAINING_DATA>` loads the trained model and the corresponding training data to evaluate how well the model performs on the given task.

```
python run.py evaluate --data_path <PATH_TO_TRAINING_DATA>
```
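
For example, assuming the default log layout described under Data Outputs below, an evaluation run could point at a previous training log directory such as:

```sh
python run.py evaluate --data_path ~/cares_rl_logs/TD3/TD3-HalfCheetah-v4-23_10_11_08:47:22
```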

## Gym Environments
This package contains wrappers for the following gym environments:
@@ -26,7 +46,7 @@ This package contains wrappers for the following gym environments:
The standard Deep Mind Control suite: https://github.com/google-deepmind/dm_control

```
python3 train.py run --gym dmcs --domain ball_in_cup --task catch TD3
python3 run.py train cli --gym dmcs --domain ball_in_cup --task catch TD3
```

<p align="center">
@@ -37,9 +57,9 @@ python3 train.py run --gym dmcs --domain ball_in_cup --task catch TD3
The standard OpenAI Gymnasium: https://github.com/Farama-Foundation/Gymnasium

```
python train.py run --gym openai --task CartPole-v1 DQN
python run.py train cli --gym openai --task CartPole-v1 DQN
python train.py run --gym openai --task HalfCheetah-v4 TD3
python run.py train cli --gym openai --task HalfCheetah-v4 TD3
```

<p align="center">
@@ -50,7 +70,7 @@ python train.py run --gym openai --task HalfCheetah-v4 TD3
Environment running Gameboy games utilising the pyboy wrapper: https://github.com/UoA-CARES/pyboy_environment

```
python3 train.py run --gym pyboy --task mario NaSATD3
python3 run.py train cli --gym pyboy --task mario SACAE
```

<p align="center">
@@ -59,34 +79,55 @@ python3 train.py run --gym pyboy --task mario NaSATD3
</p>

# Data Outputs
All data from a training run is saved into '~/cares_rl_logs'. A folder will be created for each training run named as 'ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS', e.g. 'TD3-HalfCheetah-v4-23_10_11_08:47:22'. This folder will contain the following directories and information saved during the training session:

```
├─ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS/
├─ SEED
| ├─ env_config.py
| ├─ alg_config.py
| ├─ train_config.py
| ├─ data
| | ├─ train.csv
| | ├─ eval.csv
| ├─ figures
| | ├─ eval.png
| | ├─ train.png
| ├─ models
| | ├─ model.pht
| | ├─ CHECKPOINT_N.pht
| | ├─ ...
| ├─ videos
| | ├─ STEP.mp4
All data from a training run is saved into the directory specified by the `CARES_LOG_BASE_DIR` environment variable. If unset, this defaults to `~/cares_rl_logs`.

You may specify a custom log directory format using the `CARES_LOG_PATH_TEMPLATE` environment variable. This path supports variable interpolation, such as the algorithm used, the seed, and the date. It defaults to `"{algorithm}/{algorithm}-{domain_task}-{date}"`.
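
For example (the values below are illustrative), both variables can be overridden before launching a run:

```sh
export CARES_LOG_BASE_DIR="$HOME/experiments/rl_logs"
export CARES_LOG_PATH_TEMPLATE="{algorithm}/{algorithm}-{domain_task}-{date}"
python run.py train cli --gym openai --task CartPole-v1 DQN
```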

This folder will contain the following directories and information saved during the training session:

```text
├─ <log_path>
| ├─ env_config.json
| ├─ alg_config.json
| ├─ train_config.json
| ├─ *_config.json
| ├─ ...
| ├─ SEED_N
| | ├─ data
| | | ├─ train.csv
| | | ├─ eval.csv
| | ├─ figures
| | | ├─ eval.png
| | | ├─ train.png
| | ├─ models
| | | ├─ model.pht
| | | ├─ CHECKPOINT_N.pht
| | | ├─ ...
| | ├─ videos
| | | ├─ STEP.mp4
| | | ├─ ...
| ├─ SEED_N
| | ├─ ...
├─ SEED...
├─ ...
| ├─ ...
```

# Plotting
The plotting utility in https://github.com/UoA-CARES/cares_reinforcement_learning/ will plot the data contained in the training data. An example of how to plot the data from one or multiple training sessions together is shown below. Running 'python3 plotter.py -h' will provide details on the plotting parameters.
The plotting utility in https://github.com/UoA-CARES/cares_reinforcement_learning/ will plot the training data based on the format created by the Record class. An example of how to plot the data from one or multiple training sessions together is shown below.

Running `python3 plotter.py -h` will provide details on the plotting parameters and control arguments. You can customise the font size and text for the title and axis labels; defaults will be taken from the data labels in the CSV files.

```sh
python3 plotter.py -h
```
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS

Plot the results of a single training instance

```sh
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM/ALGORITHM-TASK-YY_MM_DD:HH:MM:SS
```

Plot and compare the results of two or more training instances

```sh
python3 plotter.py -s ~/cares_rl_logs -d ~/cares_rl_logs/ALGORITHM_A/ALGORITHM_A-TASK-YY_MM_DD:HH:MM:SS ~/cares_rl_logs/ALGORITHM_B/ALGORITHM_B-TASK-YY_MM_DD:HH:MM:SS
```
1 change: 1 addition & 0 deletions requirements.txt
@@ -8,3 +8,4 @@ pydantic==1.10.13
torch==2.3.1
pyboy==2.2.1
mediapy==1.1.9
natsort==8.4.0
11 changes: 6 additions & 5 deletions scripts/environments/environment_factory.py
@@ -2,9 +2,9 @@

from environments.dmcs.dmcs_environment import DMCSEnvironment
from environments.gym_environment import GymEnvironment
from environments.pyboy.pyboy_environment import PyboyEnvironment
from environments.image_wrapper import ImageWrapper
from environments.openai.openai_environment import OpenAIEnvironment
from environments.pyboy.pyboy_environment import PyboyEnvironment
from util.configurations import GymEnvironmentConfig


@@ -14,14 +14,15 @@ def __init__(self) -> None:

def create_environment(
self, config: GymEnvironmentConfig, image_observation
) -> GymEnvironment:
) -> GymEnvironment | ImageWrapper:
logging.info(f"Training Environment: {config.gym}")

if config.gym == "dmcs":
env = DMCSEnvironment(config)
env: GymEnvironment = DMCSEnvironment(config)
elif config.gym == "openai":
env = OpenAIEnvironment(config)
env: GymEnvironment = OpenAIEnvironment(config)
elif config.gym == "pyboy":
env = PyboyEnvironment(config)
env: GymEnvironment = PyboyEnvironment(config)
else:
raise ValueError(f"Unknown environment: {config.gym}")
return ImageWrapper(config, env) if bool(image_observation) else env
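
A minimal usage sketch of this factory (the factory class name, its import path, and the `GymEnvironmentConfig` fields below are assumptions for illustration; they are not shown in this diff):

```python
# Sketch only: assumes the factory class is named EnvironmentFactory and that
# GymEnvironmentConfig accepts gym/domain/task fields matching the CLI flags above.
from environments.environment_factory import EnvironmentFactory
from util.configurations import GymEnvironmentConfig

config = GymEnvironmentConfig(gym="dmcs", domain="ball_in_cup", task="catch")
factory = EnvironmentFactory()

# Returns a plain GymEnvironment, or an ImageWrapper around it when
# image observations are requested.
env = factory.create_environment(config, image_observation=False)
```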