
Reducing Exploitability with Population Based Training

Code for the paper Reducing Exploitability with Population Based Training. We reduce the exploitability of RL policies to adversarial policies by training against a diverse population of opponents.

Setup

Should work with Python 3.7 and 3.8.

Installation example

Install using Docker, or follow the steps below.

conda create -n defense python=3.8
conda activate defense

Install necessary packages:

pip install -r requirements.txt
pip install -r requirements-dev.txt

For generating videos:

conda install ffmpeg

Alternatively, ffmpeg can be installed with your system's package manager.
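For example, on a Debian/Ubuntu system (an assumption about your platform):

sudo apt-get install ffmpeg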

Running Training

To change the output path, set TrialSettings.out_path via gin-config. This can be overridden by the environment variable POLICY_DEFENSE_OUT.
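For example (the output directory below is only a placeholder; -f and -p are used as in the training examples further down):

# set the output path via gin
python -m aprl_defense.train \
  -f "gin/icml/selfplay/laser_tag.gin" \
  -p "TrialSettings.out_path = '/tmp/defense_runs'"

# or override it with the environment variable
POLICY_DEFENSE_OUT=/tmp/defense_runs python -m aprl_defense.train \
  -f "gin/icml/selfplay/laser_tag.gin"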

Configuration

The most frequently used settings can be changed via gin.
The settings intended to be configured this way are:

  • TrialSettings (aprl_defense.trial.settings.TrialSettings)
  • RLSettings (aprl_defense.trial.settings.RLSettings)
  • Additionally, depending on which of these modes is used:
    • selfplay (aprl_defense.training_managers.simple_training_manager.SelfplayTrainingManager)
    • single-agent - no additional arguments
    • attack (aprl_defense.training_managers.simple_training_manager.AttackManager)
    • pbt (aprl_defense.training_managers.pbt_manager.PBTManager)

For further documentation on the configurable parameters, see the documentation of the respective classes.

Experiments for the paper were run with the settings in src/gin/icml.
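Such a gin file simply contains bindings for the settings listed above. A minimal sketch, using only parameters that appear in this README and a placeholder output path:

TrialSettings.out_path = '/tmp/defense_runs'
TrialSettings.num_workers = 10
TrialSettings.wandb_group = 'experiment'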

To change hyperparameters, we recommend creating RLlib configs that can be passed in via the override / override_f gin settings.
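As a purely hypothetical sketch (the exact binding name and the expected file format of override / override_f are assumptions; check the settings classes for the real interface), such a config could be passed in like this:

python -m aprl_defense.train \
  -f "gin/icml/selfplay/laser_tag.gin" \
  -p "override_f = 'my_rllib_overrides.json'"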

Many experiments were run using dedicated Python scripts located in src/experiments.

The following examples show how to launch training in the different modes (run them from the src folder).

Selfplay Training

python -m aprl_defense.train \
  -f "gin/icml/selfplay/laser_tag.gin" \
  -p "TrialSettings.num_workers = 10" \
  -p "TrialSettings.wandb_group = 'experiment'"

Adversary Training

python -m aprl_defense.train \
  -f "gin/icml/attack/sp_laser_tag.gin" \
  -p "TrialSettings.num_workers = 10" \
  -p "TrialSettings.wandb_group = 'experiment'" \
  -p "attack.victim_artifact = '<wandb artifact id>'" \
  -p "attack.victim_policy_name = '<name of victim policy>' "

Population-Based Training

Attention: PBT only runs with the modified version of ray.

python -m aprl_defense.train \
  -f "gin/icml/pbt/laser_tag.gin" \
  -p "TrialSettings.wandb_group = 'experiment'" \
  -p "pbt.main_id = 0" \
  -p "pbt.num_ops = 50" \
  -p "TrialSettings.num_workers = 50"

Some Explanations

In all but the most basic setups, creating an RLlib config for multi-agent training requires building the config programmatically in Python; such configs cannot be created simply by passing in a config file. For convenience, the most commonly changed hyperparameters and set-up options can be configured with gin; additional modifications can be performed by overriding the RLlib config.
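To illustrate, here is a generic sketch of such a programmatic multi-agent config (this is not code from this repository, and the exact RLlib API differs between Ray versions):

# Illustrative sketch only -- not code from this repository.
# A multi-agent RLlib config needs Python objects (spaces, a mapping function),
# which is why it cannot be expressed in a plain config file.
from gym.spaces import Box, Discrete

obs_space = Box(low=-1.0, high=1.0, shape=(4,))  # placeholder observation space
act_space = Discrete(2)                          # placeholder action space

config = {
    "num_workers": 10,
    "multiagent": {
        # one entry per policy: (policy_class_or_None, obs_space, act_space, overrides)
        "policies": {
            "main": (None, obs_space, act_space, {}),
            "opponent": (None, obs_space, act_space, {}),
        },
        # map environment agent ids to the named policies
        "policy_mapping_fn": lambda agent_id: "main" if agent_id == "agent_0" else "opponent",
    },
}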