Side effects penalties

Side effects are unnecessary disruptions to the agent's environment while completing a task. Instead of trying to explicitly penalize all possible side effects, we give the agent a general penalty for impacting the environment, defined as a deviation from some baseline state. For example, a reversibility penalty measures unreachability (the deviation) of the starting state (the baseline).

This code implements a tabular Q-learning agent with different impact penalties. Each penalty consists of a deviation measure (none, unreachability, relative reachability, or attainable utility), a baseline (starting state, inaction, or stepwise inaction), and some other design choices.

This is the code for the paper Penalizing side effects using stepwise relative reachability by Krakovna et al. (2019).
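To make the reversibility example concrete, here is a minimal sketch of an unreachability penalty with a starting-state baseline. It is not the code in this repository: the toy transition graph, the function names, and the use of breadth-first search on a deterministic graph are all assumptions for illustration. Reachability is taken to be the discount raised to the number of steps on the shortest path, and 0 if no path exists.

```python
from collections import deque


def shortest_steps(transitions, start, goal):
    """Breadth-first search over a deterministic transition graph.

    Returns the minimum number of steps from start to goal,
    or None if goal cannot be reached.
    """
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + 1))
    return None


def reachability(transitions, start, goal, value_discount=0.99):
    """Discounted reachability: value_discount ** steps, or 0 if unreachable."""
    steps = shortest_steps(transitions, start, goal)
    return 0.0 if steps is None else value_discount ** steps


def unreachability(transitions, current, baseline, value_discount=0.99):
    """Deviation measure: how hard it is to get back to the baseline state."""
    return 1.0 - reachability(transitions, current, baseline, value_discount)


def penalized_reward(reward, deviation, beta):
    """Task reward minus the side effects penalty weighted by beta."""
    return reward - beta * deviation


# Hypothetical toy environment: "broken" is an irreversible state.
TOY = {"start": ["moved", "broken"], "moved": ["start"], "broken": []}

reversible = unreachability(TOY, "moved", "start", value_discount=0.5)
irreversible = unreachability(TOY, "broken", "start", value_discount=0.5)
print(reversible, irreversible)
```

In this toy graph the irreversible action gets the maximum deviation of 1, while the reversible one is penalized only through the discount.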

Instructions

Clone the repository:

git clone https://github.com/deepmind/deepmind-research.git

Running an agent with a side effects penalty

Run the agent with a given penalty on an AI Safety Gridworlds environment:

python -m side_effects_penalties.run_experiment -baseline <X> -dev_measure <Y> -env_name <Z> -suffix <S>

The following parameters can be specified for the side effects penalty:

  • Baseline state (-baseline): starting state (start), inaction (inaction), stepwise inaction with rollouts (stepwise), stepwise inaction without rollouts (step_noroll)
  • Deviation measure (-dev_measure): none (none), unreachability (reach), relative reachability (rel_reach), attainable utility (att_util)
  • Discount factor for the deviation measure value function (-value_discount)
  • Summary function to apply to the relative reachability or attainable utility deviation measure (-dev_fun): max(0, x) (truncation) or |x| (absolute)
  • Weight for the side effects penalty relative to the reward (-beta)
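The effect of the summary function can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the helper names are hypothetical, and the per-state reachability values are passed in as plain lists rather than learned value functions. Relative reachability is taken here to be the average, over states, of the summarized difference between reachability from the baseline and reachability from the current state.

```python
def truncation(x):
    """max(0, x): only count reductions relative to the baseline."""
    return max(0.0, x)


def absolute(x):
    """|x|: count both reductions and increases relative to the baseline."""
    return abs(x)


# Keys mirror the -dev_fun options listed above.
DEV_FUNS = {"truncation": truncation, "absolute": absolute}


def relative_reachability(baseline_reach, current_reach, dev_fun="truncation"):
    """Average summarized difference in reachability across states.

    baseline_reach[i] and current_reach[i] are the (assumed precomputed)
    reachabilities of state i from the baseline and current states.
    """
    summarize = DEV_FUNS[dev_fun]
    diffs = [summarize(b - c) for b, c in zip(baseline_reach, current_reach)]
    return sum(diffs) / len(diffs)


# Made-up reachability values for three states.
baseline_reach = [1.0, 0.5, 0.0]
current_reach = [0.5, 0.5, 0.5]

truncated = relative_reachability(baseline_reach, current_reach, "truncation")
absolute_dev = relative_reachability(baseline_reach, current_reach, "absolute")
print(truncated, absolute_dev)
```

With truncation, the third state (which became more reachable) contributes nothing to the penalty. With the absolute value it is penalized as well.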

Other arguments:

  • AI Safety Gridworlds environment name (-env_name)
  • Number of episodes (-num_episodes)
  • Filename suffix for saving result files (-suffix)
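As a usage illustration, the sketch below builds command lines for a small sweep over beta using only the flags listed above. The helper name, the environment name box, the suffix sweep, and the beta values are placeholders and assumptions, not values taken from this repository. The commands are printed rather than executed.

```python
import sys


def build_command(baseline, dev_measure, env_name, suffix,
                  beta=None, num_episodes=None):
    """Assemble an argument list for run_experiment (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "side_effects_penalties.run_experiment",
        "-baseline", baseline,
        "-dev_measure", dev_measure,
        "-env_name", env_name,
        "-suffix", suffix,
    ]
    # Optional flags are appended only when a value is given.
    if beta is not None:
        cmd += ["-beta", str(beta)]
    if num_episodes is not None:
        cmd += ["-num_episodes", str(num_episodes)]
    return cmd


# "box" and "sweep" are placeholder values; substitute your own.
commands = [
    build_command("stepwise", "rel_reach", "box", "sweep", beta=b)
    for b in (0.1, 1.0, 10.0)
]

for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run once the dependencies are installed and the working directory is the one from which the side_effects_penalties module is importable.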

Plotting the results

Make a summary data frame from the result files generated by run_experiment:

python -m side_effects_penalties.results_summary -compare_penalties -input_suffix <S>

Arguments:

  • -bar_plot: make a data frame for a bar plot (True) or learning curve plot (False)
  • -compare_penalties: compare different penalties using the best beta value for each penalty (True), or compare different beta values for a given penalty (False)
  • If compare_penalties=False, specify the penalty parameters (-dev_measure, -dev_fun and -value_discount)
  • Environment name (-env_name)
  • Filename suffix for loading result files (-input_suffix)
  • Filename suffix for the summary data frame (-output_suffix)
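The idea behind comparing penalties at their best beta value can be sketched in a few lines. This is not how results_summary is implemented: the record fields, the toy numbers, and the assumption that "best" means the highest performance score are all hypothetical.

```python
def best_beta_per_penalty(results):
    """Pick, for each deviation measure, the beta with the highest performance.

    results is a list of dicts with the (hypothetical) keys
    "dev_measure", "beta" and "performance".
    """
    best = {}
    for row in results:
        key = row["dev_measure"]
        if key not in best or row["performance"] > best[key]["performance"]:
            best[key] = row
    return {key: row["beta"] for key, row in best.items()}


# Made-up numbers purely for illustration.
toy_results = [
    {"dev_measure": "reach", "beta": 0.1, "performance": 30.0},
    {"dev_measure": "reach", "beta": 1.0, "performance": 42.0},
    {"dev_measure": "rel_reach", "beta": 0.1, "performance": 45.0},
    {"dev_measure": "rel_reach", "beta": 1.0, "performance": 40.0},
]

best = best_beta_per_penalty(toy_results)
print(best)
```

Comparing different beta values for a single penalty would instead keep all rows for that penalty and vary only beta.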

Import the summary data frame into plot_results.ipynb and make a bar plot or learning curve plot.

Dependencies

  • Python 2.7 or 3 (tested with Python 2.7.15 and 3.6.7)
  • AI Safety Gridworlds suite of safety environments
  • Abseil Python common libraries
  • Numpy
  • Pandas
  • Six
  • Matplotlib
  • Seaborn

Citing this work

If you use this code in your work, please cite the accompanying paper:

@article{srr2019,
  title = {Penalizing Side Effects using Stepwise Relative Reachability},
  author = {Victoria Krakovna and Laurent Orseau and Ramana Kumar and Miljan Martic and Shane Legg},
  journal = {CoRR},
  volume = {abs/1806.01186},
  year = {2019}
}