Adapting the Function Approximation Architecture in Online Reinforcement Learning

The algorithms, RL environment, and experiments were implemented by Fatima Davelouis (https://github.com/daveloui) and John Martin. The code makes the Frog's Eye environment available for others to run experiments on, and it reproduces the results from the following paper:

Adapting the Function Approximation Architecture in Online Reinforcement Learning,
John D. Martin and Joseph Modayil (2021) [paper link].

The repository contains an implementation of a shallow network architecture wired with prediction-adapted neighborhoods. The network's output weights are learned with TD(λ) (Sutton, 1988). The repository also includes implementations of the Linear and Random baselines, an implementation of the Frog's Eye environment, and code for generating plots similar to Figure 4 of the paper above.
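For readers unfamiliar with TD(λ), one update of the output weights of a linear predictor can be sketched as follows. This is a minimal illustration with accumulating eligibility traces, written with plain Python lists so that it runs without dependencies. The function name td_lambda_update and the values of alpha, gamma, and lam are assumptions made for this example; they are not taken from tdlambda.py.

```python
def td_lambda_update(w, z, x, x_next, reward, alpha=0.1, gamma=0.9, lam=0.8):
    """One TD(lambda) step for a linear predictor v(x) = w . x.

    w, z, x and x_next are equal-length lists of floats.
    Returns the updated (weights, eligibility trace).
    """
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    delta = reward + gamma * v_next - v  # TD error
    # Accumulating trace: decay the old trace, then add the current features.
    new_z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
    new_w = [wi + alpha * delta * zi for wi, zi in zip(w, new_z)]
    return new_w, new_z


# Hypothetical two-feature transition, for illustration only.
w, z = [0.0, 0.0], [0.0, 0.0]
w, z = td_lambda_update(w, z, x=[1.0, 0.0], x_next=[0.0, 1.0], reward=1.0)
```

Here x would be the feature vector produced by the network's hidden layer; only the output weights w are changed by this update.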

Getting Started

Dependencies

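Before running the code, install the following dependencies. The entries argparse, traceback, math, functools, itertools, and multiprocessing are part of the Python standard library and need no separate installation.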

python 3.6
numpy
scipy
jax
gin
argparse
matplotlib
traceback
math
functools
itertools
multiprocessing
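A quick way to see which third-party modules are still missing is to ask Python whether it can locate them. The sketch below uses only the standard library; the helper name missing_modules is an assumption made for this example. Note that the gin module is, to our knowledge, distributed on PyPI under the name gin-config.

```python
import importlib.util

# Third-party modules from the list above; the rest ship with Python.
THIRD_PARTY = ["numpy", "scipy", "jax", "gin", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_modules(THIRD_PARTY) or "none")
```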

Running an experiment

This code can be run locally or on Compute Canada. The hyper-parameters used in Martin and Modayil (2021) can be found in config_relu_FrogsEye.gin.

Running an experiment locally

Run python run_locally.py on your own machine. Whether configurations run in parallel or one after another is controlled by the run_experiments.parallel_experiments parameter in config_relu_FrogsEye.gin.
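The choice between serial and parallel execution can be pictured with the sketch below. It is not the code in run_locally.py: run_one is a hypothetical stand-in for a single experiment, and the parallel flag only plays the role that run_experiments.parallel_experiments plays in the gin file.

```python
import multiprocessing


def run_one(config):
    """Hypothetical stand-in for one experiment; returns a dummy result."""
    return config["step_size"] * config["seed"]


def run_all(configs, parallel=False, processes=2):
    """Run every configuration, serially or in a pool of worker processes."""
    if not parallel:
        return [run_one(config) for config in configs]
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(run_one, configs)
```

On platforms that start workers by spawning a fresh interpreter, a call with parallel=True should sit under an if __name__ == "__main__": guard.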

Running an experiment on Compute Canada

Generate experiment configurations

Run python generate_txt.py. This generates a .txt file in which each line is one experiment configuration.
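The idea of one configuration per line can be sketched as a Cartesian product over a hyper-parameter grid. The grid, the flag names, and the helper names below are assumptions made for illustration; the real hyper-parameters are defined in the gin configuration and the real file format is whatever generate_txt.py writes.

```python
import itertools


def config_lines(grid):
    """Yield one command-line string per combination of values in the grid."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield " ".join(f"--{name} {value}" for name, value in zip(names, values))


def write_configs(path, grid):
    """Write every configuration to path, one per line."""
    with open(path, "w") as handle:
        for line in config_lines(grid):
            handle.write(line + "\n")


# Hypothetical grid: 2 step sizes x 3 seeds = 6 configurations.
grid = {"step_size": [0.01, 0.1], "seed": [0, 1, 2]}
lines = list(config_lines(grid))
```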

Launch experiments

To run all configurations, we scheduled job arrays on the Beluga cluster of Compute Canada; see submit_experiments_FrogsEye_gpu_Beluga.sh.
Replace the account name in the #SBATCH --account field with your own.

If you wish to run a single configuration (for example, the one on line 5 of the .txt file), change the last few lines of submit_experiments_FrogsEye_gpu_Beluga.sh to:

EXE=$(cat <txt file name> | head -n 5 | tail -n 1)
command="python main_parser.py $EXE"
eval $command
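The same line selection can be written in Python, which may be easier to adapt than the shell pipeline. This is a sketch under assumptions: nth_config is a hypothetical helper, and the example text stands in for the file produced by generate_txt.py.

```python
import shlex


def nth_config(text, n):
    """Return the arguments found on line n (1-indexed) of the text."""
    lines = text.splitlines()
    if not 1 <= n <= len(lines):
        raise IndexError(f"line {n} is outside 1..{len(lines)}")
    return shlex.split(lines[n - 1])


# Hypothetical file contents; real lines come from generate_txt.py.
example = "--seed 0\n--seed 1\n--seed 2\n--seed 3\n--seed 4\n"
args = nth_config(example, 5)
command = ["python", "main_parser.py"] + args
```

Inside a Slurm job array, n could be read from the SLURM_ARRAY_TASK_ID environment variable, which Slurm sets for each array task, and command could then be passed to subprocess.run.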

How to plot results:

Step 1: Specify architectures and hyper-parameters

Specify hyper-parameters in plotting_config.gin.

Step 2: Compute average binned squared return errors

python scripts/compute_stats.py
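As a rough picture of what an average binned squared return error is, the sketch below averages the squared difference between predictions and returns over consecutive bins of time steps. The function name, the handling of a trailing partial bin, and the numbers are assumptions made for illustration; scripts/compute_stats.py may differ in all of these.

```python
def binned_mean_squared_error(predictions, returns, bin_size):
    """Average (prediction - return)^2 over consecutive bins of bin_size steps.

    A trailing partial bin is dropped.
    """
    errors = [(p - g) ** 2 for p, g in zip(predictions, returns)]
    n_bins = len(errors) // bin_size
    return [
        sum(errors[i * bin_size:(i + 1) * bin_size]) / bin_size
        for i in range(n_bins)
    ]


# Hypothetical values, for illustration only.
binned = binned_mean_squared_error(
    predictions=[0.0, 1.0, 2.0, 2.0, 5.0],
    returns=[1.0, 1.0, 2.0, 4.0, 5.0],
    bin_size=2,
)
```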

Step 3: Compute data table that stores the final averaged return error across all hyper-parameters

python scripts/generate_data_table.py

Step 4: Generate and save sensitivity curves

python scripts/plot_sensitivity_curves_step_size.py
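A step-size sensitivity curve plots a summary of the final error against the step size. One common construction, assumed here and not confirmed by the repository, keeps for each step size the lowest final error over the remaining hyper-parameters. The table layout and the numbers below are hypothetical and are not results from the paper.

```python
def sensitivity_curve(table):
    """Map each step size to its lowest final error over the other settings.

    table maps (step_size, other_setting) -> final averaged error.
    Returns a list of (step_size, error) pairs sorted by step size.
    """
    best = {}
    for (step_size, _other), error in table.items():
        if step_size not in best or error < best[step_size]:
            best[step_size] = error
    return sorted(best.items())


# Hypothetical values, for illustration only.
table = {(0.01, "a"): 0.9, (0.01, "b"): 0.7, (0.1, "a"): 0.4, (0.1, "b"): 0.5}
curve = sensitivity_curve(table)
```

The resulting pairs can be passed to matplotlib as x and y values, typically with a logarithmic x-axis.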

Step 5: Generate and save learning curves

python scripts/plot_mean_squared_return_error.py
