This repository contains the code for our project on the LunarLanderContinuous-v2 environment from OpenAI Gym.
The purpose of this project is to achieve a stable, high-performing moon landing using A3C (Asynchronous Advantage Actor-Critic), one of the most well-known Deep Reinforcement Learning algorithms, together with several of its improvements, and to test them on the Lunar Lander environment from OpenAI Gym. The results of the experiments show that, while the extensions to the base algorithm greatly improved its performance, the simplicity of the environment prevents us from identifying a clear winner.
Read the paper-like report for the project here.
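At the core of A3C, each worker collects a rollout, computes discounted returns, and uses the gap between those returns and the critic's value estimates as the advantage signal for the actor. The snippet below is a minimal NumPy sketch of that computation; the function name and the example numbers are illustrative assumptions rather than the repository's actual code (the discount factor matches the `gamma` parameter shown further down).

```python
import numpy as np

def discounted_returns(rewards, gamma, bootstrap=0.0):
    """Discounted returns G_t = r_t + gamma * G_{t+1}, computed backwards."""
    g = bootstrap                       # value estimate for the state after the rollout
    returns = np.empty(len(rewards))
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns

# Toy rollout: the advantage tells the actor how much better than
# expected each action turned out to be.
rewards = [1.0, 0.0, -1.0, 2.0]
values = np.array([0.5, 0.4, 0.3, 0.2])                 # critic predictions V(s_t)
advantages = discounted_returns(rewards, gamma=0.99) - values
```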
This file contains the main code, which creates the agents and launches them.
Parameters for the Actor and Critic networks can be changed by editing the following dictionaries:
```python
policy_net_args = {"num_Hlayers": 2,
                   "activations_Hlayers": ["relu", "relu"],
                   "Hlayer_sizes": [100, 100],
                   "n_output_units": 2,
                   "output_layer_activation": tf.nn.tanh,
                   "state_space_size": 8,
                   "action_space_size": 2,
                   "Entropy": 0.01,
                   "action_space_upper_bound": action_space_upper_bound,
                   "action_space_lower_bound": action_space_lower_bound,
                   "optimizer": tf.train.RMSPropOptimizer(0.0001),
                   "total_number_episodes": 5000,
                   "number_of_episodes_before_update": 1,
                   "frequency_of_printing_statistics": 100,
                   "frequency_of_rendering_episode": 1000,
                   "number_child_agents": 8,
                   "episodes_back": 20,
                   "gamma": 0.99,
                   "regularization_constant": 0.01,
                   "max_steps_per_episode": 2000}
```
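As a rough illustration of how arguments like these can drive the Actor's graph construction, here is a hedged TensorFlow 1.x sketch of a Gaussian policy head for continuous actions. The layer layout mirrors the dictionary above, but the exact architecture in the repository may differ.

```python
import tensorflow as tf  # TensorFlow 1.x, as implied by tf.train.RMSPropOptimizer

states = tf.placeholder(tf.float32, [None, 8])            # "state_space_size"
h = states
for size in [100, 100]:                                   # "Hlayer_sizes"
    h = tf.layers.dense(h, size, activation=tf.nn.relu)   # "activations_Hlayers"
mu = tf.layers.dense(h, 2, activation=tf.nn.tanh)         # mean in [-1, 1], cf. tf.nn.tanh above
sigma = tf.layers.dense(h, 2, activation=tf.nn.softplus)  # positive standard deviation
dist = tf.distributions.Normal(loc=mu, scale=sigma)
action = tf.clip_by_value(dist.sample(), -1.0, 1.0)       # LunarLanderContinuous-v2 action bounds
entropy_bonus = 0.01 * tf.reduce_mean(dist.entropy())     # "Entropy" coefficient
```

The Critic's arguments follow the same pattern: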
```python
valuefunction_net_args = {"num_Hlayers": 2,
                          "activations_Hlayers": ["relu", "relu"],
                          "Hlayer_sizes": [100, 64],
                          "n_output_units": 1,
                          "output_layer_activation": "linear",
                          "state_space_size": 8,
                          "action_space_size": 2,
                          "optimizer": tf.train.RMSPropOptimizer(0.01),
                          "regularization_constant": 0.01}
```