A reinforcement learning agent (a double-jointed arm) trained to maintain its position at a target in a continuous environment.
For this project I am using the Reacher environment, which simulates a double-jointed arm that can move to target locations.
In this environment:
- The observation has 30 variables holding measurements of the arm, such as velocities and angular velocities.
- The action space is a 4-dimensional vector, action = [x1, x2, x3, x4], where xi ∈ [-1, +1] for i ∈ {1, 2, 3, 4}.
- The reward strategy: the agent receives +0.1 when it is in the goal (target) direction and nothing otherwise.
Thus the goal is to maintain the position of the arm toward the target for as many time steps as possible.
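As an illustration of the action constraint above, here is a minimal sketch of how a raw action can be clamped into the valid range before it is sent to the environment. The helper names (`clip_action`, `random_action`) and the Gaussian noise are assumptions made for this example only; they are not taken from Agent.py.

```python
import random

ACTION_SIZE = 4          # four action components, as described above
LOW, HIGH = -1.0, 1.0    # every component must lie in [-1, +1]

def clip_action(raw_action):
    """Clamp each component of a raw action into [LOW, HIGH]."""
    return [max(LOW, min(HIGH, x)) for x in raw_action]

def random_action(rng):
    """Sample a uniformly random valid action (a stand-in for a real policy)."""
    return [rng.uniform(LOW, HIGH) for _ in range(ACTION_SIZE)]

rng = random.Random(0)
# Hypothetical exploration noise: it may push components outside [-1, +1].
noisy = [x + rng.gauss(0.0, 0.5) for x in random_action(rng)]
action = clip_action(noisy)  # safe to pass to the environment
```

Clamping after adding noise keeps every component inside the range the environment accepts, whatever the noise scale is.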
This task is considered solved once the average reward reaches +30.0 over 100 episodes or more.
If you wish to reproduce this work, you need to set up the environment by following this section:
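The solved criterion can be checked with a sliding window over episode scores. The sketch below assumes the average is taken over the most recent 100 episodes; the function name `first_solved_episode` is hypothetical and the notebook may track this differently.

```python
from collections import deque

TARGET_SCORE = 30.0   # required average reward, per the criterion above
WINDOW = 100          # number of consecutive episodes that are averaged

def first_solved_episode(episode_scores):
    """Return the 1-based episode at which the task is first solved, or None."""
    window = deque(maxlen=WINDOW)  # keeps only the latest WINDOW scores
    for episode, score in enumerate(episode_scores, start=1):
        window.append(score)
        # Only judge once a full window of episodes is available.
        if len(window) == WINDOW and sum(window) / WINDOW >= TARGET_SCORE:
            return episode
    return None
```

For example, `first_solved_episode([31.0] * 100)` returns `100`, while a run of only 99 episodes returns `None` because the window is not yet full.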
git clone https://github.com/ZSoumia/Continous-control-Agent
Please follow instructions from this repo
Select the Unity environment based on your operating system:
Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.
(For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the "headless" version of the environment. You will not be able to watch the agent without enabling a virtual screen, but you will be able to train the agent. (To watch the agent, you should follow the instructions to enable a virtual screen, and then download the environment for the Linux operating system above.)
==> Place the downloaded file into your cloned project folder.
- Agent.py contains the general structure of the reinforcement learning agent.
- Actor.py contains the actor's network code.
- Critic.py contains the critic's network code.
- Continuous_control.ipynb is the notebook used to train and evaluate the agent.
- Continuous control Report.html is a report about the different aspects of this project.