We started our quest with the PPO Jupyter notebook.
Our mower has 8 sensors around it.
Each sensor gives 3 types of readings:
- the distance to the first clod
- the number of clods along the sensor
- the distance to an obstacle
So our input is just a vector of 24 values (8 sensors × 3 readings).
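To make that layout concrete, here is a minimal sketch of how such a 24-value observation vector could be assembled. The `SensorReading` class and its field names are hypothetical; only the 8 × 3 layout comes from the description above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SensorReading:
    # Hypothetical container for one sensor; the three fields mirror
    # the three readings described above.
    first_clod_distance: float   # distance to the first clod along the sensor
    clod_count: float            # number of clods along the sensor
    obstacle_distance: float     # distance to an obstacle (rock)

def build_observation(sensors: List[SensorReading]) -> List[float]:
    """Flatten the 8 sensors into the 24-value input vector fed to the agent."""
    assert len(sensors) == 8
    obs: List[float] = []
    for s in sensors:
        obs.extend([s.first_clod_distance, s.clod_count, s.obstacle_distance])
    return obs  # length 8 * 3 = 24
```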
We tried different rewards and punishments for our mower. The best we found for the model is this:
- each clod mowed: +0.1
- all clods mowed: +10
- a rock touched: -1
We will also try to add a small penalty for each frame.
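As a rough Python sketch of that scheme (the function and argument names are ours; only the reward values come from the list above), including the per-frame penalty as an optional argument that is off by default:

```python
def step_reward(clods_mowed_this_step: int,
                all_clods_mowed: bool,
                rock_touched: bool,
                per_frame_penalty: float = 0.0) -> float:
    """Compute the reward for one step using the scheme described above.

    per_frame_penalty is the small punishment per frame we plan to try;
    it defaults to 0.0 because the trained agent did not use it.
    """
    reward = 0.1 * clods_mowed_this_step   # +0.1 per clod mowed
    if all_clods_mowed:
        reward += 10.0                      # +10 when the whole field is mowed
    if rock_touched:
        reward -= 1.0                       # -1 when a rock is touched
    return reward - per_frame_penalty
```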
A mower can go forward, backward, right, and left, so the agent needs to return a vector of 2 continuous values: the first value is the speed and the second is the rotation. Examples:
- [-1, 0]: backward
- [0.5, 0.75]: forward and turning right
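For illustration, here is a minimal sketch of how that 2-value action could be turned into motion. The kinematic update, the timestep, and the `max_speed` / `max_turn_rate` scaling are assumptions; only the [speed, rotation] layout comes from the text above.

```python
import math

def apply_action(x: float, z: float, heading: float,
                 action: list, dt: float = 0.02,
                 max_speed: float = 2.0, max_turn_rate: float = 90.0):
    """Move the mower according to the [speed, rotation] action vector.

    action[0] in [-1, 1] drives forward/backward speed,
    action[1] in [-1, 1] drives the rotation (right/left).
    """
    speed, rotation = action
    heading += math.radians(max_turn_rate) * rotation * dt
    x += max_speed * speed * math.sin(heading) * dt
    z += max_speed * speed * math.cos(heading) * dt
    return x, z, heading

# Example: [-1, 0] drives straight backward, [0.5, 0.75] goes forward while turning.
print(apply_action(0.0, 0.0, 0.0, [-1, 0]))
print(apply_action(0.0, 0.0, 0.0, [0.5, 0.75]))
```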
For the training to succeed, we tuned the hyperparameters:
- max_steps = 1e7
- num_layers = 4
- buffer_size = 10000
- learning_rate = 1e-6
- hidden_units = 1024
- batch_size = 2000
Increasing the buffer_size and the batch_size was a good idea taken from this issue: Unity-Technologies/ml-agents#288
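For reference, here are those values gathered in one place as Python assignments, using the same names as the hyperparameter list above (the comments are our own interpretation of what each setting does):

```python
# Hyperparameters we ended up with for the mower (see the list above).
max_steps = 1e7          # maximum number of environment steps for the run
num_layers = 4           # number of hidden layers in the policy network
buffer_size = 10000      # experiences collected before each update
learning_rate = 1e-6     # optimizer learning rate
hidden_units = 1024      # units per hidden layer
batch_size = 2000        # experiences per gradient descent step
```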
We launched several trainings. We tuned the hyperparameters and even changed the environment to make it easier for the agent to learn. Since we were not able to run the training on a GPU, it took a long time to change something and see the result...
But we trained an agent able to mow clods and avoid rocks! Here is the mean reward during the training:
Note: the three peaks mean that the agent won the game, because when all clods are mowed we give a +10 reward to the agent.
It was a lot of fun for a first try! We thank the Unity folks who developed this new feature. We'll keep working with it even after the end of the challenge :)

What we want to try next:
- try the training with 2 or 3 rocks
- add a small penalty for each frame
- try a smaller model (fewer layers and hidden units)
- try with a final score only
- try with more agents
- try a custom model (not PPO)
- try with moving obstacles
- fix TensorBoard
- fix the training on a Linux VM with a GPU in the cloud
- try to train the model with pixels