Reinforcement Learning in Scala

This repo contains the source code for the demos to accompany my talk 'Reinforcement Learning in Scala'.

The slides are available here.

The demos are available here.

Running locally

The demos are implemented using Scala.js, so first you need to build the JavaScript:

$ sbt fastOptJS

Next, start a simple web server of your choice. I use the Python one (the command below is the Python 2 module; on Python 3, run python3 -m http.server instead):

$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...

Finally, open the site in your browser:

$ open localhost:8000

Pacman training

If you'd like to try your hand at making the Pacman agent smarter, the expected workflow looks something like this:

1. Update PacmanProblem.scala to improve the agent's state space, making it a more efficient learner.
2. Run the training harness:
   ```
   $ sbt run
   ```
   This makes the agent play games of Pacman indefinitely, until you stop the process. Every 1 million time steps it prints some stats to give an indicator of the agent's learning progress. Every 5 million time steps it writes the agent's Q-values to a JSON file in the pacman-training directory.
3. Once you have Q-values you are happy with, copy the JSON file to data/pacman/Q.json, overwriting the existing file.
4. Follow the steps above for running locally. Open the Pacman UI in your browser and watch your trained agent show those ghosts who's boss!
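
For readers unfamiliar with Q-values, here is a minimal sketch of a single tabular Q-learning update in Scala. It is an illustration only and is not taken from this repo: the object name `QLearningSketch`, the `update` signature, and the `alpha` and `gamma` parameters are all assumptions, and the actual training harness may be structured differently.

```scala
object QLearningSketch {
  // Hypothetical Q-table keyed by (state, action); missing entries count as 0.0.
  type QTable[S, A] = Map[(S, A), Double]

  // One tabular Q-learning update:
  //   Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max over a' of Q(s', a') - Q(s, a))
  def update[S, A](
      q: QTable[S, A],
      state: S,
      action: A,
      reward: Double,
      nextState: S,
      actions: Seq[A],
      alpha: Double, // learning rate (assumed parameter name)
      gamma: Double  // discount factor (assumed parameter name)
  ): QTable[S, A] = {
    require(actions.nonEmpty, "at least one action is needed")
    val current = q.getOrElse((state, action), 0.0)
    // Value of the best action available from the next state.
    val bestNext = actions
      .map(a => q.getOrElse((nextState, a), 0.0))
      .foldLeft(Double.NegativeInfinity)((m, v) => math.max(m, v))
    q.updated((state, action), current + alpha * (reward + gamma * bestNext - current))
  }

  def main(args: Array[String]): Unit = {
    val actions = Seq("left", "right")
    val q0: QTable[Int, String] = Map.empty
    val q1 = update(
      q0,
      state = 0,
      action = "right",
      reward = 1.0,
      nextState = 1,
      actions = actions,
      alpha = 0.5,
      gamma = 0.9
    )
    println(q1((0, "right"))) // prints 0.5
  }
}
```

The table gains one entry per (state, action) pair the agent visits, which is why the size of the state space determines how large the saved JSON file becomes.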

Hints

If you make your state space too large, you'll have a number of problems:

- Your JSON file will probably be huge enough to crash your browser when the UI tries to load it.
- The agent will learn very slowly because it needs to explore so many states.

So the trick is to find a way of encoding enough information about the game state without the number of states exploding. For example, if you were to track the exact locations of Pacman and both ghosts, you would already have 65 x 65 x 65 = 274,625 states to deal with.
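
To make the arithmetic concrete, the sketch below counts joint states for the exact-location encoding and for one coarser alternative. The coarser encoding (4 relative directions and 3 distance buckets per ghost) is a made-up example, not something this repo prescribes, and `StateSpaceSize` and `jointStates` are hypothetical names.

```scala
object StateSpaceSize {
  // Number of joint states when each factor independently takes the given number of values.
  def jointStates(factorSizes: Seq[Int]): Long =
    factorSizes.map(_.toLong).product

  def main(args: Array[String]): Unit = {
    // Exact locations of Pacman and both ghosts, 65 possible locations each.
    val exact = jointStates(Seq(65, 65, 65))
    // Hypothetical coarser encoding: per ghost, 4 relative directions x 3 distance buckets.
    val coarse = jointStates(Seq(4 * 3, 4 * 3))
    println(s"exact locations: $exact states") // exact locations: 274625 states
    println(s"coarse encoding: $coarse states") // coarse encoding: 144 states
  }
}
```

A coarser encoding discards information, so the saving in table size has to be weighed against what the agent can no longer distinguish.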

Your state encoding should also make sense when combined with the reward function. For example, the environment gives a reward when Pacman eats food, so intuitively the state should track food in some way.
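
As one possible way to track food without storing every food location, the sketch below reduces the food layout to the coarse direction of the nearest piece. Everything here is hypothetical (`FoodAwareState`, `Pos`, the direction names, and the assumption that y grows downwards); it is not the representation used in PacmanProblem.scala.

```scala
object FoodAwareState {
  final case class Pos(x: Int, y: Int)

  sealed trait Direction
  case object North extends Direction
  case object South extends Direction
  case object East extends Direction
  case object West extends Direction

  // Coarse direction from `from` towards `to`, preferring the axis with the larger gap.
  // Assumes screen-style coordinates where y grows downwards.
  def directionTo(from: Pos, to: Pos): Direction = {
    val dx = to.x - from.x
    val dy = to.y - from.y
    if (math.abs(dx) >= math.abs(dy)) { if (dx >= 0) East else West }
    else { if (dy >= 0) South else North }
  }

  // Direction of the nearest food by Manhattan distance, or None when no food is left.
  def nearestFoodDirection(pacman: Pos, food: Set[Pos]): Option[Direction] =
    if (food.isEmpty) None
    else {
      val nearest = food.minBy(f => math.abs(f.x - pacman.x) + math.abs(f.y - pacman.y))
      Some(directionTo(pacman, nearest))
    }

  def main(args: Array[String]): Unit = {
    val pacman = Pos(2, 2)
    val food = Set(Pos(5, 2), Pos(2, 9))
    println(nearestFoodDirection(pacman, food)) // prints Some(East)
  }
}
```

This adds at most five values (four directions plus "no food") to the state, so the agent can relate the food reward to something it observes while the state count stays small.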

If your agent is struggling to win games, you could try:

- Making the ghosts move more randomly by reducing their smartMoveProb
- Making a smaller grid, maybe with only one ghost
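
To illustrate why lowering smartMoveProb makes the game easier, here is a sketch of how such a probability could govern a ghost's move. It assumes the parameter means "probability of taking the smart move rather than a random one"; the `GhostMoveSketch` object and `chooseMove` function are hypothetical, and the repo's actual ghost logic may differ.

```scala
import scala.util.Random

object GhostMoveSketch {
  // With probability smartMoveProb return the "smart" move,
  // otherwise pick uniformly at random among the legal moves.
  def chooseMove[M](smartMove: M, legalMoves: Vector[M], smartMoveProb: Double, rng: Random): M =
    if (rng.nextDouble() < smartMoveProb) smartMove
    else legalMoves(rng.nextInt(legalMoves.size))

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val moves = Vector("up", "down", "left", "right")
    // Pretend "left" is the move that heads towards Pacman.
    for (p <- Seq(0.9, 0.5, 0.1)) {
      val smartCount = (1 to 10000).count(_ => chooseMove("left", moves, p, rng) == "left")
      println(f"smartMoveProb=$p%.1f -> moved towards Pacman in ${smartCount / 100.0}%.1f%% of steps")
    }
  }
}
```

Note that a random pick can still happen to be the smart move, so in this sketch the ghost heads towards Pacman slightly more often than smartMoveProb alone suggests.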
