Novelty Search #16
Comments
PyTorch-NEAT has no Novelty Search implementation. Just follow the roadmap above.
I found a few resources that have to do with loading the x-y coordinates from a Retro "movie" -
Look at the sonicNEAT repo. I'm fairly sure it references the x coordinates at least.
I think I've found an applicable solution. Examine ppo2ttifrutti_sonic_env.py's
The purpose of these calls is to load an environment that fits with the specifications assigned by all these classes.
If we write a command
We know that
From this, we can infer that there will be nothing added to In short, we may get the x and y coordinates if we simply call
Should also note that I don't think y plays any role in how the reward is calculated. Seems to be much more of an x thing given
This is all useful information in general, but for the sake of this particular issue we just want to focus on logging the x/y coordinates over time. Do a run where you log all x/y coordinates to a separate file for each episode. Then you can plot those files using Excel or gnuplot with each file in a different color, and see the movement paths in the level. This will be a good visual confirmation that this works. Once we have that, we can think about how to make a uniform behavior characterization for use by Novelty Search.
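The per-episode logging described above could be sketched roughly as follows. This is only an illustration: the function name `log_episode_positions`, the file naming scheme, and the assumption that each step's `info` dict carries the position under the keys `"x"` and `"y"` are all hypothetical and should be checked against the actual environment.

```python
import csv
import os


def log_episode_positions(infos, episode, out_dir):
    """Write the (x, y) position at every timestep of one episode to its own CSV file."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"episode_{episode:04d}.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "x", "y"])
        for t, info in enumerate(infos):
            # Assumption: the info dict exposes the position as "x" and "y".
            writer.writerow([t, info["x"], info["y"]])
    return path
```

Collecting the `info` dict returned at each step into a list and passing it in at the end of the episode gives one file per episode, which can then be loaded into Excel or gnuplot as separate series.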
I worked on this a little more and managed to get a solution dependent on the initial x and y coordinates working, but there were issues:
The best course of action seems to be to run this on an agent that has been trained considerably (say, after 300 or so updates) and to add a condition that prevents CSV files from being created during the first few timesteps (except, of course, the initial one).
You're currently plotting x and y separately with time implicitly on the x axis. What I want is a plot of (x,y). Basically, the plot should depict Sonic's travel path, and the line can move forward and backward. If you want to track time, you can have the color intensity of the line change across time, though this would just be a bonus.
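One way to get such a path plot without any plotting library is to draw the trajectory as SVG line segments whose shade darkens over time. The sketch below is a minimal illustration; `path_to_svg` is a hypothetical helper, and it assumes the game's y coordinate grows downward (as SVG's does), so no vertical flip is applied.

```python
def path_to_svg(points, width=400, height=200, pad=10):
    """Render an (x, y) travel path as SVG segments that go from light to dark over time."""
    if len(points) < 2:
        raise ValueError("need at least two points to draw a path")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1  # avoid division by zero for a vertical path
    span_y = (max(ys) - min_y) or 1  # avoid division by zero for a horizontal path

    def to_px(x, y):
        # Scale level coordinates into the drawing area.
        px = pad + (x - min_x) / span_x * (width - 2 * pad)
        py = pad + (y - min_y) / span_y * (height - 2 * pad)
        return px, py

    n_seg = len(points) - 1
    lines = []
    for i in range(n_seg):
        x1, y1 = to_px(*points[i])
        x2, y2 = to_px(*points[i + 1])
        # Time is encoded as intensity: early segments light blue, late ones dark blue.
        t = i / max(n_seg - 1, 1)
        shade = int(round(200 * (1 - t)))
        lines.append(
            f'<line x1="{x1:.1f}" y1="{y1:.1f}" x2="{x2:.1f}" y2="{y2:.1f}" '
            f'stroke="rgb({shade},{shade},255)" stroke-width="2"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(lines)
        + "</svg>"
    )
```

Because each segment is drawn between consecutive (x, y) samples, backtracking shows up as the line doubling back on itself rather than being hidden behind a time axis. The returned string can be written to a `.svg` file and opened in a browser.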
The behavior characterization developed here can be used in our code for #24. Attention should be focused on that issue instead of this one: we will implement Behavioral Diversity instead of Novelty Search. So, I'm going to close this issue.
Looking at how Sonic fails and also reading up some of the reports on the various attempts at the Sonic domain, I really think that Novelty Search ( https://www.cs.ucf.edu/eplex/noveltysearch/userspage/ , https://arxiv.org/abs/1712.06560 ) could be beneficial in this domain. In particular, the reward is very deceptive. Sonic is rewarded for moving to the right, but many interesting levels require significant backtracking.
However, typical Novelty Search as implemented on small neural networks is not scalable. What we need is Novelty Search with Deep Networks, which is the code that we've already had trouble running on Windows: https://github.com/uber-research/deep-neuroevolution
However, there are many implementations of Novelty Search out there, and we may not need to use this one. In particular, all Novelty Search is really doing is changing the fitness function, but for Deep Neuroevolution to be successful, people typically use CMA-ES. In fact, one of the contest entrants used CMA-ES combined with a large variety of complicated other stuff:
https://github.com/dylandjian/retro-contest-sonic
If we remove the complicated World Models aspect from that code and just run the CMA-ES part, but then replace the fitness function with a Novelty based one, then we might create an agent that can move left sometimes.
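To make "replace the fitness function with a Novelty based one" concrete, here is a minimal sketch of the sparseness measure commonly used in Novelty Search: the mean distance from an agent's behavior to its k nearest neighbors among previously seen behaviors. The behavior characterization used in the example (a final `(x, y)` position), the function name `novelty`, and the default `k=3` are assumptions for illustration only; nothing here is taken from the CMA-ES code linked above.

```python
import math


def novelty(behavior, others, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest neighbours in `others`."""
    if not others:
        # Convention chosen for this sketch: nothing to compare against means zero novelty.
        return 0.0
    dists = sorted(math.dist(behavior, o) for o in others)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)
```

Under this score, an agent that ends an episode far from where the rest of the population and archive ended up (for example, further left) gets a higher value than one that finishes in a crowded spot, regardless of how far right it travelled. The optimizer would then maximize this value in place of the game reward.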
Here are several bite-sized sub-tasks associated with this issue that I would like you to tackle: