Planned Improvements
Clean up info passing
    Actual class or at least abstract class
Have goal not in corner
    Maybe one or two away
    Agents wouldn't be scared off by hitting walls near goal
    Agents wouldn't be able to crawl along walls
Time (see the sketch after this block)
    Punish slightly for taking time
    Adjust exploration rate based on time, maybe time since positive reward
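    A minimal sketch of both time ideas; TimeAwareAgent, STEP_PENALTY, and the growth rate 0.001 are illustrative assumptions, not existing code:

        // Small per-step penalty, plus exploration that grows the longer
        // the agent goes without a positive reward. All names are assumed.
        public class TimeAwareAgent {
            private static final double STEP_PENALTY = 0.01;  // cost per move
            private static final double BASE_EPSILON = 0.05;  // exploration floor
            private int stepsSincePositiveReward = 0;

            // Call once per move with the raw reward for the square reached.
            public double shapeReward(double rawReward) {
                if (rawReward > 0) {
                    stepsSincePositiveReward = 0;
                } else {
                    stepsSincePositiveReward++;
                }
                return rawReward - STEP_PENALTY;  // punish slightly for taking time
            }

            // Exploration rate climbs with time since the last positive reward,
            // capped at 1.0 so it stays a valid probability.
            public double explorationRate() {
                return Math.min(1.0, BASE_EPSILON + 0.001 * stepsSincePositiveReward);
            }
        }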
Organize Planned Improvements
const double correctionFactor for all agents (see the sketch below)
    That way rewards for each square only have to be relative to each other
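    A minimal sketch; in Java, const double becomes static final. The value 0.1 and the Agent base class are assumptions:

        // One shared scaling constant applied to every reward, so per-square
        // rewards only need to be correct relative to each other.
        public abstract class Agent {
            protected static final double CORRECTION_FACTOR = 0.1;  // assumed value

            protected double scaledReward(double relativeReward) {
                return CORRECTION_FACTOR * relativeReward;
            }
        }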
Make default map actually default map
    Maybe only update map when scrollableList changes
        Maybe temporary solution
Database system for square types
Breadcrumbs that are removable
Let LevelEditor actually EDIT levels, instead of just creating them
Document file format rules
Implement better copying system for GameMap and LevelEditor
Consider changing how the size of a map is measured (count the mandatory edge walls?)
Make sure 0 reward never matters
New, more interesting maps
Copy Minds
Model that uses past move or two to decide (see the sketch below)
    Make sure it knows whether it's hit a wall
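    A minimal sketch of the state such a model could decide from; the Move enum and all field names are assumptions:

        // Folds the last move or two, plus a hit-a-wall flag, into the key
        // the learner indexes by. Names are illustrative, not existing code.
        public class MoveHistoryState {
            enum Move { UP, DOWN, LEFT, RIGHT }

            private Move lastMove;
            private Move secondLastMove;
            private boolean lastMoveHitWall;

            public void record(Move move, boolean hitWall) {
                secondLastMove = lastMove;
                lastMove = move;
                lastMoveHitWall = hitWall;  // so the model knows it hit a wall
            }

            // Compact key a tabular learner can look its values up by.
            public String key() {
                return lastMove + "," + secondLastMove + "," + lastMoveHitWall;
            }
        }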
Bring back SpeedDemon-like system (see the sketch after this block)
    Save changes as working model in part of memory
    Only update for real at end if better
    Separate current situation (walls) from global training
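    A minimal sketch of the save-and-commit idea, assuming the model is just an array of weights; the names and the scoring hook are guesses:

        // Train on a scratch copy of the weights; commit only if the episode
        // beat the best score so far. The double[] layout is an assumption.
        public class WorkingModel {
            private double[] committed;   // the "real" model
            private double[] working;     // scratch copy trained each episode
            private double bestScore = Double.NEGATIVE_INFINITY;

            public WorkingModel(int size) {
                committed = new double[size];
                working = committed.clone();
            }

            public void startEpisode() {
                working = committed.clone();  // fresh scratch copy
            }

            public double[] weights() {
                return working;  // mid-episode updates only touch the copy
            }

            public void endEpisode(double score) {
                if (score > bestScore) {      // only update for real if better
                    bestScore = score;
                    committed = working.clone();
                }
            }
        }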
Model with understanding of position
Model with understanding of current and surrounding squares
    Maybe use binary stuff before neural network
Model that can stay
    To help build towards Labyrinth AI
    To demonstrate wireheading
    Include the allowed moves in constructors
Model that knows opposites (see the sketch after this block)
    e.g. reinforcing right positively reinforces left negatively
    Oddly enough, it seems to do that on its own: when it hits a wall, it already careens in the other direction
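    A minimal sketch of explicit opposite coupling; the 0.5 coupling strength and the value table are assumptions, and Move is the same kind of direction enum as in the earlier sketch:

        // Reinforcing a direction nudges its opposite the other way.
        import java.util.EnumMap;
        import java.util.Map;

        public class OppositeAwareLearner {
            enum Move { UP, DOWN, LEFT, RIGHT }

            private final Map<Move, Double> values = new EnumMap<>(Move.class);

            public void reinforce(Move move, double reward) {
                values.merge(move, reward, Double::sum);
                // Half-strength push in the opposite direction (assumed ratio).
                values.merge(opposite(move), -0.5 * reward, Double::sum);
            }

            private static Move opposite(Move m) {
                switch (m) {
                    case UP:    return Move.DOWN;
                    case DOWN:  return Move.UP;
                    case LEFT:  return Move.RIGHT;
                    default:    return Move.LEFT;
                }
            }
        }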
One model class that can act as any of the different agents with only parameters changed (see the sketch after this block)
    Like ExponentialLearner is ExponentialWithDecay with decay=1
    Would clean up code substantially
    Maybe put agents in a subfolder?
    Really, only linear vs. exponential is a major difference; the rest is minor
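    A minimal sketch of the consolidation; ExponentialLearner and ExponentialWithDecay are the classes named above, but the constructor shape and field names are guesses:

        // One parameterized class instead of near-duplicate agents.
        public class ExponentialWithDecay {
            private final double decay;
            private double learningRate;

            public ExponentialWithDecay(double learningRate, double decay) {
                this.learningRate = learningRate;
                this.decay = decay;
            }

            public void endStep() {
                learningRate *= decay;  // decay = 1 leaves the rate untouched
            }

            // The old ExponentialLearner is just the decay = 1 special case.
            public static ExponentialWithDecay exponentialLearner(double rate) {
                return new ExponentialWithDecay(rate, 1.0);
            }
        }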
Model that can tune decay, correction factor, etc. based on how it's improving
    "Introspective"
"Game" that lets the player see what the thing agent sees
Visualize what is avaiable to get rewarded or punished
Histograms for the directions
Path (fading) along map
Make comparison to automata
Two agents running side by side for easy comparison
Slider for animation speed
Checkpoints
    Can be datapoints: agent knows what the last checkpoint was and can change behavior accordingly
Better info display system
    Change toString to use scientific notation (see the sketch after this block)
    Implement toString in KnowsLastMove
    Maybe save good info to a file, so I don't have to run it every time
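    A minimal sketch of the toString change, assuming the interesting fields are doubles; the field names and example values are made up:

        public class ScientificToString {
            private final double weight = 3.0e-7;  // assumed example values
            private final double decay = 0.999;

            // %.3e prints scientific notation with three digits of precision.
            @Override
            public String toString() {
                return String.format("weight=%.3e, decay=%.3e", weight, decay);
            }
        }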
Pure punishment
Document properly
    Comments
    Modularity
Fix strings in map (don't be lazy)
Make softReset public and remove from reward
Different classes for menu and game
Redo checkMenu() in ReinforcementLearner to only reset if an actual change occurred (see the sketch below)
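    A minimal sketch of the change check; representing the menu state as a String and passing the reset as a Runnable are stand-ins for whatever ReinforcementLearner really uses:

        // Only reset when the menu selection actually changed.
        public class MenuWatcher {
            private String lastApplied = "";

            public void checkMenu(String currentSelection, Runnable reset) {
                if (!currentSelection.equals(lastApplied)) {
                    lastApplied = currentSelection;
                    reset.run();  // a real change occurred; reset now
                }
            }
        }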
Metric for success of different agents
    Expected value of model
    Time to reach optimal model
Handle overflows in RP2
Find interesting seeds
Pillbug demonstration
    Give orientation
Make boxes so small as to appear continuous
    Maybe actually make continuous
Add reset for all agents
Make menu look better
Ecosystem? See notebook
Multiple agents?
Multiple starts? Disorientation, prevent memorization