Arcade-style rewards system #27

bzier · 2017-11-12T04:03:34Z

Idea for an alternate rewards system:

Old school arcade game style checkpoint system

Every step is still -1 reward

Any positive per-step reward would likely encourage delays rather than the shortest/fastest path (e.g. driving around aimlessly to accumulate reward)
Agent only has x steps to play (some reasonably small number)
Hitting a checkpoint extends play for another y steps
Checkpoints also grant a 'large' sum of reward points

Note that the checkpoint reward needs to be large enough to make progress worthwhile. Extending an episode adds more steps at -1 each, which lowers the total reward, so the checkpoint reward must more than offset that cost. Otherwise the agent may learn to maximize reward by simply avoiding the first checkpoint and never extending the episode.
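A minimal sketch of the scheme above, to make the step budget and the offset condition concrete. Everything here is an assumption for illustration: the class name `ArcadeReward`, the `hit_checkpoint` flag, and the default values for x, y, and the checkpoint reward are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class ArcadeReward:
    """Hypothetical arcade-style reward tracker (names and defaults are illustrative)."""

    initial_steps: int = 100          # x: starting step budget (assumed value)
    extension_steps: int = 50         # y: steps added per checkpoint (assumed value)
    checkpoint_reward: float = 100.0  # the 'large' checkpoint reward (assumed value)
    step_reward: float = -1.0         # every step is still -1

    def __post_init__(self):
        self.steps_left = self.initial_steps

    def reset(self):
        """Restore the step budget at the start of a new episode."""
        self.steps_left = self.initial_steps

    def step(self, hit_checkpoint: bool):
        """Return (reward, done) for a single environment step."""
        self.steps_left -= 1
        reward = self.step_reward
        if hit_checkpoint:
            # A checkpoint pays out and extends play by y steps.
            reward += self.checkpoint_reward
            self.steps_left += self.extension_steps
        done = self.steps_left <= 0
        return reward, done

    def checkpoint_is_worthwhile(self) -> bool:
        """Check that a checkpoint pays more than the extra steps can cost.

        Worst case, the agent uses all y extra steps at -1 each, so the
        checkpoint reward has to exceed y * |step_reward|.
        """
        return self.checkpoint_reward > self.extension_steps * abs(self.step_reward)
```

With these assumed defaults, a checkpoint adds 100 and the extension costs at most 50, so hitting it always beats avoiding it. If the checkpoint reward were below y, running out the initial budget without progress would score higher, which is the failure mode described in the note.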

In theory, this system should/could:

reduce overall episode lengths (by terminating early if no progress is being made), which allows for shorter/faster iterations (i.e. fail fast)
reduce getting stuck in any one place for too long
prevent driving backwards too far
still encourage forward progress with checkpoints as before


bzier · 2021-02-21T22:26:18Z

Depends on #26. I would like to see the reward functions made injectable/pluggable to facilitate experimentation with all sorts of variations. Before working on this implementation, there should be a way to easily swap out which reward function is used.
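One possible shape for a pluggable setup is a small name-to-function registry. This is only a sketch under assumptions: the `info` dict, its `hit_checkpoint` key, the registry names, and the reward values are all hypothetical, and #26 may settle on a different interface.

```python
from typing import Callable, Dict

# Hypothetical signature: a reward function maps per-step info to a float reward.
RewardFn = Callable[[dict], float]

REWARD_FUNCTIONS: Dict[str, RewardFn] = {}


def register_reward(name: str):
    """Decorator that registers a reward function under a lookup name."""
    def decorator(fn: RewardFn) -> RewardFn:
        REWARD_FUNCTIONS[name] = fn
        return fn
    return decorator


@register_reward("step_penalty")
def step_penalty(info: dict) -> float:
    # Baseline: every step is -1.
    return -1.0


@register_reward("arcade")
def arcade(info: dict) -> float:
    # -1 per step plus an assumed bonus of 100 when a checkpoint is hit.
    bonus = 100.0 if info.get("hit_checkpoint") else 0.0
    return -1.0 + bonus


def get_reward_fn(name: str) -> RewardFn:
    """Look up a reward function by name, e.g. from a config value or CLI flag."""
    return REWARD_FUNCTIONS[name]
```

The environment would then call something like `get_reward_fn("arcade")` once at construction time, so trying a new variation only means registering another function.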

bzier added area/game-mario-kart suggestion / idea reward-function labels Nov 12, 2017

bzier changed the title ~~Alternate rewards system~~ Arcade-style rewards system Nov 16, 2017

bzier mentioned this issue Nov 17, 2017

Rewards for Progress #29

Closed

bzier added this to the Milestone 4 - Extras milestone Feb 28, 2021
