The reward function is now very simple: it only reflects winning or losing.
Additional reward signals are stored in the info dict and can be used, for example, to shape the reward.
The initialization conditions have changed with respect to Version 1.
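
One way the additional signals in the info dict might be combined with the win/lose reward is sketched below. This is only an illustration: the key names `signal_a` and `signal_b`, the weights, and the helper `shaped_reward` are hypothetical and not part of the environment, so check the actual info dict for the keys it provides.

```python
from typing import Mapping


def shaped_reward(
    reward: float,
    info: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Add weighted auxiliary signals from `info` to the sparse reward.

    Keys listed in `weights` but missing from `info` contribute nothing.
    """
    bonus = sum(w * float(info.get(key, 0.0)) for key, w in weights.items())
    return reward + bonus


# Hypothetical info dict and weights; the real key names may differ.
info = {"signal_a": -0.5, "signal_b": 1.0}
weights = {"signal_a": 0.1, "signal_b": 0.2}

# The sparse reward is passed through and the weighted signals are added.
total = shaped_reward(0.0, info, weights)
```

Keeping the weights in a separate mapping makes it easy to switch individual signals off (weight 0) and fall back to the plain win/lose reward.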