Contributors:
Zeming Fang, [email protected]
Ru-Yuan Zhang, [email protected]
Jieying Zhang
- Dynamic programming
- Value iteration
- Policy iteration
- Temporal difference learning
- TD-learning
- DQN
- Model-based planning
- Monte Carlo Tree Search (MCTS)
- Policy gradient
- REINFORCE