Thanks for your implementation, I have a question:
Aren't tournament games always identical, since you set tau to 0 (fully deterministic play) and reset the MCTS tree each episode in the playMatches function's for loop? Each time, the players start with zero knowledge (an empty tree), run a deterministic search, and then act deterministically. With no stochasticity, the games should look exactly the same. Am I missing something?
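To make the determinature concern concrete, here is a minimal sketch of temperature-based action selection from root visit counts (the names `select_action` and `visit_counts` are hypothetical, not this repo's actual API): with tau = 0 the move is a pure argmax, so identical trees yield identical moves, while tau > 0 samples from pi(a) proportional to N(a)^(1/tau), which is where self-play gets its stochasticity.

```python
import numpy as np

def select_action(visit_counts, tau):
    """Pick an action from MCTS root visit counts (illustrative sketch).

    tau = 0  -> deterministic argmax over visit counts; identical trees
                therefore produce identical moves.
    tau > 0  -> sample from pi(a) proportional to N(a)**(1/tau), which
                injects stochasticity into play.
    """
    counts = np.asarray(visit_counts, dtype=np.float64)
    if tau == 0:
        return int(np.argmax(counts))
    pi = counts ** (1.0 / tau)
    pi /= pi.sum()
    return int(np.random.choice(len(counts), p=pi))
```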
I also wonder how DeepMind did it in the original paper (they played 400 games per tournament). Do you have any insight? The Stanford implementation doesn't reset the MCTS tree between tournament games, so the players accumulate knowledge and hence (possibly) play different, more informed games each time. This makes sense.
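The reset-vs-persist difference can be demonstrated with a toy deterministic player (hypothetical classes, not any repo's real code): its move choice is a fixed function of its accumulated statistics, mimicking tau = 0 play. With a fresh tree every game, all games are identical; with one tree shared across games, the statistics shift the argmax and the games diverge.

```python
import math

class UCBPlayer:
    """Toy deterministic player: picks the argmax of a UCB-style score
    over its own accumulated visit counts (illustration only)."""
    def __init__(self, c=1.4):
        self.c = c
        self.n = {}  # (state, action) -> visit count

    def move(self, state, actions):
        total = sum(self.n.get((state, a), 0) for a in actions) + 1
        def score(a):
            na = self.n.get((state, a), 0)
            # Exploration bonus shrinks as an action is visited more;
            # the tiny -a term breaks ties deterministically.
            return self.c * math.sqrt(math.log(total + 1) / (na + 1)) - a * 1e-9
        best = max(actions, key=score)
        self.n[(state, best)] = self.n.get((state, best), 0) + 1
        return best

def play_game(player):
    """Play a fixed-length toy 'game', returning the move sequence."""
    state, moves = 0, []
    for _ in range(3):
        a = player.move(state, [0, 1, 2])
        moves.append(a)
        state = state * 3 + a + 1
    return tuple(moves)

# Fresh player per game (tree reset): every game comes out identical.
games_reset = {play_game(UCBPlayer()) for _ in range(5)}

# One persistent player (tree kept): accumulated counts change the
# deterministic argmax, so later games differ from earlier ones.
shared = UCBPlayer()
games_persist = {play_game(shared) for _ in range(5)}
```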
Thanks for your attention!