add the general model for best of three game #78

Open
wants to merge 1 commit into
base: master

wanjeans33

Hi Lin,

I'm a longtime fan of yours. I used your project for a group assignment this time, so thank you for sharing it!

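Building on your work, I ran a few experiments to train an AI that can reliably win an entire match. I first tried training on randomly selected first and second rounds, but the results were not great. Possibly because the enemy's starting conditions vary too much, the final model (10M steps) only reached a 58% win rate.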

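I then tried training on the entire best-of-three match and reworked the done condition in the step logic. I added self.jump and self.round_end, which are used to skip the transitions between rounds and to record whether a round has ended. After roughly 5M steps the reward had largely converged. In testing, the model reached a 98% win rate. Very exciting!!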
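For readers who want a picture of the idea, here is a minimal sketch of a best-of-three done condition. It is not the code in this PR: only the names self.jump and self.round_end come from the description above, and their exact roles here (a countdown of transition frames and a per-step flag) are my assumption. The class names, the `agent_hp`/`enemy_hp` info keys, the "HP at or below zero ends a round" rule, the reward values, and the scripted stand-in environment are all hypothetical, and the sketch uses the legacy four-value `step` return.

```python
class ScriptedEnv:
    """Hypothetical stand-in for the emulator: replays (agent_hp, enemy_hp) frames."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.frames[0]

    def step(self, action):
        # Advance one frame, staying on the last frame once the script runs out.
        self.t = min(self.t + 1, len(self.frames) - 1)
        agent_hp, enemy_hp = self.frames[self.t]
        info = {"agent_hp": agent_hp, "enemy_hp": enemy_hp}
        return self.frames[self.t], 0.0, False, info


class BestOfThreeWrapper:
    """Sketch: the episode ends only when one side has won two rounds."""

    def __init__(self, env, transition_frames=2, noop_action=0):
        self.env = env
        self.transition_frames = transition_frames  # assumed length of a transition
        self.noop_action = noop_action              # assumed "do nothing" action
        self._reset_counters()

    def _reset_counters(self):
        self.jump = 0            # transition frames still to skip (assumed meaning)
        self.round_end = False   # did the latest step end a round? (assumed meaning)
        self.agent_wins = 0
        self.enemy_wins = 0

    def reset(self):
        self._reset_counters()
        return self.env.reset()

    def step(self, action):
        self.round_end = False
        obs, _, _, info = self.env.step(action)
        reward = 0.0

        # Assumption: a round is over once either health value reaches zero.
        if info["agent_hp"] <= 0 or info["enemy_hp"] <= 0:
            self.round_end = True
            if info["enemy_hp"] <= 0 and info["agent_hp"] > 0:
                self.agent_wins += 1
                reward = 1.0
            else:
                # A double knockout is counted as a loss in this sketch.
                self.enemy_wins += 1
                reward = -1.0

        # The match, not the round, decides when the episode is done.
        done = self.agent_wins == 2 or self.enemy_wins == 2

        # Skip the between-round transition so the agent never acts during it.
        if self.round_end and not done:
            self.jump = self.transition_frames
            while self.jump > 0:
                obs, _, _, info = self.env.step(self.noop_action)
                self.jump -= 1

        info = dict(info, round_end=self.round_end,
                    agent_wins=self.agent_wins, enemy_wins=self.enemy_wins)
        return obs, reward, done, info


def demo_match():
    """Play a scripted win / loss / win match and return (reward, done) per step."""
    frames = [
        (176, 176),              # reset frame
        (120, 0),                # agent wins round 1
        (120, 0), (176, 176),    # transition frames (skipped by the wrapper)
        (0, 90),                 # enemy wins round 2
        (0, 90), (176, 176),     # transition frames (skipped by the wrapper)
        (50, 0),                 # agent wins round 3, match over
    ]
    env = BestOfThreeWrapper(ScriptedEnv(frames))
    env.reset()
    history, done = [], False
    while not done:
        _, reward, done, _ = env.step(0)
        history.append((reward, done))
    return history
```

In this sketch the agent only ever sees three decision steps for the scripted match, because the transition frames are consumed inside `step`. Whether the real wrapper skips transitions inside `step` or across several calls is not stated in the description, so treat that as a design choice of the sketch.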

Below are my TensorBoard training results. The blue line is the random-round training run and the purple line is the entire-match training run.
[Image: train_result_tensorboard]

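This pull request contains the code and results for the general (best-of-three) model. For the random-round approach, I am still tuning the reward function to get faster learning, so I am not uploading it for now. I hope you can merge my results so everyone can share them.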

Wishing you all the best!
Jing WANG
