Akash Kwatra and Lucas Kabela
Using reinforcement learning for the game of Hearts. <3
This OpenAI Gym environment is a slightly modified version of https://github.com/zmcx16/OpenAI-Gym-Hearts.
The following base packages are required to run the repository:
- Python - 3.6+
- Gym - 0.15.4+
- Numpy - 1.16.4+
- PyTorch - Latest (1.2.0+)
- TensorBoard - 2.0+
- TQDM - 4.32.1+
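All of the above are available from PyPI, so an installation along the lines of `pip install "gym>=0.15.4" numpy torch tensorboard tqdm` should cover them (package names assumed to match their PyPI equivalents).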
This repository contains code for running repeatable experiments on the utility of various subsets of the raw state features for value approximation. We provide four notebooks for running these experiments:
- `run_hearts.ipynb` - Executable for the grader/reader; it showcases the performance of our best models. Run it from the root directory of the repo with `jupyter notebook run_hearts.ipynb`, and make sure the requirements are installed first (a conda installation may be easiest).
- `simple.ipynb` - Trains and tests linear value function approximation using Monte Carlo rollouts (the core update is sketched after this list).
- `mlp.ipynb` - Trains and tests nonlinear value function approximation with a neural network, using Monte Carlo rollouts and a configurable feature set.
- `reinforce.ipynb` - Trains and tests a multi-layer perceptron network using the policy gradient algorithm REINFORCE.
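For orientation, the Monte Carlo notebooks are built around the gradient Monte Carlo update for value approximation. Below is a minimal sketch of that update for the linear case; the function and variable names are ours for illustration and are not the repo's exact code.

```python
import numpy as np

def mc_linear_update(w, episode, alpha=0.01, gamma=1.0):
    """Gradient Monte Carlo update for a linear value approximator.

    w       : weight vector; the value estimate is v(s) = w @ x(s)
    episode : list of (feature_vector, reward) pairs in time order
    """
    G = 0.0
    # Walk the episode backwards, accumulating the return G_t, then
    # nudge the estimate v(s_t) = w @ x_t toward the observed return.
    for x, r in reversed(episode):
        G = r + gamma * G
        w = w + alpha * (G - w @ x) * x
    return w
```

`mlp.ipynb` applies the same Monte Carlo target with a neural network in place of the dot product.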
```
.
├── gymhearts              # Agents and environment
│   ├── Agent
│   └── Hearts
├── model_zoo              # Saved trained models from experiments
│   ├── feature_study_models
│   ├── linear_v_nonlinear_models
│   └── policy_grad_models
├── writeup                # Paper and data from experiments
├── LICENSE
└── README.md
```
**gymhearts**: Contains the OpenAI Gym environment code as well as the logic for the various agents implemented.
**Agent**: Contains a variety of agents for playing the game of Hearts, including human players, a linear value approximation agent, a nonlinear value approximation agent, policy gradient agents, and a random agent that serves as a baseline for comparison and training.
**Hearts**: Contains the code for the game environment (see https://github.com/zmcx16/OpenAI-Gym-Hearts), with minor modifications to environment rendering and valid moves.
**model_zoo**: Directory containing the saved models from experiments. These can be loaded and evaluated using the notebooks provided.
**feature_study_models**: Models from the study on how combinations of raw features affect agent performance. The digits appended to each file name indicate which features were used, indexing (one-based) into:

[in_hand, in_play, played_cards, won_cards, scores]
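As a concrete illustration of this convention (the helper below is ours, not part of the repo), a file suffixed with `135` would correspond to the in_hand, played_cards, and scores features:

```python
# One-based digit -> raw feature name, per the convention above.
FEATURES = ["in_hand", "in_play", "played_cards", "won_cards", "scores"]

def decode_feature_digits(digits):
    """Map a filename suffix such as '135' to its raw feature subset."""
    return [FEATURES[int(d) - 1] for d in digits]

print(decode_feature_digits("135"))  # ['in_hand', 'played_cards', 'scores']
```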
**linear_v_nonlinear_models**: Models trained using only the in_hand set of raw features, for both a linear model (simple dot product) and a nonlinear model (neural network). This folder contains the weights for the linear model, as well as the saved model for the neural network, trained for 10,000 epochs.
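As a rough sketch of how these artifacts might be restored outside the notebooks (the file names below are placeholders; the notebooks contain the authoritative loading code):

```python
import numpy as np
import torch

# Placeholder file names -- substitute the actual files under model_zoo/.
w = np.load("model_zoo/linear_v_nonlinear_models/linear_weights.npy")
mlp = torch.load("model_zoo/linear_v_nonlinear_models/mlp.pt")
mlp.eval()  # evaluation mode: disables dropout/batch-norm updates

# The linear model's value estimate is then a dot product, v(s) = w @ x(s).
```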
**policy_grad_models**: Stores the models trained using the REINFORCE algorithm, following the naming convention of feature_study_models, as well as the models from our learning-rate study of the REINFORCE-with-baseline method.
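For readers unfamiliar with the method, the core of a REINFORCE-with-baseline update looks roughly like the sketch below; the names are illustrative, and the repo's actual implementation lives in the Agent folder and `reinforce.ipynb`.

```python
import torch

def reinforce_with_baseline_loss(log_probs, returns, baselines):
    """Policy loss for one episode of REINFORCE with a baseline.

    log_probs : log pi(a_t | s_t) per step, shape (T,)
    returns   : Monte Carlo returns G_t per step, shape (T,)
    baselines : baseline values b(s_t), e.g. a learned state value, shape (T,)
    """
    # Subtracting a baseline reduces gradient variance without adding bias;
    # detach() keeps this loss from backpropagating into the baseline.
    advantages = returns - baselines.detach()
    return -(log_probs * advantages).sum()
```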
Stores the best model from our simple Monte Carlo function approximator.
**writeup**: This folder contains a report on the construction and findings of our experiments.
This project is licensed under the terms of the MIT license.