Basic info:
This is essentially a bin packing problem environment. It is in its early stages of development but is fully operational.
There are two Python scripts in this repository:
- environment.py : The environment in which the agent exists. It implements the selected actions and calculates the reward earned.
- qlearn.py : The Q-Learning algorithm. It handles the calculation of the state-action values.
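For readers unfamiliar with Q-Learning, the state-action value calculation can be sketched roughly as below. This is a minimal illustration of the standard tabular update rule, not the code in qlearn.py: the function names (make_q_table, choose_action, q_update), the state labels, and the parameter values are all assumptions made for this example.

```python
from collections import defaultdict
import random


def make_q_table(n_actions):
    # Hypothetical helper: maps each state to a list of action values,
    # all starting at 0.0. States are created lazily on first access.
    return defaultdict(lambda: [0.0] * n_actions)


def choose_action(q_table, state, n_actions, epsilon, rng):
    # Epsilon-greedy selection: explore with probability epsilon,
    # otherwise pick the action with the highest current value.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q_table[state]
    return max(range(n_actions), key=lambda a: values[a])


def q_update(q_table, state, action, reward, next_state, alpha, gamma, done=False):
    # Standard Q-Learning update:
    # Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    # alpha is the learning rate, gamma is the discount factor.
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Example usage with made-up states and rewards.
q = make_q_table(n_actions=2)
rng = random.Random(0)
q_update(q, "s0", 1, reward=1.0, next_state="s1", alpha=0.5, gamma=0.9)
action = choose_action(q, "s0", n_actions=2, epsilon=0.0, rng=rng)
```

In this sketch the environment would supply the reward and the next state after each action, and the learning rate (alpha) and discount factor (gamma) are the two values that the TODO list mentions tuning.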
TODO:
- Enable larger state spaces.
- Allow the available actions to be determined dynamically.
- Find the best learning rate and discount factor for this problem.
- Add prioritisation.
- Add time-awareness.