Markov Decision Process (MDP) Toolbox for Python

The MDP toolbox provides classes and functions for the resolution of discrete-time Markov Decision Processes. The implemented algorithms include backwards induction, linear programming, policy iteration, Q-learning and value iteration, along with several variations.

(Note: I've made some modifications and extensions to this to fix a couple of bugs, and add MDP visualization capabilities - Andrew Rollings)

The classes and functions were developed based on the MATLAB MDP toolbox by the Biometry and Artificial Intelligence Unit of INRA Toulouse (France). There are editions available for MATLAB, GNU Octave, Scilab and R. The suite of MDP toolboxes is described in Chades I, Chapron G, Cros M-J, Garcia F & Sabbadin R (2014) 'MDPtoolbox: a multi-platform toolbox to solve stochastic dynamic programming problems', Ecography, vol. 37, no. 9, pp. 916–920, doi 10.1111/ecog.00888.

Features

  • Eight MDP algorithms implemented
  • Fast array manipulation using NumPy
  • Full sparse matrix support using SciPy's sparse package
  • Optional linear programming support using cvxopt

PLEASE NOTE: the linear programming algorithm is currently unavailable except for testing purposes due to incorrect behaviour.

Installation

NumPy and SciPy must be installed to use this toolbox; see their documentation for installation instructions. On Ubuntu or Debian with Python 2, the following command pulls in all the dependencies:

sudo apt-get install python-numpy python-scipy python-cvxopt

If you are using Python 3, cvxopt has to be compiled (pip will do this automatically). To install NumPy, SciPy and the libraries needed for a fully featured cvxopt, run:

sudo apt-get install python3-numpy python3-scipy liblapack-dev libatlas-base-dev libgsl0-dev fftw-dev libglpk-dev libdsdp-dev
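After installing, it can be handy to confirm which dependencies Python can actually see. The following is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and not part of the toolbox.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# NumPy and SciPy are required by the toolbox; cvxopt is optional.
required = missing_modules(["numpy", "scipy"])
optional = missing_modules(["cvxopt"])

if required:
    print("Missing required packages:", ", ".join(required))
else:
    print("Required packages found.")
if optional:
    print("Optional packages not found:", ", ".join(optional))
```

The check only looks the modules up without importing them, so it says nothing about whether a package was built correctly.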

The two main ways of downloading the package are from the Python Package Index or from GitHub. Both are explained below.

Python Package Index (PyPI)

The toolbox's PyPI page is https://pypi.python.org/pypi/pymdptoolbox/ and there are both zip and tar.gz archive options available that can be downloaded. However, I recommend using pip to install the toolbox if you have it available. Just type

pip install mdptoolbox-hiive

at the console and it should take care of downloading and installing everything for you.

GitHub

Clone the Git repository

git clone https://github.com/hiive/hiivemdptoolbox.git

and then follow from step two above. To learn how to use Git, I recommend reading the freely available Pro Git book, written by Scott Chacon and Ben Straub and published by Apress.

Quick Use

Start Python in your favourite way. The following example shows you how to import the module, set up an example Markov decision problem using a discount value of 0.9, solve it using the value iteration algorithm, and then check the optimal policy.

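import mdptoolbox.example
import mdptoolbox.mdp
# Build the transition (P) and reward (R) arrays of the example problem
P, R = mdptoolbox.example.forest()
# Set up value iteration with a discount value of 0.9
vi = mdptoolbox.mdp.ValueIteration(P, R, 0.9)
vi.run()
vi.policy # result is (0, 0, 0)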
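To give a feel for what value iteration does, here is a small pure-Python sketch that does not use the toolbox at all. The three-state, two-action problem is hand-written for illustration (its numbers are assumptions, not read from the toolbox), and the function `value_iteration` is a hypothetical helper, not the toolbox's implementation or stopping rule.

```python
# Transition probabilities, indexed as P[action][state][next_state].
P = [
    [[0.1, 0.9, 0.0], [0.1, 0.0, 0.9], [0.1, 0.0, 0.9]],  # action 0
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # action 1
]
# Rewards, indexed as R[state][action].
R = [[0.0, 0.0], [0.0, 1.0], [4.0, 2.0]]


def value_iteration(P, R, discount, tol=1e-8, max_iter=10_000):
    """Repeat the Bellman optimality update until the values stop changing."""
    n_actions, n_states = len(P), len(R)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Q[s][a]: immediate reward plus discounted expected future value.
        Q = [
            [
                R[s][a] + discount * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        new_V = [max(q) for q in Q]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # The greedy policy picks, in each state, the action with the largest Q.
    policy = tuple(q.index(max(q)) for q in Q)
    return V, policy


V, policy = value_iteration(P, R, 0.9)
print(policy)  # (0, 0, 0)
```

With a discount value of 0.9, action 0 is the better choice in every state of this small problem, so the greedy policy is the same in all three states.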

Documentation

Documentation is available at http://pymdptoolbox.readthedocs.org/ and also as docstrings in the module code. If you use IPython to work with the toolbox, then you can view the docstrings by using a question mark ?. For example:

import mdptoolbox
mdptoolbox?<ENTER>
mdptoolbox.mdp?<ENTER>
mdptoolbox.mdp.ValueIteration?<ENTER>

will display the relevant documentation.
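Outside IPython, the same docstrings can be read with Python's built-in `help()` or the standard `inspect` module. The sketch below assumes nothing about the toolbox: the helper `first_doc_line` is hypothetical and is demonstrated on a standard-library function so that it runs without the toolbox installed.

```python
import inspect


def first_doc_line(obj):
    """Return the first line of an object's docstring, or '' if it has none."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


# Demonstrated on a standard-library function so the sketch runs anywhere;
# with the toolbox installed, pass e.g. mdptoolbox.mdp.ValueIteration instead.
print(first_doc_line(inspect.getdoc))

# help(inspect.getdoc) would print the full documentation.
```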

Contribute

Issue Tracker: https://github.com/sawcordwell/pymdptoolbox/issues

Source Code: https://github.com/sawcordwell/pymdptoolbox

Support

Use the issue tracker.

License

The project is licensed under the BSD license. See LICENSE.txt for details.