Commit

Merge pull request #62 from sdu-cfei/issue_57_parallelize
Parallelize GA + add FMPy
filokot authored Oct 22, 2020
2 parents b2e0ac7 + b6b70ef commit d576cfd
Showing 54 changed files with 1,840 additions and 1,335 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,4 @@
venv/
.idea/
*.ipynb
*.pyc
@@ -6,6 +7,7 @@
**/workdir/**
build/
.vscode/
*.swp

# Setuptools distribution folder.
dist/
21 changes: 13 additions & 8 deletions CHANGES.txt
@@ -1,36 +1,41 @@
Changes in v. 0.0.9:
Changes in v.0.1
====================
- parallel genetic algorithm added (based on modestga)
- FMPy instead of pyFMI

Changes in v.0.0.9:
====================
- it is possible now to estimate just 1 parameter (fixed bug in plot_pop_evo())

Changes in v. 0.0.8:
Changes in v.0.0.8:
====================
- Version used in the ModestPy paper
- Added interface to SciPy algorithms

Changes in v. 0.0.7:
Changes in v.0.0.7:
====================
- added SQP method
- modified interface of the Estimation class to facilitate multi-algorithm pipelines

Changes in v. 0.0.6:
Changes in v.0.0.6:
====================
- LHS initialization of GA
- random seed
- many small bug fixes

Changes in v. 0.0.5:
Changes in v.0.0.5:
====================
- Decreased tolerance of CVode solver in PyFMI

Changes in v. 0.0.4:
Changes in v.0.0.4:
====================
- New pattern search plot (parameter evolution) added to Estimation.py
- GA/PS default parameters tuned

Changes in v. 0.0.3:
Changes in v.0.0.3:
====================
- Tolerance criteria for GA and PS exposed in the Estimation API.

Changes in v. 0.0.2:
Changes in v.0.0.2:
====================
- Estimation class imported directly in __init__.py to allow imports like "from modestpy import Estimation".
85 changes: 28 additions & 57 deletions README.rst
@@ -14,67 +14,33 @@ Features:

- combination of global and local search methods (genetic algorithm, pattern search, truncated Newton method, L-BFGS-B, sequential least squares),
- suitable also for non-continuous and non-differentiable models,
- compatible with both Python 2.7 and 3 (tested up to 3.5).
- scalable to multiple cores (genetic algorithm from `modestga <https://github.com/krzysztofarendt/modestga>`_),
- Python 3.

Installation with conda (recommended)
-------------------------------------
Installation with pip (recommended)
-----------------------------------

It is now possible to install ModestPy through ``conda``:
It is now possible to install ModestPy with a single command:

::

conda config --add channels conda-forge
conda install modestpy

Installation with conda and pip
-------------------------------

This procedure has been tested on Debian 9 and Ubuntu 16.04 with Python 3.

It is advised to use ``conda`` to install the required dependencies.
``modestpy`` itself can be installed using ``pip`` inside the ``conda`` environment.

Create separate environment (optional):

::

conda create --name modestpy
conda activate modestpy

Install dependencies:

::
pip install modestpy

conda install scipy pandas numpy matplotlib
conda install -c chria pyfmi
conda install -c conda-forge pydoe

Install ``modestpy``:
Alternatively:

::

python -m pip install modestpy

Installation with pip
---------------------

This procedure has been tested on Windows 7 with Python 2.

Install ``pyfmi`` as part of `JModelica <http://www.jmodelica.org/>`__.

To install ``modestpy`` use ``pip`` (other dependencies will be installed automatically):

::
pip install https://github.com/sdu-cfei/modest-py/archive/master.zip

python -m pip install modestpy
Installation with conda
-----------------------

To get the latest development version download directly from GitHub repository:
Installation with conda is less frequently tested, but should work:

::

python -m pip install https://github.com/sdu-cfei/modest-py/archive/master.zip

Note, that JModelica installs Python and libraries in a separate directory than the standard Python distribution. Therefore either the path to those libraries needs to be added to PYTHONPATH or ModestPy needs to be installed inside the JModelica distribution.
conda config --add channels conda-forge
conda install modestpy

Test your installation
----------------------
@@ -98,20 +64,25 @@ Usage
-----

Users are supposed to call only the high level API included in
``modestpy.Estimation``. The API is fully discussed in `this
wiki <https://github.com/sdu-cfei/modest-py/wiki/modestpy-API>`__. You
can also check out this `simple example </examples/simple>`__. The basic
usage is as follows:
``modestpy.Estimation``. The API is fully discussed in the `docs <docs/documentation.md>`__.
You can also check out this `simple example </examples/simple>`__.
The basic usage is as follows:

.. code:: python
>>> from modestpy import Estimation
>>> session = Estimation(workdir, fmu_path, inp, known, est, ideal)
>>> estimates = session.estimate()
>>> err, res = session.validate()
from modestpy import Estimation
if __name__ == "__main__":
    session = Estimation(workdir, fmu_path, inp, known, est, ideal)
    estimates = session.estimate()
    err, res = session.validate()
More control is possible via optional arguments, as discussed in the `documentation
<docs/documentation.md>`__.

More control is possible via optional arguments, as discussed in the `documentation
<https://github.com/sdu-cfei/modest-py/wiki/modestpy-API>`__.
The ``if __name__ == "__main__":`` wrapper is needed on Windows, because ``modestpy``
relies on ``multiprocessing``. You can find more explanation on why this is needed
`here <https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming>`__.

``modestpy`` automatically saves results in the working
directory including csv files with estimates and some useful plots,
5 changes: 4 additions & 1 deletion bin/test.py
@@ -1,4 +1,7 @@
#!/usr/bin/env python
from modestpy.test import run
from modestpy.loginit import config_logger

run.tests()
if __name__ == "__main__":
    config_logger(filename='unit_tests.log', level='DEBUG')
    run.tests()
161 changes: 161 additions & 0 deletions docs/documentation.md
@@ -0,0 +1,161 @@
# modestpy
## Introduction

Users are supposed to use only the `modestpy.Estimation` class and its two
methods, `estimate()` and `validate()`. The class defines a single interface
for different optimization algorithms. Currently, the available algorithms are:
- parallel genetic algorithm (MODESTGA) - recommended,
- legacy single-process genetic algorithm (GA),
- pattern search (PS),
- SciPy solvers (e.g. 'TNC', 'L-BFGS-B', 'SLSQP').

The methods can be used in a sequence, e.g. MODESTGA+PS (default),
using the argument `methods`. All estimation settings are set during instantiation.
Results of estimation and validation are saved in the working directory `workdir`
(it must exist).

## Learn by examples

First define the following variables:

* `workdir` (`str`) - path to the working directory (it must exist)
* `fmu_path` (`str`) - path to the FMU compiled for your platform
* `inp` (`pandas.DataFrame`) - inputs, index given in seconds and named `time`
* `est` (`dict(str : tuple(float, float, float))`) - dictionary mapping parameter names to tuples (initial guess, lower bound, upper bound)
* `known` (`dict(str : float)`) - dictionary mapping parameter names to known values
* `ideal` (`pandas.DataFrame`) - ideal solution (usually measurements), index given in seconds and named `time`

Indexes of `inp` and `ideal` must be equal, i.e. `inp.index.equals(ideal.index)` must be `True`.
Columns in `inp` and `ideal` must have the same names as model inputs and outputs, respectively.
All model inputs must be present in `inp`, but only chosen outputs may be included in `ideal`.
Data for each variable present in `ideal` are used to calculate the error function that is minimized by **modestpy**.
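
As an illustration, the inputs described above could be prepared as in the sketch below. The model behind it is hypothetical: the input `u`, the output `y` and the parameters `R` and `C` are made-up names, and the numbers are synthetic. Only the shapes of the objects follow the description above.

```python
import numpy as np
import pandas as pd

# Time axis in seconds, shared by inputs and measurements.
time = pd.Index(np.arange(0, 3600, 60), name="time")

# Hypothetical model with one input "u" and one measured output "y".
inp = pd.DataFrame({"u": np.sin(time.values / 600.0)}, index=time)
ideal = pd.DataFrame({"y": 20.0 + 0.5 * np.cos(time.values / 600.0)}, index=time)

# Parameter to estimate: (initial guess, lower bound, upper bound).
est = {"R": (1.0, 0.1, 10.0)}
# Parameter with a known, fixed value.
known = {"C": 5000.0}

# Sanity checks mirroring the requirements listed above.
assert inp.index.name == "time" and ideal.index.name == "time"
assert inp.index.equals(ideal.index)
```

Running such checks before creating the session makes it easier to spot a DataFrame that was read from a csv file without setting its index.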

Now the parameters can be estimated using default settings:

```python
>>> session = Estimation(workdir, fmu_path, inp, known, est, ideal)
>>> estimates = session.estimate() # Returns dict(str: float)
>>> err, res = session.validate() # Returns tuple(dict(str: float), pandas.DataFrame)
```

All results are also saved in `workdir`.

By default all data from `inp` and `ideal` (all rows) are used in both estimation and validation.
To slice the data into separate learning and validation periods, additional arguments need to be defined:

* `lp_n` (`int`) - number of learning periods, randomly selected within `lp_frame`
* `lp_len` (`float`) - length of single learning period
* `lp_frame` (`tuple(float, float)`) - beginning and end of learning time frame
* `vp` (`tuple(float, float)`) - validation period
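
To make the meaning of these arguments more concrete, the sketch below draws `lp_n` periods of length `lp_len` at random positions inside `lp_frame`. The helper `pick_learning_periods` is hypothetical and only illustrates the idea. It is not the code used by **modestpy**, which may, for example, align periods to the data index.

```python
import random


def pick_learning_periods(lp_n, lp_len, lp_frame, seed=None):
    """Hypothetical helper: draw lp_n periods of length lp_len inside lp_frame."""
    t0, t1 = lp_frame
    if lp_len > t1 - t0:
        raise ValueError("lp_len is longer than lp_frame")
    rng = random.Random(seed)
    periods = []
    for _ in range(lp_n):
        # The latest admissible start leaves room for a full period.
        start = rng.uniform(t0, t1 - lp_len)
        periods.append((start, start + lp_len))
    return periods


# Three one-hour learning periods within the first day of data (seconds).
periods = pick_learning_periods(lp_n=3, lp_len=3600.0, lp_frame=(0.0, 86400.0), seed=1)
```

In this sketch the periods are drawn independently, so they are allowed to overlap.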

Often model parameters are used to define the initial conditions in the model,
for example the initial temperature. The initial values have to be read from the measured data stored in `ideal`.
You can do this with the optional argument `ic_param`:

* `ic_param` (`dict(str : str)`) - maps model parameters to column names in `ideal`
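
The mapping can be pictured with the sketch below. The parameter name `Tstart`, the column `T` and the helper `initial_values` are hypothetical. The sketch also assumes that the initial value is taken from the first sample at or after the start of a period, which the text above does not specify.

```python
import pandas as pd

# Measurements with a time index in seconds.
ideal = pd.DataFrame(
    {"T": [21.5, 21.7, 22.0, 22.4]},
    index=pd.Index([0, 60, 120, 180], name="time"),
)

# Hypothetical model parameter "Tstart" initialized from column "T".
ic_param = {"Tstart": "T"}


def initial_values(ideal, ic_param, t_start):
    """Illustrative only: read initial values from the first row at or after t_start."""
    row = ideal.loc[ideal.index >= t_start].iloc[0]
    return {par: float(row[col]) for par, col in ic_param.items()}


ic = initial_values(ideal, ic_param, t_start=60)
```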

Estimation algorithms (MODESTGA, PS, SCIPY) can be tuned by overwriting specific keys in `modestga_opts`, `ps_opts` and `scipy_opts`.
The default options are:

```
# Default MODESTGA options
MODESTGA_OPTS = {
'workers': 3, # CPU cores to use
'generations': 50, # Max. number of generations
'pop_size': 30, # Population size
'mut_rate': 0.01, # Mutation rate
'trm_size': 3, # Tournament size
'tol': 1e-3, # Solution tolerance
'inertia': 100, # Max. number of non-improving generations
'ftype': 'RMSE'
}
# Default PS options
PS_OPTS = {
'maxiter': 500,
'rel_step': 0.02,
'tol': 1e-11,
'try_lim': 1000,
'ftype': 'RMSE'
}
# Default SCIPY options
SCIPY_OPTS = {
'solver': 'L-BFGS-B',
'options': {'disp': True,
'iprint': 2,
'maxiter': 150,
'full_output': True},
'ftype': 'RMSE'
}
```
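
Overwriting specific keys can be understood as merging a small user dictionary over the defaults. The sketch below shows that idea with a hypothetical helper `merge_opts`. How **modestpy** merges options internally is not shown here, and rejecting unknown keys is a choice made for this sketch only.

```python
# Defaults copied from the listing above.
MODESTGA_OPTS = {
    'workers': 3,
    'generations': 50,
    'pop_size': 30,
    'mut_rate': 0.01,
    'trm_size': 3,
    'tol': 1e-3,
    'inertia': 100,
    'ftype': 'RMSE',
}


def merge_opts(defaults, overrides=None):
    """Hypothetical helper: return defaults updated with user overrides."""
    opts = dict(defaults)  # copy, so the defaults stay untouched
    for key, value in (overrides or {}).items():
        if key not in defaults:
            raise KeyError("Unknown option: {}".format(key))
        opts[key] = value
    return opts


# Only the keys that differ from the defaults need to be given.
modestga_opts = merge_opts(MODESTGA_OPTS, {'workers': 2, 'generations': 20})
```

With this pattern a user passes only the keys to be changed, e.g. `{'workers': 2}`, and every other setting keeps its default value.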

## Docstrings

```python
class Estimation(object):
"""Public interface of `modestpy`.
Index in DataFrames `inp` and `ideal` must be named 'time'
and given in seconds. The index name assertion check is
implemented to avoid situations in which a user reads DataFrame
from a csv and forgets to use `DataFrame.set_index(column_name)`
(it happens quite often...).
Currently available estimation methods:
- MODESTGA - parallel genetic algorithm (default GA in modestpy)
- GA_LEGACY - single-process genetic algorithm (legacy implementation, discouraged)
- PS - pattern search (Hooke-Jeeves)
- SCIPY - interface to algorithms available through
scipy.optimize.minimize()
Parameters:
-----------
workdir: str
Output directory, must exist
fmu_path: str
Absolute path to the FMU
inp: pandas.DataFrame
Input data, index given in seconds and named 'time'
known: dict(str: float)
Dictionary with known parameters (`parameter_name: value`)
est: dict(str: tuple(float, float, float))
Dictionary defining estimated parameters,
(`par_name: (guess value, lo limit, hi limit)`)
ideal: pandas.DataFrame
Ideal solution (usually measurements),
index in seconds and named `time`
lp_n: int or None
Number of learning periods, one if `None`
lp_len: float or None
Length of a single learning period, entire `lp_frame` if `None`
lp_frame: tuple of floats or None
Learning period time frame, entire data set if `None`
vp: tuple(float, float) or None
Validation period, entire data set if `None`
ic_param: dict(str, str) or None
Mapping between model parameters used for IC and variables from
`ideal`
methods: tuple(str, str)
List of methods to be used in the pipeline
ga_opts: dict
Genetic algorithm options
ps_opts: dict
Pattern search options
scipy_opts: dict
SciPy solver options
ftype: string
Cost function type. Currently 'NRMSE' (advised for multi-objective
estimation) or 'RMSE'.
seed: None or int
Random number seed. If None, current time or OS specific
randomness is used.
default_log: bool
If true, use default logging settings. Use false if you want to
use own logging.
logfile: str
If default_log=True, this argument can be used to specify the log
file name
"""
```
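
The `ftype` argument selects the cost function. As a rough illustration, the sketch below computes the root mean square error between a simulated and a measured series, plus a normalized variant. The normalization by the range of the measured data is an assumption made for this sketch. The docstring above does not state how **modestpy** normalizes NRMSE.

```python
import math


def rmse(model, ideal):
    """Root mean square error between two equally long sequences."""
    if len(model) != len(ideal) or len(ideal) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((m - i) ** 2 for m, i in zip(model, ideal))
    return math.sqrt(squared / len(ideal))


def nrmse(model, ideal):
    """RMSE divided by the range of `ideal` (assumed normalization)."""
    span = max(ideal) - min(ideal)
    if span == 0:
        raise ValueError("ideal has zero range, cannot normalize")
    return rmse(model, ideal) / span


error = rmse([0.0, 0.0], [3.0, 4.0])
normalized = nrmse([2.0, 4.0], [1.0, 5.0])
```

A normalized error makes outputs with different units or magnitudes comparable, which is presumably why 'NRMSE' is advised for multi-objective estimation.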
3 changes: 3 additions & 0 deletions examples/lin/README.md
@@ -0,0 +1,3 @@
The charts in `showcase/` show the behavior of GA and PS when the cost function is convex. The charts were generated by finding the parameters of the model `resources/lin_model.mo`, but the Python code used to generate these charts is no longer here.

See `examples/simple/` for an example with code.