Linear Tree Implementation (cog-imperial#108)

* added initial linear-tree files. Created linear-tree model class
* initial commit
* LinearTreeModel
* public vs private test
* private vs public testing
* public vs private testing
* Initial Linear Tree Commit
* ltmodel testing
* ltmodel testing
* ltmodel testing
* ltmodel testing
* recursion test find children splits
* testing
* recursion test find all children splits
* recursion test number find all children splits
* Recursion test 1
* Globalizing Function Test
* Globalizing Function Test
* Determine if parent information is even needed
* LinearModel Testing
* Lt Model Testing and commit
* Cleaning Up LinearTreeModel Class code
* Linear Tree Init Commit
* Raise exceptions for missing bounds/wrong transformation
* changed parse_tree_data name
* raise errors on unsupported GDP transformations
* Bounds changes
* Implemented output variable bound calculation function
* adding comments to output variable bound function
* Initial Linear Tree Documentation
* linear tree documentation
* Added Notebook for Linear Tree in Docs
* Documentation
* Added more documentation on comments
* Updated docstrings
* docstring updates
* Docstring Updates
* Docstring Updates
* Upload the script for testing
* Upload script for testing LinearTreeModel Test dictionary
* Upload the test for bigm transformation
* Pass pytest: fix some bugs, e.g. len
* Test hull formulation
* Test the slope and bounds: Note that bounds may be None for other dataset
* Added Hybrid Big-M Formulations
* Added Multiple BigM Test
* Add more comments
* Added multivariate input testing for linear model decision trees
* Added Hybrid Big M Formulation Tests
* Docstring Updates
* Added option to pass in summary rather than linear-tree instance
* Added test to ensure model summary argument functions
* Added Hybrid Big-M Tests
* docstring updates
* Docstring updates
* Docstring updates
* Added code to ensure input dict is correct
* Added hybrid big-m representation to docs
* Added testing for raised exceptions
* Reassigned none bounds in lt model. Added unscaled_input_bounds kwarg
* Initial formulation consolidation code commit
* docstring updates
* Docstring updates
* Docstring update
* Update Variable Names
* Updated LinearTreeModel to LinearTreeDefinition, made helper functions internal, and added cbc as solver for MILP tests
* Updated Notebook to use LinearTree Definition
* Code Cleanup
* Updated lineartree to linear_tree, changed _setup_scaled_inputs due to maxpool int/float issue
* Install pyscipopt in main.yml
* Update linear_tree notebook to use SCIP rather than gurobi
* Changed quadratic formulation solver from gurobi to scip
* Skip if solvers unavailable
* omlt.lineartree to omlt.linear_tree
* Added 'custom' transformation option to LinearTreeGDPFormulation
* Ran through pylint and black
* Cleaning up for linting
* Fixing pylint issues
* Linting
* Linting
* Added test for scaling LinearTreeDefinition
* Linting
* Edit docstring
* Added properties to definition. Docstring Updates.
* Removing unused properties
* Docstring Updates
* Linear Tree Notebook Updates
* Linting
* docstring update
* For code coverage
* Addressing requested changes
* Addressing changes
* Addressing ruth comments
* Notebook Update
* Updating README.rst. Also updated OMLT paper citation.
* Modifying citation
* Notebook modification
* Notebook modification
* Notebook docstring
* Attempt tests on Python 3.10
* Testing on Python 3.10

---------

Co-authored-by: Shumeng Lin <[email protected]>
chplate · Sep 18, 2023 · b60bf0d
1 parent dcca13c
commit b60bf0d
Showing 14 changed files with 3,120 additions and 14 deletions.
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -15,7 +15,8 @@ jobs:
 
     strategy:
       matrix:
-        python-version: ["3.7", "3.8", "3.9"]
+        # python-version: ["3.7", "3.8", "3.9"]
+        python-version: ["3.8", "3.9", "3.10"]
 
     steps:
       - uses: "actions/checkout@v2"
@@ -35,6 +36,7 @@ jobs:
           python -m pip install --upgrade pip setuptools wheel
           python -m pip install --upgrade coverage[toml] virtualenv tox tox-gh-actions          
           conda install -c conda-forge ipopt
+          conda install -c conda-forge pyscipopt
 
       - name: "Run tox targets with lean testing environment for ${{ matrix.python-version }}"
         run: "tox -re leanenv"

diff --git a/README.rst b/README.rst
@@ -27,17 +27,32 @@ OMLT: Optimization and Machine Learning Toolkit
 
 OMLT is a Python package for representing machine learning models (neural networks and gradient-boosted trees) within the Pyomo optimization environment. The package provides various optimization formulations for machine learning models (such as full-space, reduced-space, and MILP) as well as an interface to import sequential Keras and general ONNX models.
 
-Please reference the `preprint <https://arxiv.org/abs/2202.02414>`_ of this software package as:
+Please reference the paper for this software package as:
 
 ::
 
-     @misc{ceccon2022omlt,
+     @article{ceccon2022omlt,
           title={OMLT: Optimization & Machine Learning Toolkit},
-          author={Ceccon, F. and Jalving, J. and Haddad, J. and Thebelt, A. and Tsay, C. and Laird, C. D. and Misener, R.},
-          year={2022},
-          eprint={2202.02414},
-          archivePrefix={arXiv},
-          primaryClass={stat.ML}
+          author={Ceccon, F. and Jalving, J. and Haddad, J. and Thebelt, A. and Tsay, C. and Laird, C. D and Misener, R.},
+          journal={Journal of Machine Learning Research},
+          volume={23},
+          number={349},
+          pages={1--8},
+          year={2022}
+     }
+
+When utilizing linear model decision trees, please cite the following paper in addition:
+
+::
+
+     @article{ammari2023,
+          title={Linear Model Decision Trees as Surrogates in Optimization of Engineering Applications},
+          author= {Bashar L. Ammari and Emma S. Johnson and Georgia Stinchfield and Taehun Kim and Michael Bynum and William E. Hart and Joshua Pulsipher and Carl D. Laird},
+          journal={Computers \& Chemical Engineering},
+          volume = {178},
+          year = {2023},
+          issn = {0098-1354},
+          doi = {https://doi.org/10.1016/j.compchemeng.2023.108347}
      }
 
 Documentation
@@ -152,6 +167,10 @@ Contributors
      - Alexander Thebelt
      - This work was supported by BASF SE, Ludwigshafen am Rhein.
 
+   * - |bammari|_
+     - Bashar L. Ammari
+     - This work was funded by Sandia National Laboratories, Laboratory Directed Research and Development program.
+
 
 .. _jalving: https://github.com/jalving
 .. |jalving| image:: https://avatars1.githubusercontent.com/u/16785413?s=120&v=4
@@ -172,3 +191,7 @@ Contributors
 .. _thebtron: https://github.com/ThebTron
 .. |thebtron| image:: https://avatars.githubusercontent.com/u/31448377?s=120&v=4
    :width: 80px
+
+.. _bammari: https://github.com/bammari
+.. |bammari| image:: https://avatars.githubusercontent.com/u/96192809?v=4
+   :width: 80px
diff --git a/docs/api_doc/omlt.linear_tree.rst b/docs/api_doc/omlt.linear_tree.rst
@@ -0,0 +1,20 @@
+Linear Model Decision Trees
+============================
+
+.. automodule:: omlt.linear_tree.__init__
+
+Linear Tree Definition
+-----------------------
+
+.. automodule:: omlt.linear_tree.lt_definition
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Linear Tree Formulations
+-------------------------
+
+.. automodule:: omlt.linear_tree.lt_formulation
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/api_doc/omlt.rst b/docs/api_doc/omlt.rst
@@ -8,4 +8,5 @@ API Documentation
    omlt.scaling
    omlt.io
    omlt.gbt 
-   omlt.neuralnet
+   omlt.neuralnet
+   omlt.linear_tree
diff --git a/docs/notebooks.rst b/docs/notebooks.rst
@@ -18,4 +18,6 @@ github `page <https://github.com/cog-imperial/OMLT/tree/main/docs/notebooks/>`_.
 
 * `auto-thermal-reformer-relu.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/auto-thermal-reformer-relu.ipynb>`_ develops a neural network surrogate (using ReLU activations) with data from a process model built using `IDAES-PSE <https://github.com/IDAES/idaes-pse>`_.
 
-* `bo_with_trees.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/bo_with_trees.ipynb>`_ incorporates gradient-boosted-trees into a Bayesian optimization loop to optimize the Rosenbrock function.
+* `bo_with_trees.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/trees/bo_with_trees.ipynb>`_ incorporates gradient-boosted-trees into a Bayesian optimization loop to optimize the Rosenbrock function.
+
+* `linear_tree_formulations.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/trees/linear_tree_formulations.ipynb>`_ showcases the different linear model decision tree formulations available in OMLT.
diff --git a/docs/notebooks/bo_with_trees.ipynb b/docs/notebooks/trees/bo_with_trees.ipynb
diff --git a/docs/notebooks/trees/linear_tree_formulations.ipynb b/docs/notebooks/trees/linear_tree_formulations.ipynb
diff --git a/setup.cfg b/setup.cfg
@@ -76,6 +76,7 @@ testing =
     ipywidgets
     jupyter
     lightgbm
+    linear-tree
     matplotlib
     pandas
     keras

diff --git a/src/omlt/dependencies.py b/src/omlt/dependencies.py
@@ -3,3 +3,5 @@
 # check for dependencies and create shortcut if available
 onnx, onnx_available = attempt_import("onnx")
 keras, keras_available = attempt_import("tensorflow.keras")
+
+lineartree, lineartree_available = attempt_import("lineartree")
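
The added line registers `lineartree` as an optional dependency in the same way as the existing `onnx` and `keras` lines: the call yields a module handle together with an availability flag. As a rough illustration of that pattern, the sketch below uses only the standard library. `try_import` is a hypothetical stand-in written for this note, not the helper that `dependencies.py` calls, and it imports eagerly, which the real helper may not do.

```python
import importlib
from types import ModuleType
from typing import Optional, Tuple


def try_import(name: str) -> Tuple[Optional[ModuleType], bool]:
    """Hypothetical stand-in: return (module, True), or (None, False) if missing."""
    try:
        return importlib.import_module(name), True
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of crashing at import time.
        return None, False


# A standard-library module is always importable.
json_mod, json_available = try_import("json")

# A package that is not installed yields a False flag instead of an exception.
missing_mod, missing_available = try_import("no_such_package_for_this_sketch")

# Downstream code guards optional features on the flag.
if json_available:
    print(json_mod.dumps({"available": True}))
```

Tests can branch on the flag in the same way, which appears to be the idea behind the "Skip if solvers unavailable" item in the commit message.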
diff --git a/src/omlt/linear_tree/__init__.py b/src/omlt/linear_tree/__init__.py
@@ -0,0 +1,24 @@
+r"""
+There are multiple formulations for representing linear model decision trees.
+
+Please see the following reference:
+    * Ammari et al. (2023) Linear Model Decision Trees as Surrogates in Optimization
+      of Engineering Applications. Computers & Chemical Engineering
+
+We utilize the following common nomenclature in the formulations:
+
+.. math::
+    \begin{align*}
+        L  &:= \text{Set of leaves} \\
+        z_{\ell} &:= \text{Binary variable indicating which leaf is selected} \\
+        x &:= \text{Vector of input variables to the decision tree}  \\
+        d &:= \text{Output variable from the decision tree} \\
+        a_{\ell} &:= \text{Vector of slopes learned by the tree for leaf } \ell \in L\\
+        b_{\ell} &:= \text{Bias term learned by the tree for leaf } \ell \in L\\
+    \end{align*}
+"""
+from omlt.linear_tree.lt_formulation import (
+    LinearTreeGDPFormulation,
+    LinearTreeHybridBigMFormulation,
+)
+from omlt.linear_tree.lt_definition import LinearTreeDefinition
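
The nomenclature in the module docstring above can be read as follows: exactly one leaf indicator z_l equals 1, and the output is d = sum over l of z_l * (a_l · x + b_l). The sketch below evaluates that relationship in plain Python for a fixed input. It is not the GDP or hybrid big-M formulation from `lt_formulation`, and the `Leaf` container, its `bounds` field, and the two helper functions are assumptions made for this illustration rather than OMLT names.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Leaf:
    """One leaf of a linear model decision tree (hypothetical container)."""

    bounds: List[Tuple[float, float]]  # (lower, upper) box per input feature
    slope: List[float]  # a_l: slopes of the leaf's linear model
    intercept: float  # b_l: bias term of the leaf's linear model


def select_leaf(leaves: Sequence[Leaf], x: Sequence[float]) -> List[int]:
    """Return z, with z[l] = 1 for the first leaf whose box contains x."""
    z = [0] * len(leaves)
    for idx, leaf in enumerate(leaves):
        if all(lo <= xi <= hi for xi, (lo, hi) in zip(x, leaf.bounds)):
            z[idx] = 1
            break
    return z


def tree_output(leaves: Sequence[Leaf], x: Sequence[float]) -> float:
    """Compute d = sum_l z_l * (a_l . x + b_l) for a fixed input x."""
    z = select_leaf(leaves, x)
    if sum(z) != 1:
        raise ValueError("x lies outside every leaf box")
    return sum(
        z_l * (sum(a * xi for a, xi in zip(leaf.slope, x)) + leaf.intercept)
        for z_l, leaf in zip(z, leaves)
    )


# Two leaves over one input: d = 2x on [0, 1] and d = -x + 3 on [1, 2].
leaves = [
    Leaf(bounds=[(0.0, 1.0)], slope=[2.0], intercept=0.0),
    Leaf(bounds=[(1.0, 2.0)], slope=[-1.0], intercept=3.0),
]

print(select_leaf(leaves, [0.5]), tree_output(leaves, [0.5]))
print(select_leaf(leaves, [1.5]), tree_output(leaves, [1.5]))
```

In an optimization model x is a decision variable, so the leaf choice cannot be made by a Python loop. Encoding that choice with constraints is the role of the formulation classes imported above.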