presentation.Rmd

---
title: "Interpreting Machine Learning: Bigger on the inside"
author: "Frankie Logan"
date: "10-1-2018"
output:
  xaringan::moon_reader:
    css: ["default", "default-fonts", "hygge"]
    lib_dir: dist/libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
      ratio: "16:9"
---
```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
```

class: top, left

# Motivation

- Machine Learning (ML) models are wonderful tools that can help us tackle a wide variety of problem. However, the more
sopisticated they become, the more obscure they become to the end users. 

- Too obscure for many business cases?  


```{r echo = FALSE}
knitr::include_graphics("img/black_box.jpeg")
```
---
class: top, left
# White Box model vs Black Box model
--

.pull-left[
```{r echo=FALSE}
head(broom::tidy(glm1) %>% select(term, estimate, p.value), 10)

```
]
--
.pull-right[

```{r echo = FALSE}
knitr::include_graphics("img/black_box_lol.jpeg")
```
]
---
class: top, left
# Avaliable Tools (Not Exhaustive)

In recent years, a lot of tools have been developed to improve the interpretability of ML models:

--

* [DALEX](https://github.com/pbiecek/DALEX)

* [lime](https://github.com/thomasp85/lime)

* [live](https://github.com/MI2DataLab/live)

* [pdp](https://github.com/bgreenwell/pdp)

* etc...

---
class: top, left
# DALEX: Descriptive mAchine Learning EXplanations

[DALEX](https://github.com/pbiecek/DALEX) (Biecek 2018) is an [R](https://www.r-project.org/) package created by Przemyslaw Biecek that provides users with tools to "unbox" all the wonderful ML models

What makes DALEX so special?

- Wide array of diagnostic capabilities packed into one
- Adaptability
- [ggplot2](https://ggplot2.tidyverse.org/)

Additional Resource: [https://pbiecek.github.io/DALEX_docs/](https://pbiecek.github.io/DALEX_docs/)

---

class: inverse, middle, center
# Let's see how it works in practice!
---
class: inverse, middle center

# Shock Lapse in Post Level Term

---
class: top, left
# Background/housekeeping

Data: [SOA Post Level Term Lapse Study](https://www.soa.org/experience-studies/2014/research-2014-post-level-shock/)


Predictors: Gender, Issue Age, Face Amount, Post Level Premium Structure, Premium Jump Ratio, Risk Class, Premium Mode


Responses: Lapse Count Rate

---
class: top left
# Data/Model Prepping

Explain function:

```{r eval=FALSE, echo=TRUE}

explain(model, data, y, predict_function, label)

```
--
Customize your predict function

```{r eval=FALSE, echo=TRUE}
custom_predict_h2o <- function(model, newdata)  {
  newdata_h2o <- as.h2o(newdata)
  res <- as.data.frame(h2o.predict(model, newdata_h2o))
  return(as.numeric(res$predict))
}
```
--
e.g.

```{r eval=FALSE, echo=TRUE}
explainer_h2o_rf <- explain(
  model = samplemodel,
  data = select(validation_h2o, predictors),
  y = validation_h2o$lapse_count_rate,
  predict_function = custom_predict_h2o,
  label = "Random Forest (H2O)")
```
---
class: top left
# Model Performance (model_performance())


```{r out.height = 500, echo=FALSE}
knitr::include_graphics("img/mp_all.jpeg")
```


---
class: top, left
# Model Performance (model_performance())

```{r out.height = 500, echo = FALSE}
knitr::include_graphics("img/mp_box_all_outliers_fixed.jpeg")
```


---

class: top, left

# Variable Importance (variable_importance())

Which variable are the most important in your model? 

--

```{r eval = FALSE, echo=TRUE}
vi_nn <- variable_importance(explainer_nn, type = "ratio", n_sample = -1)
plot(vi_xgb, vi_glm1, vi_nn, vi_h2o)

```

```{r out.height = 400, echo = FALSE}

knitr::include_graphics("img/vi_all.jpeg")

```

---
class: top, left

#Merging Path Plot (variable_response())

What is the relationship between the variable and the prediction?

--

.pull-left[
```{r eval = FALSE, echo = TRUE}
mpp_xgb <- variable_response(explainer = explainer_xgb, 
                             variable = "risk_class", 
                             type = "factor")

plot(mpp_xgb)
```

Merging Path Plot utilizes [factorMerger](https://mi2datalab.github.io/factorMerger/) and is one of the three options used for investigating relationship between a single variable to the predictors.  
  
  * For continuous variable, change `type` argument to [pdp](https://github.com/bgreenwell/pdp) or [ale](https://cran.r-project.org/web/packages/ALEPlot/index.html)


Additional resources:

* [pdp paper](https://journal.r-project.org/archive/2017/RJ-2017-016/RJ-2017-016.pdf)
* [Merging Path Plot Paper](https://arxiv.org/abs/1709.04412)

]
.pull-right[

```{r echo = FALSE}
knitr::include_graphics("img/mpp_xgb.jpeg")

```

]
---
class: top, left

# What is the exact effect of each variables?

Let's look back at a basic GLM example

```{r echo=FALSE}
head(broom::tidy(glm1), 10)

```

---
class: top, left

<style type="text/css">
code.r{
  font-size: 14px;
}
</style>

# Single Prediction (prediction_breakdown())

--
.pull-left[

`prediction_breakdown` uses [breakDown](https://pbiecek.github.io/breakDown/) package as a base
  * caveat: It doesn't handle models with too many interactions term well!
  
```{r eval = FALSE, echo = TRUE}
newdata <- validation[25,] %>% 
  select(predictors)

pb_xgb <- prediction_breakdown(explainer = explainer_xgb, 
                              observation = newdata)

plot(pb_xgb)
```

Additional resources:

* [live and breakDown Paper](https://arxiv.org/abs/1804.01955)

]
--
.pull-right[

```{r echo = FALSE}
knitr::include_graphics("img/xgb_sp.jpeg")
```

]

---
class: top, left

<style type="text/css">
code.r{
  font-size: 14px;
}
</style>

# Model Prediction Comparison
<small>Let's look at all the models side by side</small>

```{r fig.align = "center", echo = FALSE}
knitr::include_graphics("img/sp_3.jpeg")

```


---

class: top, left
<style type="text/css">
body{
  font-size: 14px;
}
</style>
#Reference

<small>Biecek, Przemyslaw. DALEX. Descriptive MAchine Learning EXplanations, 2018, https://pbiecek.github.io/DALEX/.

Biecek Przemyslaw. (2018). DALEX: explainers for complex predictive models. ArXiv e-prints. 1806.08915, https://arxiv.org/abs/1806.08915.

Biecek, Przemyslaw. DALEX: Descriptive MAchine Learning EXplanations. Descriptive MAchine Learning EXplanations DALEX, 11 Aug. 2018, https://pbiecek.github.io/DALEX_docs/.

Greenwell, Brandon. A General Framework for Constructing Partial Dependence (I.e., Marginal Effect) Plots from Various Types Machine Learning Models in R. GitHub, 25 Sept. 2016, https://github.com/bgreenwell/pdp.

Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, http://ggplot2.org.

Pedersen, Thomas. thomasp85/Lime. GitHub, 2017, https://github.com/thomasp85/lime.

Sitko A and Biecek P (2017). The Merging Path Plot: adaptive fusing of k-groups with likelihood-based model selection. https://arxiv.org/abs/1709.04412.

Staniak M, Biecek P (2018). Explanations of Model Predictions with live and breakDown Packages. ArXiv e-prints. 1804.01955, https://arxiv.org/abs/1804.01955.

Staniak, Mateusz, and Biecek Przemyslaw. Live: Local Interpretable (Model-Agnostic) Visual Explanations. GitHub, 2017, https://github.com/MI2DataLab/live.</small>

---

class: top, left

#Questions?