Fooling LIME and SHAP

Post-hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, are not robust to systematic errors and underlying biases. This project re-implements the scaffolding technique from Slack et al., which is designed to hide the biases of any given classifier from these explanation techniques.

Paper Reference: https://arxiv.org/abs/1911.02508
Code Reference: https://github.com/dylan-slack/Fooling-LIME-SHAP
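
The core idea of the scaffolding technique can be sketched as follows: a wrapper model returns the biased prediction for inputs that look like real data and an innocuous prediction for inputs that look like explainer perturbations. This is only a minimal illustration, not the implementation in this repository. The class `ScaffoldedModel`, the toy models, and the rule-based `looks_real` detector are all hypothetical; in the paper the detector is a classifier trained to separate real from perturbed points.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


class ScaffoldedModel:
    """Hypothetical wrapper that routes each row to one of two models."""

    def __init__(
        self,
        biased_model: Callable[[Row], float],
        innocuous_model: Callable[[Row], float],
        looks_real: Callable[[Row], bool],
    ) -> None:
        self.biased_model = biased_model
        self.innocuous_model = innocuous_model
        self.looks_real = looks_real

    def predict(self, rows: Sequence[Row]) -> List[float]:
        # Rows judged in-distribution get the biased prediction; rows that
        # look like explainer perturbations get the innocuous one.
        return [
            self.biased_model(row) if self.looks_real(row) else self.innocuous_model(row)
            for row in rows
        ]


# Toy setup (assumption): feature 0 is the sensitive attribute,
# feature 1 is an unrelated feature.
def biased(row: Row) -> float:
    return 1.0 if row[0] > 0.5 else 0.0


def innocuous(row: Row) -> float:
    return 1.0 if row[1] > 0.5 else 0.0


# Stand-in for a trained out-of-distribution detector (assumption):
# "real" rows only contain the values 0.0 and 1.0.
def looks_real(row: Row) -> bool:
    return all(value in (0.0, 1.0) for value in row)


model = ScaffoldedModel(biased, innocuous, looks_real)
real_row = [1.0, 0.0]
perturbed_row = [0.93, 0.12]
predictions = model.predict([real_row, perturbed_row])
print(predictions)
```

On real data the wrapper behaves exactly like the biased model, while an explainer that mostly queries perturbed rows mostly observes the innocuous model.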

Installation

Clone the repository

```
git clone https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon.git
```

Create the environment

```
cd iml-ws21-projects-fool_the_lemon
conda env create -f "environment.yml"
conda activate iML-project
```

Run experiments

To run the experiment notebooks, start a JupyterLab server.

For JupyterLab installation instructions, see https://github.com/jupyterlab/jupyterlab
```
jupyter-lab .
```
To change the seed for the experiments, edit the seed value at the beginning of the notebook.

Experiments

Reproduction (10)

Implement the approach by writing a simple interface/framework and confirm your implementation by using any (tabular) racism dataset (e.g. Boston Housing).

Reproduction with Boston housing dataset

Extension (10)

In addition to LIME and SHAP, incorporate PDP and analyse whether it can be fooled, too.

PDP with Boston housing dataset
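
A partial dependence curve can be computed as sketched below: for each grid value, the feature of interest is overwritten in every row and the predictions are averaged. Because the model is evaluated on such synthetic rows, PDP is a plausible target for the same kind of attack. The function `partial_dependence` and the toy model are hypothetical and only illustrate the standard definition, not the code in the notebook.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def partial_dependence(
    predict: Callable[[Row], float],
    data: Sequence[Row],
    feature_index: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction over the data with one feature fixed to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)             # copy so the data stays untouched
            modified[feature_index] = value  # overwrite the feature of interest
            total += predict(modified)
        curve.append(total / len(data))
    return curve


# Toy model (assumption): linear in both features.
def toy_model(row: Row) -> float:
    return 2.0 * row[0] + row[1]


data = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
pdp_curve = partial_dependence(toy_model, data, feature_index=0, grid=[0.0, 1.0, 2.0])
print(pdp_curve)
```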

Analysis (5)

Use different perturbation approaches and compare the impact on being fooled.

Different perturbation approaches with Boston housing dataset
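
To illustrate what "different perturbation approaches" can mean, the sketch below contrasts two generic strategies: adding Gaussian noise to an instance, and replacing a random subset of features with values from a background row. Both functions are hypothetical examples and are not necessarily the perturbation algorithms implemented in this repository.

```python
import random
from typing import List, Sequence

Row = Sequence[float]


def gaussian_perturbations(
    instance: Row, scale: float, n_samples: int, rng: random.Random
) -> List[List[float]]:
    """Add independent Gaussian noise to every feature of the instance."""
    return [[x + rng.gauss(0.0, scale) for x in instance] for _ in range(n_samples)]


def background_replacements(
    instance: Row, background: Sequence[Row], n_samples: int, rng: random.Random
) -> List[List[float]]:
    """Replace each feature, with probability 0.5, by the value of a random background row."""
    samples = []
    for _ in range(n_samples):
        donor = rng.choice(background)
        samples.append(
            [d if rng.random() < 0.5 else x for x, d in zip(instance, donor)]
        )
    return samples


rng = random.Random(0)  # fixed seed so the example is reproducible
instance = [1.0, 2.0]
background = [[10.0, 20.0], [30.0, 40.0]]

noisy = gaussian_perturbations(instance, scale=0.1, n_samples=5, rng=rng)
mixed = background_replacements(instance, background, n_samples=5, rng=rng)
print(noisy)
print(mixed)
```

The second strategy only produces feature values that were actually observed, so its samples may be harder for an out-of-distribution detector to tell apart from real data than the noisy ones.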

Hyperparameter Sensitivity (10)

Analyze the impact of the hyperparameters of LIME and SHAP (e.g., hyperparameters of the local model and of the perturbation algorithms).

Hyperparameter sensitivity LIME
Hyperparameter sensitivity SHAP
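
One hyperparameter of the local model is the kernel width, which controls how strongly perturbed samples are down-weighted with their distance from the explained instance. The sketch below uses an exponential kernel of the form found in LIME's tabular explainer, as far as we are aware; the function name `lime_kernel_weight` and the chosen distances and widths are assumptions for illustration only.

```python
import math


def lime_kernel_weight(distance: float, kernel_width: float) -> float:
    """Exponential kernel: weight 1 at distance 0, decaying faster for small widths."""
    return math.sqrt(math.exp(-(distance ** 2) / kernel_width ** 2))


distances = [0.0, 0.5, 1.0, 2.0]
for width in (0.25, 0.75, 3.0):
    weights = [round(lime_kernel_weight(d, width), 3) for d in distances]
    print(f"kernel_width={width}: {weights}")
```

With a small width, only samples very close to the instance influence the local model; with a large width, distant and potentially out-of-distribution samples carry almost the same weight as nearby ones.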

New Datasets (5)

Find at least two further (tabular) datasets with a risk of discrimination (that are not mentioned in the paper) and study the impact of fooling on them.

Gender discrimination dataset
Heart failure prediction dataset

Datasets

Boston housing dataset
Gender discrimination dataset
Heart failure prediction dataset

Limitations / Further improvement

The current framework can only deal with regression and binary classification tasks
Only one biased input feature can be hidden
Only numerical features are considered
Currently, 3 perturbation algorithms are implemented
