Post-hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, are not robust to systematic errors and underlying biases. The goal of this project is to re-implement the scaffolding technique from Slack et al., which can effectively hide the biases of any given classifier from these explanation methods.
- Paper Reference: https://arxiv.org/abs/1911.02508
- Code Reference: https://github.com/dylan-slack/Fooling-LIME-SHAP
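The core idea of the scaffolding can be sketched in a few lines. This is a minimal illustration and not the code of this repository or of the reference implementation: the names `NearestNeighbourDetector`, `ScaffoldedModel` and `gaussian_perturb` are hypothetical, and the nearest-neighbour distance threshold is a simplified stand-in for the trained classifier that the paper uses to separate real from perturbed points. Feature 0 plays the role of the sensitive feature.

```python
import numpy as np


def gaussian_perturb(X, rng, scale=1.0):
    """Add feature-wise Gaussian noise (one illustrative perturbation scheme)."""
    return X + rng.normal(0.0, scale * X.std(axis=0), size=X.shape)


class NearestNeighbourDetector:
    """Stand-in out-of-distribution detector (assumption, not the paper's model).

    A point counts as "real" if its nearest training point is closer than a
    quantile of the leave-one-out nearest-neighbour distances of the data.
    """

    def fit(self, X_real, quantile=0.95):
        self.X_real = np.asarray(X_real, dtype=float)
        d = self._distances(self.X_real)
        np.fill_diagonal(d, np.inf)  # ignore each point's distance to itself
        self.threshold = np.quantile(d.min(axis=1), quantile)
        return self

    def _distances(self, X):
        # Pairwise Euclidean distances between X and the training data.
        return np.linalg.norm(X[:, None, :] - self.X_real[None, :, :], axis=2)

    def is_real(self, X):
        X = np.asarray(X, dtype=float)
        return self._distances(X).min(axis=1) <= self.threshold


class ScaffoldedModel:
    """Use the biased model on real-looking inputs, the innocuous one otherwise."""

    def __init__(self, biased_predict, innocuous_predict, detector):
        self.biased_predict = biased_predict
        self.innocuous_predict = innocuous_predict
        self.detector = detector

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        real = self.detector.is_real(X)
        return np.where(real, self.biased_predict(X), self.innocuous_predict(X))


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))


def biased(X):
    return (X[:, 0] > 0).astype(int)  # depends only on the sensitive feature 0


def innocuous(X):
    return (X[:, 1] > 0).astype(int)  # depends only on the harmless feature 1


detector = NearestNeighbourDetector().fit(X)
model = ScaffoldedModel(biased, innocuous, detector)

# Share of explainer-style perturbed samples routed to the innocuous model.
X_perturbed = gaussian_perturb(X, rng)
print("flagged as perturbed:", 1.0 - detector.is_real(X_perturbed).mean())
```

On the real data the scaffolded model behaves exactly like the biased one, while samples that the detector flags are answered by the innocuous model. How many perturbed samples get flagged depends on the detector and the perturbation scheme, which is what the experiments are meant to study.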
- Clone the repository
git clone https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon.git
- Create the environment
cd iml-ws21-projects-fool_the_lemon
conda env create -f "environment.yml"
conda activate iML-project
- Run experiments
To run the experiment notebooks, start a JupyterLab server.
Installation instructions for JupyterLab: https://github.com/jupyterlab/jupyterlab
jupyter-lab .
To change the seed for the experiments, edit the seed value at the beginning of the notebook.
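As a rough illustration of what such a seed cell typically looks like, the sketch below seeds Python's and NumPy's global random number generators. The names `SEED` and `set_seed` are hypothetical; the notebooks may name and apply the seed differently.

```python
import random

import numpy as np

SEED = 0  # hypothetical name; change this value to rerun with another seed


def set_seed(seed):
    """Seed Python's and NumPy's global random number generators."""
    random.seed(seed)
    np.random.seed(seed)


# Seeding twice with the same value reproduces the same random draws.
set_seed(SEED)
first = np.random.rand(3)
set_seed(SEED)
second = np.random.rand(3)
print(np.allclose(first, second))  # True
```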
Implement the approach by writing a simple interface/framework, and confirm your implementation using any (tabular) dataset that involves racial bias (e.g., Boston Housing).
In addition to LIME and SHAP, incorporate PDP and analyze whether it can be fooled, too.
Use different perturbation approaches and compare the impact on being fooled.
Analyze the impact of the hyperparameters of LIME and SHAP (e.g., hyperparameters of the local model and of the perturbation algorithms).
Find at least two further (tabular) datasets with a risk of discrimination (that are not mentioned in the paper) and study the impact of fooling on them.
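For the PDP task above, it may help to recall how a partial dependence curve is computed: the feature of interest is set to each grid value for every row, and the model predictions are averaged. The sketch below is a plain NumPy illustration under that definition, not the framework's implementation; `partial_dependence` and its parameters are hypothetical names. Because overwriting a feature creates rows that may not resemble real data, it is plausible that a scaffolded model affects the PDP as well, which is what the task asks to verify.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid_size=20):
    """Return (grid, averaged predictions) for one feature column."""
    X = np.asarray(X, dtype=float)
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_size)
    pd_values = np.empty(grid_size)
    for i, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # force the feature to the grid value in every row
        pd_values[i] = predict(X_mod).mean()
    return grid, pd_values


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))


def predict(X):
    # Toy regression model: uses features 0 and 1, ignores feature 2.
    return 2.0 * X[:, 0] + X[:, 1]


grid0, pd0 = partial_dependence(predict, X, feature=0)
grid2, pd2 = partial_dependence(predict, X, feature=2)

print("PD range for feature 0:", pd0.max() - pd0.min())
print("PD range for feature 2:", pd2.max() - pd2.min())
```

For this toy model the curve of feature 0 is a line with slope 2, and the curve of the unused feature 2 is flat. Passing the `predict` method of a scaffolded model instead would show whether the sensitive feature's curve flattens in the same way.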
- The current framework can only deal with regression and binary classification tasks
- Only one biased input feature can get hidden
- Only numerical features are considered
- Currently, 3 perturbation algorithms are implemented