Feedback #1

Open: wants to merge 31 commits into base: feedback
7002322
Setting up GitHub Classroom Feedback
github-classroom[bot] Jan 8, 2022
cfec701
updated .gitignore
MarcSpeckmann Jan 8, 2022
8d59d9a
add enviroment.yml
MarcSpeckmann Jan 8, 2022
fa987b4
add lime and shap package
MarcSpeckmann Jan 8, 2022
d83052e
added code from original repo and update readme
MarcSpeckmann Jan 11, 2022
8e12ef5
tmp
MarcSpeckmann Jan 13, 2022
97c5000
dsaf
MarcSpeckmann Jan 13, 2022
8ef63c6
regression and multi classification possible
MarcSpeckmann Jan 14, 2022
645a404
new datasets
MarcSpeckmann Jan 17, 2022
b07216e
working pdp fooling
MarcSpeckmann Jan 25, 2022
80ac48a
fixed example tasks
MarcSpeckmann Jan 25, 2022
67108ee
working regression and binary classification
MarcSpeckmann Jan 25, 2022
7c6e8d6
added seed and removed dataset class use
MarcSpeckmann Jan 26, 2022
d968972
temp
MarcSpeckmann Jan 26, 2022
21f6e41
removed dataset completely
MarcSpeckmann Jan 26, 2022
eef3b48
added prediction fidilty to pdp
MarcSpeckmann Jan 26, 2022
58b93ce
created doc stub and test stubs
MarcSpeckmann Jan 28, 2022
e5353ce
removed overhead, preparation for pertubation experiments
MarcSpeckmann Jan 28, 2022
8dbab25
more beautiful prints
MarcSpeckmann Jan 29, 2022
4f0fb65
switch experiments from py to ipynb
MarcSpeckmann Feb 1, 2022
455dd34
new dataset
MarcSpeckmann Feb 1, 2022
be7ff48
docstrings
MarcSpeckmann Feb 2, 2022
2df8222
updated readme.md and new test stubs
MarcSpeckmann Feb 4, 2022
c9346ea
updated readme.md and environment.yml
MarcSpeckmann Feb 4, 2022
f7716b0
updated readme.md and environment.yml
MarcSpeckmann Feb 4, 2022
8902f65
added limitations to readme
MarcSpeckmann Feb 4, 2022
626e744
adde hyperparameter sensibility
MarcSpeckmann Feb 7, 2022
39038a3
docstring for perturbator
MarcSpeckmann Feb 7, 2022
0c34fb2
test stubs perturbator
MarcSpeckmann Feb 7, 2022
8f988c5
added hyperparameter sensibility to readme
MarcSpeckmann Feb 7, 2022
6339ee2
added presentation
MarcSpeckmann Feb 11, 2022
33 changes: 33 additions & 0 deletions .gitignore
@@ -127,3 +127,36 @@ dmypy.json

# Pyre type checker
.pyre/


# Created by https://www.toptal.com/developers/gitignore/api/pycharm+all,vim
# Edit at https://www.toptal.com/developers/gitignore?templates=pycharm+all,vim

### PyCharm+all ###
# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider
# Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839

# User-specific stuff
.idea/**/workspace.xml
.idea/**/tasks.xml
.idea/**/usage.statistics.xml
.idea/**/dictionaries
.idea/**/shelf

# AWS User-specific
.idea/**/aws.xml

# Generated files
.idea/**/contentModel.xml

# Sensitive or high-churn files
.idea/**/dataSources/
.idea/**/dataSources.ids
.idea/**/dataSources.local.xml
.idea/**/sqlDataSources.xml
.idea/**/dynamic.xml
.idea/**/uiDe

.idea/*

.virtual_documents/
88 changes: 72 additions & 16 deletions README.md
@@ -1,22 +1,78 @@
## iML-ws21-projects
# Fooling LIME and SHAP

Please refer to the StudIP slides `iML_lecture_projects.pdf`.
Post-hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, cannot be relied on to reveal systematic errors and underlying biases.
The goal of this project is to re-implement the scaffolding technique from Slack et al., which is designed to hide the biases of any given classifier from such explanations.
- Paper Reference: https://arxiv.org/abs/1911.02508
- Code Reference: https://github.com/dylan-slack/Fooling-LIME-SHAP
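
The scaffolding idea can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the class name `AdversarialScaffold`, the toy models, and the distance-based detector are all assumptions made for the example. In Slack et al. the detector is a classifier trained to tell real rows from perturbed ones; here a fixed threshold stands in for it.

```python
import numpy as np


class AdversarialScaffold:
    """Route in-distribution rows to a biased model and
    perturbation-like rows to an innocuous model (illustrative only)."""

    def __init__(self, biased_model, innocuous_model, ood_detector):
        self.biased_model = biased_model
        self.innocuous_model = innocuous_model
        self.ood_detector = ood_detector

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # True where a row looks like an explainer's perturbation sample.
        is_ood = np.asarray(self.ood_detector(X), dtype=bool)
        return np.where(is_ood, self.innocuous_model(X), self.biased_model(X))


# Toy stand-ins: column 0 is the sensitive feature, column 1 is unrelated.
biased = lambda X: (X[:, 0] > 0).astype(int)
innocuous = lambda X: (X[:, 1] > 0).astype(int)
# Hypothetical detector: rows far from the origin count as off-distribution.
detector = lambda X: np.linalg.norm(X, axis=1) > 3.0

scaffold = AdversarialScaffold(biased, innocuous, detector)
X_demo = np.array([[1.0, -1.0], [5.0, -5.0]])
preds = scaffold.predict(X_demo)
```

The first row is close to the origin and is answered by the biased model, while the second row is flagged as off-distribution and answered by the innocuous model. An explainer that mostly queries off-distribution rows would therefore mostly see the innocuous behaviour.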

## Template Repository for iML-Project Submissions

### Requirements for Project Submissions
* Use Python 3.9. As in the assignments, we recommend working inside a conda environment.
```
conda create -n iML-project python=3.9
conda activate iML-project
```
* clean code
* well documented
* unit tested
* all requirements well documented (use requirements.txt)
* Installation instructions (in this README.md)
* If feasible, run your experiments with several random seeds. Try to create reproducible results.
## Installation

### Submission Deadline: Feb. 15th 2022, 00:00
1. Clone the repository

```bash
git clone https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon.git
```

2. Create the environment

```bash
cd iml-ws21-projects-fool_the_lemon
conda env create -f "environment.yml"
conda activate iML-project
```

3. Run experiments

To run the experiment notebooks, start a JupyterLab server.

JupyterLab installation instructions: https://github.com/jupyterlab/jupyterlab

```bash
jupyter-lab .
```

To change the seed for the experiments, edit the seed value at the beginning of the notebook.
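
A seeding cell at the top of a notebook might look like the sketch below. The names `SEED` and `set_seed` and the value 42 are assumptions for illustration; the notebooks in this repository may set their seed differently.

```python
import random

import numpy as np

SEED = 42  # hypothetical value; change it to rerun with another seed


def set_seed(seed):
    """Seed Python's and NumPy's global generators and return a NumPy Generator."""
    random.seed(seed)
    np.random.seed(seed)
    return np.random.default_rng(seed)


# Re-seeding with the same value reproduces the same draws.
first = set_seed(SEED).normal(size=3)
second = set_seed(SEED).normal(size=3)
```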



## Experiments

### Reproduction (10)
Implement the approach by writing a simple interface/framework and confirm your implementation using any tabular dataset involving racial bias (e.g. Boston Housing).
- [Reproduction with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/repoduction_with_boston_housing.ipynb)

### Extension (10)

In addition to LIME and SHAP, incorporate PDP and analyse whether it can be fooled as well.
- [PDP with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/fool_pdp_with_boston_housing.ipynb)
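
For context, a partial dependence curve can be computed as in the sketch below. It is a generic illustration of PDP, not the implementation used in the notebook; the function name `partial_dependence` and the toy model are assumptions. Because each grid value is written into every row, PDP also queries the model on synthetic rows, which is what makes the question of fooling it plausible.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_mod = X.copy()
        # Overwriting one column can create rows that never occur in real data.
        X_mod[:, feature] = value
        averages.append(predict(X_mod).mean())
    return np.array(averages)


# Toy model for illustration: prediction = 2 * x0 + x1.
model = lambda X: 2.0 * X[:, 0] + X[:, 1]
X_demo = np.array([[0.0, 1.0], [1.0, 3.0]])
pdp = partial_dependence(model, X_demo, feature=0, grid=[0.0, 1.0])
```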

### Analysis (5)
Use different perturbation approaches and compare the impact on being fooled.
- [Different perturbation approaches with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/compare_pertubation_approaches_with_boston_housing.ipynb)
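
Two perturbation strategies of the kind that could be compared are sketched below. They are illustrative assumptions and not necessarily the algorithms implemented in this repository; the function names `gaussian_perturb` and `marginal_perturb` are hypothetical.

```python
import numpy as np


def gaussian_perturb(X, rng, scale=1.0):
    """Add zero-mean Gaussian noise scaled by each column's standard deviation."""
    X = np.asarray(X, dtype=float)
    return X + rng.normal(size=X.shape) * X.std(axis=0) * scale


def marginal_perturb(X, rng):
    """Shuffle each column independently: marginals are kept,
    correlations between features are broken."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])


rng = np.random.default_rng(0)
X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
noisy = gaussian_perturb(X_demo, rng)
shuffled = marginal_perturb(X_demo, rng)
```

How far the perturbed rows drift from the real data distribution plausibly affects how easily a detector can separate them, which is the effect the comparison is meant to study.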

### Hyperparameter Sensitivity (10)
Analyze the impact of the hyperparameters of LIME and SHAP (e.g., hyperparameters of the local model and of the perturbation algorithms).

- [Hyperparameter sensitivity LIME](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/hyperparameter_sensitivity_lime.ipynb)
- [Hyperparameter sensitivity SHAP](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/hyperparameter_sensitivity_shap.ipynb)
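
One hyperparameter of this kind is the kernel width that LIME uses to weight perturbed samples by their distance to the explained instance. The sketch below assumes an exponential kernel of the form used by LIME's tabular explainer; treat the exact formula and the function name `lime_kernel` as assumptions rather than a reference to the library's API.

```python
import numpy as np


def lime_kernel(distances, kernel_width):
    """Exponential kernel: weight shrinks as distance grows relative to kernel_width."""
    distances = np.asarray(distances, dtype=float)
    return np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))


distances = np.array([0.0, 1.0, 2.0])
narrow = lime_kernel(distances, kernel_width=0.5)  # distant samples nearly ignored
wide = lime_kernel(distances, kernel_width=5.0)    # distant samples still count
```

A narrow kernel concentrates the local surrogate on samples near the instance, while a wide kernel lets distant, possibly off-distribution samples influence the explanation.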

### New Datasets (5)

Find at least two further (tabular) datasets with a risk of discrimination (that are not mentioned in the paper) and study the impact of fooling on them.
- [Gender discrimination dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/new_dataset_gender_discrimination.ipynb)
- [Heart failure prediction dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/new_dataset_heart_failure.ipynb)


## Datasets

- [Boston housing dataset](https://www.kaggle.com/altavish/boston-housing-dataset)
- [Gender discrimination dataset](https://www.kaggle.com/hjmjerry/gender-discrimination)
- [Heart failure prediction dataset](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data)

## Limitations / Further improvement

- The current framework can only deal with regression and binary classification tasks
- Only one biased input feature can be hidden
- Only numerical features are considered
- Currently, 3 perturbation algorithms are implemented