MarcSpeckmann · github-classroom · Jan 8, 2022
diff --git a/.gitignore b/.gitignore
@@ -127,3 +127,36 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
+
+
+# Created by https://www.toptal.com/developers/gitignore/api/pycharm+all,vim
+# Edit at https://www.toptal.com/developers/gitignore?templates=pycharm+all,vim
+
+### PyCharm+all ###
+# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider
+# Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839
+
+# User-specific stuff
+.idea/**/workspace.xml
+.idea/**/tasks.xml
+.idea/**/usage.statistics.xml
+.idea/**/dictionaries
+.idea/**/shelf
+
+# AWS User-specific
+.idea/**/aws.xml
+
+# Generated files
+.idea/**/contentModel.xml
+
+# Sensitive or high-churn files
+.idea/**/dataSources/
+.idea/**/dataSources.ids
+.idea/**/dataSources.local.xml
+.idea/**/sqlDataSources.xml
+.idea/**/dynamic.xml
+.idea/**/uiDe
+
+.idea/*
+
+.virtual_documents/
diff --git a/README.md b/README.md
@@ -1,22 +1,78 @@
-## iML-ws21-projects
+# Fooling LIME and SHAP
 
-Please refer to the StudIP slides `iML_lecture_projects.pdf`.
+Post-hoc explanation techniques that rely on input perturbations, such as LIME and SHAP, cannot be relied upon to reveal systematic errors and underlying biases of a model.
+This project re-implements the scaffolding technique from Slack et al., which can effectively hide the biases of a given classifier from these explanation techniques.
+- Paper Reference: https://arxiv.org/abs/1911.02508
+- Code Reference: https://github.com/dylan-slack/Fooling-LIME-SHAP
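
A minimal sketch of the scaffolding idea, assuming only what the paper describes: a wrapper answers real (in-distribution) inputs with a biased model and perturbed (off-manifold) inputs with an innocuous one. The class name `ScaffoldedModel`, the toy models, and the stand-in detector are hypothetical and not taken from this repository; in Slack et al. the detector is a trained classifier.

```python
import numpy as np


class ScaffoldedModel:
    """Routes inputs to a biased or an innocuous model (hypothetical names)."""

    def __init__(self, biased_predict, innocuous_predict, is_in_distribution):
        self.biased_predict = biased_predict
        self.innocuous_predict = innocuous_predict
        self.is_in_distribution = is_in_distribution

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        in_dist = np.asarray(self.is_in_distribution(X), dtype=bool)
        # Real data points get the biased prediction; perturbed points,
        # as generated by LIME or SHAP, get the innocuous prediction.
        return np.where(in_dist, self.biased_predict(X), self.innocuous_predict(X))


# Toy setup (assumption): feature 0 is the sensitive feature,
# feature 1 is an unrelated feature.
biased = lambda X: (X[:, 0] > 0).astype(int)
innocuous = lambda X: (X[:, 1] > 0).astype(int)
# Stand-in detector (assumption): points with integer-valued features
# count as "real"; everything else counts as perturbed.
detector = lambda X: np.all(X == np.round(X), axis=1)

model = ScaffoldedModel(biased, innocuous, detector)
real = np.array([[1.0, -1.0], [-1.0, 1.0]])
perturbed = real + 0.3

print(model.predict(real))       # decided by the sensitive feature
print(model.predict(perturbed))  # decided by the unrelated feature
```

Because an explainer mostly queries the model on perturbed points, it mainly observes the innocuous behaviour, while the predictions on real data remain biased.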
 
-## Template Repository for iML-Project Submissions
 
-### Requirements for Project Subissions
-* use python 3.9. As in the assignments, we recommend working inside a conda environment.
-  ```
-  conda create -n iML-project python=3.9
-  conda activate iML-project
-  ```
-* clean code
-* well documented
-* unit tested
-* all requirements well documented (use requirements.txt)
-* Installation instructions (in this README.md)
-* If feasible, run your experiments with several random seeds. Try to create reproducabile results.
+## Installation
 
-### Submission Deadline: Feb. 15th 2022, 00:00
+1. Clone the repository
 
+    ```bash
+    git clone https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon.git
+    ```
 
+2. Create the environment
+
+    ```bash
+    cd iml-ws21-projects-fool_the_lemon
+    conda env create -f "environment.yml"
+    conda activate iML-project
+    ```
+
+3. Run experiments
+
+    To run the experiment notebooks, start a JupyterLab server.
+
+    How to install JupyterLab: https://github.com/jupyterlab/jupyterlab
+
+    ```bash
+    jupyter-lab .
+    ```
+
+    To change the random seed of an experiment, edit the seed value at the beginning of the corresponding notebook.
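
A sketch of what a seed cell at the top of a notebook might look like. The names `SEED` and `set_seed` are hypothetical, and the notebooks in this repository may set their seed differently.

```python
import random

import numpy as np

SEED = 42  # hypothetical value; change it to rerun with another seed


def set_seed(seed):
    """Seed Python's and NumPy's global RNGs and return a dedicated generator."""
    random.seed(seed)
    np.random.seed(seed)
    return np.random.default_rng(seed)


# Re-seeding with the same value reproduces the same random draws.
rng = set_seed(SEED)
first = rng.normal(size=3)
rng = set_seed(SEED)
second = rng.normal(size=3)
print(np.array_equal(first, second))
```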
+
+
+
+## Experiments 
+
+### Reproduction (10)
+Implement the approach by writing a simple interface/framework and confirm your implementation using any (tabular) racism dataset (e.g., Boston Housing).
+- [Reproduction with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/repoduction_with_boston_housing.ipynb)
+
+### Extension (10)
+
+In addition to LIME and SHAP, incorporate PDP and analyze whether it can be fooled, too.
+- [PDP with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/fool_pdp_with_boston_housing.ipynb)
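
For context, a partial dependence value can be computed by fixing one feature to a grid value for every row and averaging the model's predictions. The sketch below uses only NumPy; the function name `partial_dependence` and the toy model are assumptions and do not reflect the implementation in the linked notebook.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction when `feature` is set to each grid value."""
    X = np.asarray(X, dtype=float)
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # overwrite the feature for every row
        values.append(predict(X_mod).mean())
    return np.array(values)


# Toy model (assumption): linear in both features.
predict = lambda X: 2.0 * X[:, 0] + X[:, 1]
X = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]])

pd_values = partial_dependence(predict, X, feature=0, grid=[0.0, 1.0])
print(pd_values)
```

The rows in `X_mod` combine a grid value with the other features of real rows, so they can lie off the data manifold. That is the property a scaffolded model could exploit.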
+
+### Analysis (5)
+Use different perturbation approaches and compare the impact on being fooled.
+- [Different perturbation approaches with boston housing dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/compare_pertubation_approaches_with_boston_housing.ipynb)
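
As an illustration of what a perturbation approach can look like, the sketch below adds Gaussian noise scaled by each feature's standard deviation. Whether this corresponds to any of the perturbation algorithms implemented in this repository is an assumption; the name `gaussian_perturb` and its parameters are hypothetical.

```python
import numpy as np


def gaussian_perturb(X, n_copies, scale, rng):
    """Return `n_copies` noisy copies of every row in X."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)  # per-feature spread of the data
    repeated = np.repeat(X, n_copies, axis=0)
    noise = rng.normal(size=repeated.shape) * std * scale
    return repeated + noise


rng = np.random.default_rng(0)
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_pert = gaussian_perturb(X, n_copies=4, scale=0.5, rng=rng)
print(X_pert.shape)  # (12, 2)
```

A larger `scale` pushes the samples further from the real data, which should make them easier for an out-of-distribution detector to recognize.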
+
+### Hyperparameter Sensitivity (10)
+Analyze the impact of the hyperparameters of LIME and SHAP (e.g., hyperparameters of the local model and of the perturbation algorithms).
+
+- [Hyperparameter sensitivity LIME](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/hyperparameter_sensitivity_lime.ipynb)
+- [Hyperparameter sensitivity SHAP](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/hyperparameter_sensitivity_shap.ipynb)
+
+### New Datasets (5)
+
+Find at least two further (tabular) datasets with a risk of discrimination (that are not mentioned in the paper) and study the impact of fooling on them.
+- [Gender discrimination dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/new_dataset_gender_discrimination.ipynb)
+- [Heart failure prediction dataset](https://github.com/automl-classroom/iml-ws21-projects-fool_the_lemon/blob/main/new_dataset_heart_failure.ipynb)
+
+
+## Datasets
+
+- [Boston housing dataset](https://www.kaggle.com/altavish/boston-housing-dataset)
+- [Gender discrimination dataset](https://www.kaggle.com/hjmjerry/gender-discrimination)
+- [Heart failure prediction dataset](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data)
+
+## Limitations / Further improvement
+
+- The current framework can only deal with regression and binary classification tasks
+- Only one biased input feature can be hidden
+- Only numerical features are considered
+- Currently, 3 perturbation algorithms are implemented