Repository containing the code projects for the course Machine Learning For Healthcare. The course was taught at ETH Zurich in Spring Semester 2024.
Group members: Sophia Houhamdi, Eva Sarlin, Lorenzo Tarricone
Each of the two projects contains a .md file that details how to run the code.
Project 1 focuses on interpretable and explainable classification for medical data using machine-learning methods. It consists of three parts:
- The first part involves tabular data analysis, specifically using the Kaggle Heart Failure Prediction Dataset, to explore features, handle data pitfalls, and implement interpretable models like Logistic Lasso Regression.
- The second part deals with imaging data analysis, utilizing the Kaggle Chest X-Ray Images (Pneumonia) dataset. It includes tasks such as CNN classification, Integrated Gradients, and Grad-CAM for interpretability.
- The third part involves summarizing findings, answering general questions about the methods used, and exploring techniques for interpretable classification with shallow and deep machine-learning models. The goal is to gain insights beyond predictive performance by understanding feature importance and model interpretability.
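
As an illustration of the interpretable model mentioned in the first part, here is a minimal sketch of L1-penalized (Lasso) logistic regression with scikit-learn. It is not the project's actual code: the data is synthetic, and the feature count, the regularization strength `C=0.1`, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a tabular dataset (assumption, not the Kaggle data):
# 5 features, of which only the first two influence the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
signal = 2.0 * X[:, 0] - 1.5 * X[:, 1]
y = (signal + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Standardize first so that coefficient magnitudes are comparable across features,
# then fit a logistic regression with an L1 penalty (smaller C = stronger penalty).
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
model.fit(X, y)

coefs = model.named_steps["logisticregression"].coef_.ravel()
accuracy = model.score(X, y)

for i, c in enumerate(coefs):
    print(f"feature_{i}: {c:+.3f}")
print(f"training accuracy: {accuracy:.3f}")
```

The L1 penalty tends to shrink the coefficients of uninformative features toward zero, so the surviving coefficients can be read as a rough ranking of feature importance.
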
Project 2 focuses on classifying patient status from ECG time series. We analyzed the publicly available PTB Diagnostic ECG Database and MIT-BIH Arrhythmia Database. The project consists of three parts:
- The first part builds several models for classifying the PTB dataset. These include classical models (Logistic Regression, Random Forest, Support Vector Machine) and more recent deep-learning architectures (CNN, LSTM, and Transformer encoder).
- The second part deals with creating good embeddings by exploring different representation-learning and transfer-learning techniques.
- The third part involves summarizing findings, answering general questions about the methods used, and exploring the pros and cons of all the models implemented.
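
To give a feel for the classical baselines listed in the first part, the sketch below compares the three model families on fixed-length signals with scikit-learn. It is an illustration only, not the project's code: the signals are synthetic single-peak curves rather than PTB recordings, and the signal length, noise level, and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for ECG beats (assumption): one Gaussian peak per signal,
# where class 1 has a wider peak than class 0, plus additive noise.
rng = np.random.default_rng(0)
n_samples, length = 400, 100
t = np.linspace(0.0, 1.0, length)
y = rng.integers(0, 2, size=n_samples)
width = np.where(y == 1, 0.08, 0.04)[:, None]
X = np.exp(-((t - 0.5) ** 2) / (2 * width**2))
X = X + rng.normal(scale=0.1, size=(n_samples, length))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Each raw time step is treated as one feature.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(kernel="rbf"),
}

# Fit each model and record its accuracy on the held-out split.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

On real ECG data, accuracy alone can be misleading when classes are imbalanced, so a metric such as F1 or balanced accuracy may be a better basis for comparison.
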