This project involves preprocessing MRI scans of patients with Parkinson's Disease (PD) and applying a Generative Adversarial Network (GAN) to generate synthetic MRI images. The main objectives are to augment the dataset and enhance the performance of machine learning models used for diagnosing Parkinson's Disease.
- Data Acquisition: Downloading and extracting MRI data.
- Preprocessing: Loading and padding MRI images, balancing the dataset using SMOTE, and augmenting the data.
- GAN Implementation: Training a GAN to generate synthetic MRI images.
- Data Storage: Saving preprocessed and augmented data in HDF5 format.
- Python 3.x
- Libraries:
  - NumPy
  - Pandas
  - nibabel
  - h5py
  - imbalanced-learn
  - torchio
  - PyTorch
You can install the required libraries using pip:
pip install numpy pandas nibabel h5py imbalanced-learn torchio torch
The MRI data is acquired from the following source:
The project automatically downloads and extracts the dataset into the specified folder.
- Loading Metadata: The patient metadata is loaded from a TSV file, which includes information on the patient status (Control or PD).
- Loading MRI Scans: MRI images are loaded from the dataset, and their dimensions are recorded.
- Padding: MRI scans are padded to ensure uniform dimensions.
- SMOTE Application: The dataset is balanced using the Synthetic Minority Over-sampling Technique (SMOTE), which synthesizes new minority-class samples by interpolating between existing ones.
- Data Augmentation: Various augmentation techniques are applied to enhance the dataset, including flipping, affine transformations, noise addition, and elastic deformations.
- Saving Data: The preprocessed and augmented MRI data is saved in HDF5 format for further use.
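
As an illustration of the padding step above, here is a minimal sketch that zero-pads each volume symmetrically to the largest shape found in the set. The helper names `pad_to_shape` and `pad_all`, the symmetric zero-padding strategy, and the toy array shapes are assumptions for the example; the project's own code may pad differently.

```python
import numpy as np


def pad_to_shape(volume, target_shape):
    """Zero-pad a 3D volume symmetrically so that it matches target_shape."""
    pad_width = []
    for size, target in zip(volume.shape, target_shape):
        if size > target:
            raise ValueError("volume is larger than the target shape")
        total = target - size
        before = total // 2
        # Any odd remainder goes to the trailing side of the axis.
        pad_width.append((before, total - before))
    return np.pad(volume, pad_width, mode="constant", constant_values=0)


def pad_all(volumes):
    """Pad every volume to the per-axis maximum and stack into one array."""
    target = tuple(max(v.shape[axis] for v in volumes) for axis in range(3))
    return np.stack([pad_to_shape(v, target) for v in volumes])


# Toy stand-ins for MRI scans with different dimensions (hypothetical shapes).
scans = [np.ones((4, 5, 6)), np.ones((6, 5, 4))]
batch = pad_all(scans)
print(batch.shape)
```

Padding with zeros keeps the original voxel values intact, so the sum of each padded volume equals the sum of the unpadded scan.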
The GAN consists of two main components:
- Generator: Generates synthetic MRI images from random noise.
- Discriminator: Classifies images as real (from the dataset) or fake (generated by the GAN).
The architecture uses 3D convolutional layers so that both networks operate directly on volumetric MRI data rather than on individual 2D slices.
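
To make the two components concrete, the following is a minimal PyTorch sketch of a 3D convolutional generator and discriminator. The layer counts, channel widths, the latent size `LATENT_DIM`, and the 16×16×16 output volume are all assumptions chosen to keep the example small; they are not the project's actual architecture.

```python
import torch
from torch import nn

LATENT_DIM = 64  # assumed size of the random noise vector


class Generator(nn.Module):
    """Maps a noise vector to a 1-channel 16x16x16 volume."""

    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            # (latent_dim, 1, 1, 1) -> (32, 4, 4, 4)
            nn.ConvTranspose3d(latent_dim, 32, kernel_size=4, stride=1, padding=0),
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
            # (32, 4, 4, 4) -> (16, 8, 8, 8)
            nn.ConvTranspose3d(32, 16, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm3d(16),
            nn.ReLU(inplace=True),
            # (16, 8, 8, 8) -> (1, 16, 16, 16)
            nn.ConvTranspose3d(16, 1, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z):
        # Reshape (N, latent_dim) noise into a 1x1x1 "volume" per sample.
        return self.net(z.view(z.size(0), -1, 1, 1, 1))


class Discriminator(nn.Module):
    """Maps a 1-channel 16x16x16 volume to a single real/fake logit."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # (1, 16, 16, 16) -> (16, 8, 8, 8)
            nn.Conv3d(1, 16, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            # (16, 8, 8, 8) -> (32, 4, 4, 4)
            nn.Conv3d(16, 32, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            # (32, 4, 4, 4) -> (1, 1, 1, 1)
            nn.Conv3d(32, 1, kernel_size=4, stride=1, padding=0),
        )

    def forward(self, x):
        return self.net(x).view(-1)  # one logit per sample


z = torch.randn(2, LATENT_DIM)
fake = Generator()(z)
logits = Discriminator()(fake)
print(fake.shape, logits.shape)
```

The discriminator returns raw logits rather than probabilities, which pairs naturally with `nn.BCEWithLogitsLoss` during training. Real MRI volumes are far larger than 16×16×16, so a practical model would need more up- and down-sampling stages.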
The training process includes the following steps:
- Loading the augmented data.
- Training the discriminator on real and synthetic images.
- Training the generator to produce images that can fool the discriminator.
- Saving checkpoints during training for resuming later.
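
The steps above can be sketched as a single training step plus checkpoint helpers. This is an illustrative outline, not the project's training script: the tiny networks `G` and `D`, the Adam hyperparameters, the random tensor standing in for the augmented data, and the checkpoint file name are all assumptions.

```python
import os
import tempfile

import torch
from torch import nn

latent_dim = 16  # assumed noise size

# Tiny stand-in networks: G outputs (N, 1, 8, 8, 8); D outputs one logit per sample.
G = nn.Sequential(
    nn.ConvTranspose3d(latent_dim, 8, 4, 1, 0), nn.ReLU(),
    nn.ConvTranspose3d(8, 1, 4, 2, 1), nn.Tanh(),
)
D = nn.Sequential(
    nn.Conv3d(1, 8, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv3d(8, 1, 4, 1, 0), nn.Flatten(),
)
criterion = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))


def train_step(real):
    n = real.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)
    fake = G(torch.randn(n, latent_dim, 1, 1, 1))

    # Discriminator: real images labelled 1, generated images labelled 0.
    opt_d.zero_grad()
    loss_d = criterion(D(real), ones) + criterion(D(fake.detach()), zeros)
    loss_d.backward()
    opt_d.step()

    # Generator: try to make the discriminator label fakes as real.
    opt_g.zero_grad()
    loss_g = criterion(D(fake), ones)
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()


def save_checkpoint(path, epoch):
    torch.save({"epoch": epoch, "G": G.state_dict(), "D": D.state_dict(),
                "opt_g": opt_g.state_dict(), "opt_d": opt_d.state_dict()}, path)


def load_checkpoint(path):
    ckpt = torch.load(path)
    G.load_state_dict(ckpt["G"])
    D.load_state_dict(ckpt["D"])
    opt_g.load_state_dict(ckpt["opt_g"])
    opt_d.load_state_dict(ckpt["opt_d"])
    return ckpt["epoch"] + 1  # epoch to resume from


# Random volumes in [-1, 1] stand in for a batch of augmented MRI data.
real_batch = torch.rand(4, 1, 8, 8, 8) * 2 - 1
ckpt_path = os.path.join(tempfile.mkdtemp(), "gan_checkpoint.pt")
for epoch in range(2):
    loss_d, loss_g = train_step(real_batch)
    save_checkpoint(ckpt_path, epoch)
next_epoch = load_checkpoint(ckpt_path)
```

Saving the optimizer states alongside the model weights matters when resuming, because Adam keeps running moment estimates that would otherwise restart from zero.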
To run the entire pipeline, execute the Python scripts in the following order:
- Data Acquisition and Preprocessing
- GAN Training
Ensure that the dataset has been downloaded and extracted before running the GAN training script.
Contributions are welcome! Please feel free to submit a pull request or open an issue for discussion.
This project is licensed under the MIT License. See the LICENSE file for more details.
- Special thanks to the authors and contributors of the libraries and datasets used in this project.