🔍 A Comparative Study of MLP, CNN, and Transformer Models for Medical Image Classification 🔍
Authors
Mingshu Liu, Kaibo Zhang, and Alek Bedard
Affiliation: McGill University. This project was carried out under the supervision of Professors Isabeau Prémont-Schwarz and Reihaneh Rabbany as part of the coursework for COMP551 Applied Machine Learning.
This project investigates the impact of neural network architectures and design decisions on medical image classification using the OrganAMNIST dataset. We evaluated models ranging from MLPs to CNNs and Transformers, exploring how architectural complexity, activation functions, regularization techniques, and input resolution influence performance. Our findings highlight the superiority of CNNs for spatial feature extraction and the transformative potential of pre-trained Vision Transformer (ViT) models for achieving state-of-the-art results.
Key contributions include:
- Analysis of MLP depth and performance trade-offs.
- Evaluation of regularization techniques (L1, L2).
- Examination of input normalization and image resolution effects.
- Design of a modified CNN architecture with tuned hyperparameters.
- Comparison of pre-trained ResNet101 and Vision Transformer models.
The OrganAMNIST dataset consists of grayscale abdominal CT images spanning 11 organ categories, framed as a multi-class classification task:
- Training Samples: 34,561
- Validation Samples: 6,491
- Test Samples: 17,778
Each image is resized to 28x28 pixels (base experiments) and optionally 128x128 pixels (high-resolution experiments). Exploratory analysis revealed class imbalance, requiring advanced regularization and preprocessing for effective training.
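A minimal loading sketch using the `medmnist` package; the normalization statistics shown are placeholders, not the values used in our experiments:

```python
import torchvision.transforms as T
from medmnist import OrganAMNIST

# Placeholder normalization statistics (0.5/0.5), for illustration only.
transform = T.Compose([T.ToTensor(), T.Normalize(mean=[0.5], std=[0.5])])

# Recent medmnist versions also accept a `size` argument (e.g. size=128)
# for the high-resolution variant used in our later experiments.
train_set = OrganAMNIST(split="train", download=True, transform=transform)
val_set   = OrganAMNIST(split="val",   download=True, transform=transform)
test_set  = OrganAMNIST(split="test",  download=True, transform=transform)

print(len(train_set), len(val_set), len(test_set))  # 34561, 6491, 17778
```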
MLP Architecture Analysis
- Explored the effect of increasing depth (no hidden layer, 1-layer, 2-layer models).
- Performance improved with depth, as deeper models extract richer features, but gains plateaued given the dataset's complexity.
- Achieved test accuracies: 55.41% (0-layer), 73.10% (1-layer), and 75.64% (2-layer).
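A minimal PyTorch sketch of the three depth variants; the hidden width of 256 is an illustrative assumption, as is ReLU here (activations are compared in the next section):

```python
import torch.nn as nn

def make_mlp(n_hidden, in_dim=28 * 28, hidden=256, n_classes=11):
    """Build an MLP with 0, 1, or 2 hidden layers for flattened 28x28 input."""
    layers, dim = [nn.Flatten()], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(dim, hidden), nn.ReLU()]
        dim = hidden
    layers.append(nn.Linear(dim, n_classes))  # logits over the 11 organ classes
    return nn.Sequential(*layers)

mlp_0, mlp_1, mlp_2 = (make_mlp(d) for d in (0, 1, 2))
```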
Activation Functions in MLPs
- Compared ReLU, Tanh, and Leaky ReLU.
- Leaky ReLU achieved the best performance because it keeps a small nonzero gradient for negative inputs, avoiding dead units.
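A sketch of how the comparison can be set up, reusing the same 2-hidden-layer topology and swapping only the nonlinearity (the widths are again illustrative):

```python
import torch.nn as nn

def make_mlp_with(act, in_dim=28 * 28, hidden=256, n_classes=11):
    """Same 2-hidden-layer MLP, varying only the activation function."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_dim, hidden), act,
        nn.Linear(hidden, hidden), act,
        nn.Linear(hidden, n_classes),
    )

variants = {
    "relu": make_mlp_with(nn.ReLU()),
    "tanh": make_mlp_with(nn.Tanh()),
    "leaky_relu": make_mlp_with(nn.LeakyReLU(0.01)),  # PyTorch's default slope
}
```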
Regularization Techniques
- Evaluated L1 and L2 regularization.
- L2 regularization preserved model capacity better than L1's sparsity-inducing penalty and generalized better.
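A sketch of how the two penalties attach in PyTorch: L2 via the optimizer's `weight_decay`, L1 added to the loss by hand. The coefficients and the toy model here are illustrative assumptions, not the tuned values:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
                      nn.Linear(256, 11))
criterion = nn.CrossEntropyLoss()

# L2 ("weight decay") is applied directly by the optimizer.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# L1 has no built-in optimizer hook, so it is added to the loss manually.
def loss_with_l1(logits, targets, l1_lambda=1e-5):
    l1 = sum(p.abs().sum() for p in model.parameters())
    return criterion(logits, targets) + l1_lambda * l1
```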
CNN vs. MLP Comparison
- Regular CNN models outperformed MLPs by leveraging spatial hierarchies.
- Achieved test accuracy: 79.2% with a 2-convolutional-layer CNN.
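A sketch of a 2-convolutional-layer CNN for 28x28 grayscale input; the channel counts, kernel sizes, and fully connected width are assumptions, since the report fixes only the layer count here:

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 28x28 -> 14x14
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                                  # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Linear(128, 11),                               # 11 organ classes
)

print(cnn(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 11])
```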
Modified CNN on 128x128 Data
- Enhanced CNN with tuned hyperparameters (`conv1=64`, `conv2=256`, `fc_neurons=512`, pooling kernel/stride=3).
- Achieved test accuracy: 88.7%, demonstrating significant gains over the MLP and the regular CNN.
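A sketch of the modified architecture using the tuned hyperparameters above; the 3x3 convolution kernels and padding are assumptions:

```python
import torch
import torch.nn as nn

modified_cnn = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=3),            # 128x128 -> 42x42
    nn.Conv2d(64, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=3),            # 42x42 -> 14x14
    nn.Flatten(),
    nn.Linear(256 * 14 * 14, 512), nn.ReLU(),         # fc_neurons=512
    nn.Linear(512, 11),
)

print(modified_cnn(torch.randn(1, 1, 128, 128)).shape)  # torch.Size([1, 11])
```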
Fine-Tuned Vision Transformer (ViT)
- Fine-tuned a pre-trained ViT model on inputs upscaled to 224x224.
- Achieved the best test accuracy: 94.5%, leveraging self-attention mechanisms for spatial and contextual feature extraction.
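A minimal head-replacement sketch for both fine-tuned backbones using torchvision; the ViT-B/16 variant and the specific ImageNet checkpoints are assumptions, since only the model families are fixed above:

```python
import torch.nn as nn
from torchvision.models import (vit_b_16, ViT_B_16_Weights,
                                resnet101, ResNet101_Weights)

# Replace each classification head with an 11-way linear layer.
vit = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
vit.heads.head = nn.Linear(vit.heads.head.in_features, 11)

resnet = resnet101(weights=ResNet101_Weights.IMAGENET1K_V2)
resnet.fc = nn.Linear(resnet.fc.in_features, 11)

# Grayscale OrganAMNIST images must be replicated to 3 channels and resized
# to 224x224 before being fed to either backbone.
```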
| Model | Test Accuracy | AUC |
|---|---|---|
| MLP (2 layers, ReLU) | 72.49% | - |
| CNN (28x28) | 79.20% | 0.974 |
| Modified CNN (128x128) | 88.70% | 0.991 |
| ResNet101 (fine-tuned) | 84.40% | 0.985 |
| Vision Transformer (ViT) | 94.50% | 1.00 |
Key takeaways:
- Normalization is critical for training stability and performance.
- CNNs excel in medical imaging tasks by capturing spatial hierarchies and feature patterns.
- Transformers like ViT outperform CNNs by leveraging attention mechanisms, especially with higher-resolution data.
Future explorations could include:
- Hybrid CNN-Transformer architectures.
- Advanced interpretability tools like Grad-CAM.
- Addressing dataset imbalance through data augmentation.