Simple Fully Connected Neural Network implemented in MATLAB
Objective: To understand the effect of hyper-parameters (Dropout, Learning Rate, Momentum, Weight Decay) on neural network convergence and generalization.
Dataset: MNIST subset
- Training set: 3000 samples (300 per class x 10 classes)
- Validation set: 1000 samples (100 per class x 10 classes)
- Test set: 3000 samples (300 per class x 10 classes)
Model Architecture:
Inputs [784] -> FCN1[500] -> Sigmoid -> Dropout[0.5] -> FCN2[500] -> Sigmoid -> Dropout[0.5] -> Outputs[10]
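Below is a minimal sketch of one training-mode forward pass through this architecture, assuming inverted dropout and a softmax output feeding the cross-entropy loss; all variable names and initializations are illustrative, not the repo's actual ones.

```matlab
% Forward-pass sketch: 784 -> 500 -> Sigmoid -> Dropout -> 500 -> Sigmoid
% -> Dropout -> 10 (softmax). Names and initialization are illustrative.
rng(0);
x  = rand(784, 1);                        % one input sample (column vector)
W1 = 0.01 * randn(500, 784); b1 = zeros(500, 1);
W2 = 0.01 * randn(500, 500); b2 = zeros(500, 1);
W3 = 0.01 * randn(10,  500); b3 = zeros(10, 1);
p  = 0.5;                                 % dropout probability

sigmoid = @(z) 1 ./ (1 + exp(-z));

h1 = sigmoid(W1 * x + b1);
m1 = (rand(size(h1)) > p) / (1 - p);      % inverted dropout mask (assumed)
h1 = h1 .* m1;

h2 = sigmoid(W2 * h1 + b2);
m2 = (rand(size(h2)) > p) / (1 - p);
h2 = h2 .* m2;

z3 = W3 * h2 + b3;
y  = exp(z3 - max(z3)); y = y / sum(y);   % softmax over the 10 classes
```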
Implementation:
- Modular implementation of forward propagation and backpropagation, trained with stochastic gradient descent (SGD); see the sketch below.
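The modular layer idea can be sketched as follows: each layer pairs a forward rule with a backward rule, so layers stack freely. The function names, layer sizes, and file layout here are assumptions for illustration, not the repo's actual API.

```matlab
% demo_backprop.m -- forward/backward sketch on a reduced 784-500-10 net.
function demo_backprop
    rng(0);
    x = rand(784, 1);                     % one input sample
    t = zeros(10, 1); t(3) = 1;           % one-hot target (illustrative)

    W1 = 0.01 * randn(500, 784);
    W2 = 0.01 * randn(10, 500);

    % Forward pass: each layer is a pure function.
    a1 = W1 * x;   h1 = sigmoid_fwd(a1);
    a2 = W2 * h1;  y  = softmax_fwd(a2);

    % Backward pass: softmax + cross-entropy reduces to delta = y - t.
    d2  = y - t;
    gW2 = d2 * h1';                       % gradient w.r.t. W2
    d1  = (W2' * d2) .* sigmoid_bwd(a1);
    gW1 = d1 * x';                        % gradient w.r.t. W1

    fprintf('loss = %.4f, ||gW1|| = %.4f, ||gW2|| = %.4f\n', ...
        -sum(t .* log(y + eps)), norm(gW1, 'fro'), norm(gW2, 'fro'));
end

function h = sigmoid_fwd(a)
    h = 1 ./ (1 + exp(-a));
end

function g = sigmoid_bwd(a)
    s = sigmoid_fwd(a);
    g = s .* (1 - s);
end

function y = softmax_fwd(a)
    e = exp(a - max(a));
    y = e / sum(e);
end
```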
Architecture Variations:
- The number of layers can be changed by adjusting the parameters in the 'run.m' file.
- Activation units can be changed by specifying the model architecture in 'define_model.m'.
- The loss is the cross-entropy error function (negative log-likelihood).
- Other parameters: learning rate, momentum, weight decay, dropout, and number of epochs (an update-step sketch follows this list).
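For reference, one SGD step combining the learning rate, momentum, and weight decay hyper-parameters can be sketched as below; the concrete values and variable names are assumptions for illustration.

```matlab
% One SGD update with momentum and weight decay (illustrative values).
lr = 0.1; mu = 0.9; lambda = 1e-4;        % learning rate, momentum, decay
W  = 0.01 * randn(500, 784);              % a weight matrix
vW = zeros(size(W));                      % momentum (velocity) buffer
gW = randn(size(W));                      % placeholder gradient from backprop

vW = mu * vW - lr * (gW + lambda * W);    % decay enters as an L2 gradient term
W  = W + vW;                              % apply the velocity update
```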
Results:
Training with all the hyper-parameters listed in the table above gave the best validation accuracy (94%); test accuracy with this model is 92.7%.
Hyper-Parameter Effects: