Author: Tianhao Liu (20205784)
Email: [email protected] | [email protected]
In this project, I tried two approaches to classifying song genres. One is to build deep learning models from scratch, using CNN, MLP, LSTM, and GRU. The other is to fine-tune existing pre-trained models (VGG19, ResNeXt, and SqueezeNet).
- Download the GTZAN dataset to this project folder.
- Unzip the downloaded dataset. The unzipped folder should be named `Data`.
- If the dataset is located somewhere else or has a different name, write its path in the config file `hparams.yaml`. You need to modify `audio_dir` and `image_dir` in it.
- `main.ipynb` is all you need to run.
- After training, the trained model is saved in the folder `checkpoints`, and its loss record is saved in the folder `logs`.
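Before starting a long training run, it can help to confirm that the two directories in the config actually exist. The snippet below is a minimal sketch, not part of this repository: the function name `missing_dirs` and the example paths are hypothetical, and it assumes `audio_dir` and `image_dir` are top-level keys in `hparams.yaml`.

```python
from pathlib import Path


def missing_dirs(hparams, keys=("audio_dir", "image_dir")):
    """Return the keys whose value is absent or is not an existing directory."""
    missing = []
    for key in keys:
        value = hparams.get(key)
        if not value or not Path(value).is_dir():
            missing.append(key)
    return missing


# Hypothetical values; replace them with the paths from your own hparams.yaml.
example = {"audio_dir": "Data/genres_original", "image_dir": "Data/images_original"}
print(missing_dirs(example))  # lists the keys that still need fixing
```

If PyYAML is installed, the dictionary can be obtained with `yaml.safe_load(open("hparams.yaml"))` instead of being written by hand.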
Model | Status |
---|---|
MLP | ✅ |
CNN | ✅ |
LSTM | ✅ |
GRU | ✅ |
- Install all the packages declared in `requirements.txt`.
- I have fine-tuned three pre-trained models: `resnext`, `vgg19`, and `squeezenet`. You can find them in `finetune-resnext.ipynb`, `finetune-vgg.ipynb`, and `finetune-squeezenet.ipynb`, respectively.
Model | Status |
---|---|
VGG | ✅ |
ResNext | ✅ |
SqueezeNet | ✅ |
Feature | Status |
---|---|
Early Stop | ✅ |
Batch training | ✅ |
Checkpoint | ✅ |
Log (loss) | ✅ |
Train-test-split | ✅ |
Evaluation | ✅ |
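
For readers unfamiliar with early stopping, here is a minimal, framework-agnostic sketch of the idea. It is not the code used in the notebooks: the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the loss values are all assumptions made for illustration. Training stops once the validation loss has failed to improve for `patience` consecutive epochs.

```python
class EarlyStopping:
    """Patience-based early stopping on a validation loss (illustrative sketch)."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best = float("inf")      # best validation loss seen so far
        self.bad_epochs = 0           # consecutive epochs without improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, used only to exercise the class.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In a real training loop, the point where `best` is updated is also a natural place to save a checkpoint, so that the saved model corresponds to the lowest validation loss.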