This repository is used for anomalous sound detection task and provides training, prediction, prediction GUI, prediction Web GUI, hyperparameters tuning, and automatic calculate threshold.
pip install -r requirements.txt
Please prepare the dataset according to the following examples.
dataset/
├── train
│ └── normal
│ ├── wav1.wav
│ ├── wav2.wav
│ └── wav3.wav
├── val
│ └── normal
│ ├── wav1.wav
│ ├── wav2.wav
│ └── wav3.wav
└── test
├── abnormal
│ ├── wav1.wav
│ ├── wav2.wav
│ └── wav3.wav
└── normal
├── wav1.wav
├── wav2.wav
└── wav3.wav
This repository provides the default configuration, which is SPEECHCOMMANDS.
All parameters are in the YAML file.
You can override parameters by argparse while running.
python main.py --config config.yaml --str_kwargs mode=train #override mode as 100
python main.py --config config.yaml --num_kwargs max_epochs=100 #override training iteration as 100
python main.py --config config.yaml --bool_kwargs early_stopping=False #override early_stopping as False
python main.py --config config.yaml --str_list_kwargs classes=1,2,3 #override classes as 1,2,3
python main.py --config config.yaml --dont_check #don't check configuration
python main.py --config config.yml --str_kwargs mode=train # or you can set train as the value of mode in configuration
python main.py --config config.yml --str_kwargs mode=predict,root=FILE # predict a file
python main.py --config config.yml --str_kwargs mode=predict,root=DIRECTORY # predict files in the folder
python main.py --config config.yml --str_kwargs mode=predict_gui # will create a tkinter window
python main.py --config config.yml --str_kwargs mode=predict_gui --bool_kwargs web_interface=True #will create a web interface by Gradio
python main.py --config config.yaml --str_kwargs mode=tuning #the hyperparameter space is in the configuration