Skip to content

Rework of the original Infant Cries implementation.

Notifications You must be signed in to change notification settings

timherzig/infant_cry

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Infant Cry

Implementation for the detection of infant cries using the TRILLsson speech representations.

Installation

apt-get update && apt-get install -y libsndfile1 ffmpeg
conda create -p ./venv python=3.9
conda activate ./venv
pip install -r requirements.txt

Running

To run a training, call:

python run.py --config 'path/to/config.yaml'

with the config linking to compatible data.

Results

Training using the configuration found in 'config/config_cluster.yaml' the model achieves an F1 score of 0.899 on previously unseen data (unseen babies). This model was trained on non-augmented data, with a batch size of 8 and a learning rate of 0.001. The model was trained on a single GPU (RTXA6000 48GB). Other experiments were run using augmented data which previously had led to improved performance. Augmentation methods can be found the repository https://github.com/timherzig/speech_augment, and using generated room impulse responses from a modified version of anton-jeran/FAST-RIR found here: https://github.com/timherzig/FAST-RIR.

About

Rework of the original Infant Cries implementation.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages