This notebook implements a Gaussian mixture model maximum likelihood (GMM-ML) classifier for synthetic (spoofed) speech detection. This approach uses regular mel frequency cepstral coefficients (MFCC) features and gives the best performance on the ASVspoof 2015 dataset among the standard classifiers (GMM-SV, GMM-UBM, ...). For more background information see: Hanilçi, Cemal, Tomi Kinnunen, Md Sahidullah, and Aleksandr Sizov. "Classifiers for synthetic speech detection: a comparison." In INTERSPEECH 2015. The scripts use the Python package Bob.Bio.SPEAR 2.04 for speaker recogntion.
This work is part of the "DDoS Resilient Emergency Dispatch Center" project at the University of Houston, funded by the Department of Homeland Security (DHS).
April 19, 2015
Questions to: lorenzo [dot] rossi [at] gmail [dot] com