m-usamasaleem/Spotizer-Flask

Spotizer

Speaker diarization is the process of separating an audio file that contains several speakers into distinct audio files, one per speaker. Partitioning audio by speaker has applications across many domains, and various methods have been proposed for it over the years. Deep neural networks combined with deep embeddings such as x-vectors have proven fruitful in such systems because the embeddings distinguish one speaker from another. These systems use the time delay neural network (TDNN) architecture, which classifies patterns in a shift-invariant way, that is, regardless of where in time they occur, and this helps it identify different speakers. In this project, we extend the use of the TDNN architecture for speaker diarization by implementing ECAPA-TDNN, an improvement on the earlier TDNN model. Our results show that ECAPA-TDNN outperforms the TDNN model.
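To make the role of speaker embeddings more concrete, here is a minimal sketch of how per-segment embeddings could be grouped by speaker. It is an illustration only and is not taken from this project's code: the function names (`cosine_similarity`, `assign_speakers`), the greedy grouping strategy, the `threshold` value, and the toy 3-dimensional vectors are all assumptions. In a real system the embeddings would come from a trained model such as ECAPA-TDNN and would be much higher dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_speakers(embeddings, threshold=0.8):
    """Greedily label each segment embedding with a speaker index.

    A segment joins the most similar existing speaker if the similarity
    to that speaker's reference embedding is at least `threshold`;
    otherwise it starts a new speaker. (Hypothetical helper.)
    """
    references = []  # first embedding seen for each speaker
    labels = []
    for emb in embeddings:
        best_label, best_score = None, threshold
        for label, ref in enumerate(references):
            score = cosine_similarity(emb, ref)
            if score >= best_score:
                best_label, best_score = label, score
        if best_label is None:
            references.append(emb)
            best_label = len(references) - 1
        labels.append(best_label)
    return labels


# Made-up embeddings for four consecutive audio segments.
segments = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.8, 0.2, 0.1],
    [0.0, 1.0, 0.2],
]
print(assign_speakers(segments))  # prints [0, 1, 0, 1]
```

Segments that receive the same label would then be concatenated into one output file per speaker. Greedy assignment is used here only because it is short; clustering over all segments at once is a common alternative.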

Installation

Create a Python virtual environment

Python3 Virtual Environments

After your environment is created and activated, run:

  pip install -r requirements.txt

Once the requirements are installed, start the Flask server as described in the next section.

Run Flask server

  • Tell Flask which application to load
    • Command on Linux: "export FLASK_APP=app.py"
  • (optional) Put Flask into development mode, so that changes to the code update the site automatically
    • Command on Linux: "export FLASK_DEBUG=1"
  • Run command "flask run"
  • Navigate to the link printed in the console
