This demo sends the same audio to two models: facebook/wav2vec2-base-960h, the default model of the HuggingFace automatic-speech-recognition pipeline, and jonatasgrosman/wav2vec2-large-xlsr-53-english.
It uses caikit to serve the models.
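To make the comparison concrete, here is a minimal sketch that runs one audio file through both models using the HuggingFace `pipeline` API directly. It does not go through caikit and is not the code in app.py; the helper names `load_transcriber` and `compare` are hypothetical, and decoding an audio file from a path assumes ffmpeg is available on your system.

```python
from typing import Callable, Dict

# The two models named above.
MODEL_IDS = [
    "facebook/wav2vec2-base-960h",
    "jonatasgrosman/wav2vec2-large-xlsr-53-english",
]


def load_transcriber(model_id: str) -> Callable[[str], str]:
    """Build a function that transcribes an audio file with the given model.

    Hypothetical helper: transformers is imported lazily so this module can
    be imported without downloading anything.
    """
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)

    def transcribe(audio_path: str) -> str:
        # The ASR pipeline returns a dict whose "text" key holds the transcript.
        return asr(audio_path)["text"]

    return transcribe


def compare(audio_path: str, transcribers: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the same audio to every transcriber and collect the outputs by name."""
    return {name: fn(audio_path) for name, fn in transcribers.items()}
```

A possible usage, assuming a local file named sample.wav (a placeholder, not a file shipped with the demo): `compare("sample.wav", {m: load_transcriber(m) for m in MODEL_IDS})`. The first call downloads both models, which can take a while.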
The following tools are required:
Note: To avoid conflicts in your environment, it is advisable to create and activate a virtual environment (venv) before installing the dependencies.
Install the dependencies: pip install -r requirements.txt
In one terminal, start the runtime server:
python app.py
Wait for it to download the models and start the Gradio server:
Running on local URL: http://127.0.0.1:8080
Open that URL in your browser, then select a sample audio clip or record your own to test it out!
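Because the first start can take some time while the models download, it may help to poll the local URL until the server answers. The following is a small standard-library sketch; `wait_for_server` is a hypothetical helper that is not part of the demo, and the timeout values are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout: float = 300.0, interval: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse.

    Hypothetical helper, not part of the demo. Returns True if the server
    responded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not up yet (connection refused, timeout, ...): retry.
            pass
        time.sleep(interval)
    return False
```

For example, `wait_for_server("http://127.0.0.1:8080")` could be called from a second terminal after starting python app.py.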