Voice to Voice chatbot using Whisper + Open AI + 11labs
A voice-to-voice chatbot powered by OpenAI GPT-3.5 Turbo and ElevenLabs. The chatbot transcribes the audio input using OpenAI's Whisper ASR system, generates a response with GPT-3.5 Turbo, and then converts the response into spoken audio using the ElevenLabs Text-to-Speech API.
- Clone this repository.
- Install the required packages by running
pip install -r requirements.txt
(you should have Python 3.6 or later installed). - Create a
.env
file in the root directory of this project and set your OpenAI API key and ElevenLabs API key.
Example .env
file:
You can start the Flask server by running the command: python app.py
. This will start the server on http://127.0.0.1:5000/
.
GET /
- Render the index page.POST /transcribe
- Transcribe the given audio to text using Whisper. Requires afile
parameter with the file data.POST /ask
- Generate a ChatGPT response from the given conversation, then convert it to audio using ElevenLabs. Requires aconversation
parameter with the conversation data.GET /listen/<filename>
- Return the audio file located at the given filename.
This project is licensed under the terms of the MIT license.