This tool allows you to isolate speakers from a YouTube video using AssemblyAI's Speaker Diarization API.
- Python 3.x
- AssemblyAI API key
-
Clone the repository:
git clone https://github.com/AayushGupta16/speaker-isolator-api.git
-
Navigate to the project directory:
cd speaker-isolator-api
-
Create a virtual environment:
python3 -m venv venv
-
Activate the virtual environment:
-
For Windows:
venv\Scripts\activate
-
For macOS and Linux:
source venv/bin/activate
-
Install the required dependencies:
python3 -m pip install -r requirements.txt
-
Create a .env file in the project root directory.
-
Add your AssemblyAI API key to the .env file:
ASSEMBLY_AI_API_KEY="YOUR_API_KEY"
Replace "YOUR_API_KEY" with your actual AssemblyAI API key, including the double quotes.
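How the application reads this key is not shown here; many Flask projects rely on the third-party python-dotenv package for it. As a minimal standard-library sketch of the same idea, the hypothetical helpers `parse_env_file` and `get_api_key` below read `ASSEMBLY_AI_API_KEY` from the environment or from the .env file, and show why the surrounding double quotes are harmless:

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, stripping optional surrounding quotes."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one matching pair of surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def get_api_key(env_path=".env"):
    """Return the key from the process environment, else from the .env file."""
    key = os.environ.get("ASSEMBLY_AI_API_KEY")
    if not key and Path(env_path).is_file():
        key = parse_env_file(env_path).get("ASSEMBLY_AI_API_KEY")
    if not key:
        # Hypothetical error type; the project may report this differently.
        raise RuntimeError("ASSEMBLY_AI_API_KEY is not set")
    return key
```

Calling `get_api_key()` from the project root returns the key without its quotes, or raises if the key is missing in both places.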
-
Run the Flask application:
python3 main.py
-
Make a POST request to the /process_video endpoint with the following JSON payload:
{ "youtube_url": "YOUR_YOUTUBE_VIDEO_URL" }
You can use the following curl command to make the request:
curl -X POST -H "Content-Type: application/json" -d '{"youtube_url": "YOUR_YOUTUBE_VIDEO_URL"}' -o speaker_segments.zip http://localhost:8000/process_video
Replace YOUR_YOUTUBE_VIDEO_URL with the actual URL of the YouTube video you want to process.
-
The API will process the YouTube video, isolate speaker segments, and return a ZIP file named speaker_segments.zip containing the output audio files.
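The same request can be made from Python. The sketch below uses only the standard library and assumes the server is running locally on port 8000, as in the curl command; the function names `build_request` and `download_segments` are illustrative and not part of the project.

```python
import json
import urllib.request
import zipfile


def build_request(youtube_url, base_url="http://localhost:8000"):
    """Build the POST request for /process_video with a JSON body."""
    body = json.dumps({"youtube_url": youtube_url}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/process_video",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download_segments(youtube_url, out_path="speaker_segments.zip"):
    """Send the request, save the returned ZIP, and list the files inside it."""
    with urllib.request.urlopen(build_request(youtube_url)) as response:
        with open(out_path, "wb") as out_file:
            out_file.write(response.read())
    with zipfile.ZipFile(out_path) as archive:
        return archive.namelist()


# Example (requires the Flask application to be running):
# print(download_segments("YOUR_YOUTUBE_VIDEO_URL"))
```

Processing can take a while for long videos, so a caller may want to pass a generous `timeout` to `urlopen`.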
The tool includes error handling for the following scenarios:
- Invalid request payload
- Invalid YouTube URL
- Missing API key
- Errors during YouTube video download
- Errors during audio upload to AssemblyAI
- Errors during transcription process
- Errors during speaker segment creation
In case of an error, an appropriate HTTP status code and error description will be returned.
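The first two checks in the list above might look roughly like the following. This is a sketch under stated assumptions, not the project's actual code: the function name `validate_payload`, the accepted host list, the error messages, and the use of status code 400 are all illustrative.

```python
from urllib.parse import urlparse

# Assumed set of accepted hosts; the project's real rule may differ.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def validate_payload(payload):
    """Return None if the payload looks valid, else (status_code, description)."""
    # 400 is an assumed status code for client errors.
    if not isinstance(payload, dict) or "youtube_url" not in payload:
        return 400, "Request body must be a JSON object with a 'youtube_url' field"
    url = payload["youtube_url"]
    if not isinstance(url, str):
        return 400, "'youtube_url' must be a string"
    parsed = urlparse(url)
    # hostname is None for strings that are not URLs, so those fail here too.
    if parsed.scheme not in ("http", "https") or parsed.hostname not in YOUTUBE_HOSTS:
        return 400, "Invalid YouTube URL"
    return None
```

Returning a status code together with a description keeps the check independent of the web framework, so the route handler decides how to turn it into a response.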
The tool uses Python's logging module for logging. Log messages are output to the console with timestamps and log levels.
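A configuration that produces this kind of console output could look like the sketch below. The exact format string, the logger name `speaker_isolator`, and the helper `configure_logging` are assumptions; the project's own settings may differ.

```python
import logging

# Assumed format: timestamp, log level, then the message.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def configure_logging(level=logging.INFO):
    """Send log records to the console using LOG_FORMAT."""
    logging.basicConfig(level=level, format=LOG_FORMAT)
    # Hypothetical logger name used for illustration.
    return logging.getLogger("speaker_isolator")


# Example:
# logger = configure_logging()
# logger.info("Downloading audio")
```

`logging.basicConfig` attaches a console handler to the root logger only if none is configured yet, so it is best called once at startup.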
This project is distributed under the MIT License. See the LICENSE file for more information.