The first and only X Space transcription tool that can identify speakers and summarize conversations.
https://github.com/robbie-wasabi/xspacecadet
Record, transcribe (with speaker identification), and chat directly with X Space transcripts.
- Record X Spaces: Bot joins the space and records the audio while noting the current speaker.
- Audio Processing: Download and process audio using twspace-dl.
- Transcription: Transcribe audio using Insanely Fast Whisper.
- Speaker Identification: Identify speakers in the transcript using captured frames and metadata.
- Chat with Transcript: OpenAI integration to chat with the transcript.
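The speaker identification step can be pictured as matching two timelines: the speaker turns noted while recording and the timestamped segments produced by transcription. The following is a minimal sketch of that idea, not the project's actual code. The `Interval` type, the `assign_speakers` helper, and the sample data are all hypothetical, and it assumes both timelines are in seconds from the start of the recording.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float  # seconds from the start of the recording
    end: float
    label: str  # a speaker handle (for turns) or transcript text (for segments)


def overlap(a: Interval, b: Interval) -> float:
    """Length in seconds of the time shared by two intervals."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def assign_speakers(segments, speaker_turns):
    """Pair each transcript segment with the speaker turn that overlaps it most."""
    labelled = []
    for seg in segments:
        best = max(speaker_turns, key=lambda turn: overlap(seg, turn), default=None)
        if best is None or overlap(seg, best) == 0.0:
            speaker = "unknown"  # nobody was marked as speaking during this segment
        else:
            speaker = best.label
        labelled.append((speaker, seg.label))
    return labelled


# Hypothetical sample data for illustration only.
turns = [Interval(0.0, 5.0, "@alice"), Interval(5.0, 12.0, "@bob")]
segments = [
    Interval(0.5, 4.0, "welcome everyone"),
    Interval(4.5, 9.0, "thanks for having me"),
]
for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

Picking the turn with the largest overlap keeps a segment that straddles a speaker change attached to whoever spoke for most of it.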
Listening to people ramble on X Spaces is often grueling, but there are certainly diamonds in the rough. For this reason, I searched for tools that record Spaces but couldn't find anything that could actually identify speakers and summarize conversations, so I built a solution myself. I've already dumped too much time into this and it suits my needs, but I figured I'd share it in case anyone else finds it useful. Depending on how much interest there is, I'll continue to improve it.
- Recording Termination: Currently, the only way to stop a recording is to kill the application. We're working on implementing a more graceful shutdown method.
- Live Spaces Only: Only works for Spaces in progress because the bot needs to join the Space and identify speakers in real time. I would love for this to work with completed/recorded Spaces, but that might be tricky since the speaking animations used to identify speakers aren't as reliable.
- User Experience: The current UX could be significantly improved. Streamlit was chosen to keep development time to a minimum.
- Code Quality: The codebase is pretty shitty, but it's in an okay place to begin cleaning it up.
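On the recording termination point, one common pattern for a graceful stop is a shared flag that a signal handler sets and the recording loop checks between chunks. This is only a sketch of that pattern and not the planned implementation. `record_loop`, `request_stop`, and `capture_chunk` are hypothetical names, and the demo stops itself after three chunks so it can run without a real recording.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum=None, frame=None):
    """Signal handler: ask the loop to finish instead of killing the process."""
    stop_requested.set()


def install_signal_handlers():
    """Route Ctrl+C and SIGTERM to request_stop (call this from the main thread)."""
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)


def record_loop(capture_chunk, poll_seconds=0.1):
    """Call capture_chunk() until a stop is requested; return the chunk count."""
    chunks = 0
    while not stop_requested.is_set():
        capture_chunk()
        chunks += 1
        # Sleep between chunks, but wake immediately if a stop arrives.
        stop_requested.wait(poll_seconds)
    return chunks


# Demo with a stand-in capture function that requests a stop after three chunks.
captured = []


def fake_capture():
    captured.append(len(captured))
    if len(captured) == 3:
        request_stop()


print("chunks recorded before stopping:", record_loop(fake_capture, poll_seconds=0.01))
```

In a real run, `install_signal_handlers()` would be called before the loop so that Ctrl+C lets the current chunk finish and any cleanup run.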
- Python 3.7 or later, but no higher than Python 3.11.10: Insanely Fast Whisper has issues with newer versions of Python.
- Chrome WebDriver: Required for Selenium to automate browser actions.
- ffmpeg: Needed for audio conversion.
- X API Key: Basic plan required for accessing X API endpoints.
- Hugging Face API Key: Required for accessing Hugging Face API endpoints.
- OpenAI API Key: Needed for chatbot.
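A quick way to check the local prerequisites before installing is a small script like the one below. It is a sketch, not part of the project. The executable names `ffmpeg` and `chromedriver` are assumptions about how those tools appear on your PATH, and recent Selenium releases can fetch a driver themselves, so a missing `chromedriver` is not necessarily a problem.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 11, 10)


def python_version_ok(version=None):
    """True if the version falls inside the range listed above."""
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return MIN_PYTHON <= version <= MAX_PYTHON


def missing_tools(tools=("ffmpeg", "chromedriver")):
    """Return the tools that cannot be found on PATH (names are assumptions)."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Python version supported:", python_version_ok())
print("Tools not found on PATH:", missing_tools() or "none")
```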
- Clone the Repository:
    git clone https://github.com/robbie-wasabi/xspacecadet.git
    cd xspacecadet
- Install Dependencies:
    # install python 3.11
    brew install python@3.11
    # create virtual environment
    python3.11 -m venv .venv
    source .venv/bin/activate
    # install dependencies to virtual environment
    python3.11 -m pip install -r requirements.txt
- Set Up Environment Variables:
  Create a .env file in the root directory and add the following variables:
  - X_BEARER: Your X API bearer token.
  - X_COOKIE_FILE: Path to your X cookie file.
  - HF_TOKEN: Your Hugging Face API token.
  - OPENAI_API_KEY: Your OpenAI API key.
  Example .env file:
    X_BEARER=your_x_bearer_token
    X_COOKIE_FILE=./cookies.txt
    HF_TOKEN=your_hugging_face_token
    OPENAI_API_KEY=your_openai_api_key
Note: You can obtain these tokens from:
- X API Bearer Token: X Developer Portal
- X Cookie File: Use the Get cookies.txt LOCALLY Chrome extension.
- Hugging Face Token: Hugging Face Tokens
- OpenAI API Key: OpenAI API Keys
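To confirm that the .env file contains everything listed above, a small standard-library check along these lines can help. It is a sketch and makes no claim about how the project itself loads its settings. `parse_env_file` and `missing_vars` are hypothetical helpers, and the parser only handles simple `KEY=value` lines.

```python
from pathlib import Path

REQUIRED_VARS = ("X_BEARER", "X_COOKIE_FILE", "HF_TOKEN", "OPENAI_API_KEY")


def parse_env_file(path):
    """Read simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


env_path = Path(".env")
settings = parse_env_file(env_path) if env_path.exists() else {}
print("Missing settings:", missing_vars(settings) or "none")
```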
XSpaceCadet can be used via a shitty command-line interface or the Streamlit web application.
To record a Space:
python3.11 main.py record <space_id> [cookie_file] [options]
Example:
python3.11 main.py record https://x.com/i/spaces/AAAAAAAAAAAAA ./cookies.txt
To transcribe the recorded audio and identify speakers:
python3.11 main.py transcribe <space_id>
Example:
python3.11 main.py transcribe AAAAAAAAAAAAA
To fetch metadata for a space:
python3.11 main.py fetch-metadata <space_id>
Example:
python3.11 main.py fetch-metadata AAAAAAAAAAAAA
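The examples above pass a full URL to record and a bare ID to the other commands. If you script these commands, a helper like the following can normalize either form to the ID. This is a sketch: `extract_space_id` is a hypothetical helper, and it assumes Space IDs are plain alphanumeric strings found after `/i/spaces/` in the URL.

```python
import re

# Assumption: IDs are alphanumeric and follow "/i/spaces/" in a Space URL.
SPACE_URL_RE = re.compile(r"/i/spaces/([A-Za-z0-9]+)")
SPACE_ID_RE = re.compile(r"[A-Za-z0-9]+")


def extract_space_id(value):
    """Return the Space ID from either a Space URL or a bare ID."""
    value = value.strip()
    match = SPACE_URL_RE.search(value)
    if match:
        return match.group(1)
    if SPACE_ID_RE.fullmatch(value):
        return value
    raise ValueError(f"not a Space URL or ID: {value!r}")


print(extract_space_id("https://x.com/i/spaces/AAAAAAAAAAAAA"))  # AAAAAAAAAAAAA
```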
Start the Streamlit application:
python3.11 -m streamlit run app.py
- Open your web browser and navigate to http://localhost:8501.
- Configure your tokens and settings in the sidebar:
  - Hugging Face Token
  - X Bearer Token
  - Path to X Cookie File
  - OpenAI API Key
- Use the Record Space tab to start recording an X Space by entering the Space ID or URL.
- Use the Transcribe tab to process recorded Spaces and generate transcripts and summaries.
Contributions are encouraged (please): just open an issue or submit a PR.
This project is licensed under the MIT License.
Discord: robbie_wasabi