Twitter Space Mechanics
This page attempts to explain how Twitter Spaces operate, how we are able to download them, and how they work under the hood.
Twitter Spaces are basically audio-centric chatrooms on Twitter. They are a 'space' for users to chat with other like-minded users, have a conversation, hang out, etc. I think you get the basic idea.
When a Twitter Space is created, users have two choices:
- Archived: The audio from the Space is saved and available for replay at a later date. Everything is preserved.
- Unarchived: The audio from the Space is deleted from Twitter's servers after the Space ends, and users are unable to replay it afterwards. However, as we'll see momentarily, this is easy to work around.
The transmission of audio in a Twitter Space is achieved via a set of M3U8 playlists. M3U8 is a playlist format used to catalogue and keep track of multimedia files for streaming over the internet. Twitter Spaces use these playlists to transmit short, 3-second-long audio files from the host to the listener. Each audio chunk is encoded in AAC format and occasionally has ID3 errors that prevent concatenation with FFmpeg, so we have to 'clean' the files before we concatenate them. We use Python's mutagen library for this: we strip all of the ID3 tags from the chunks, then use Python's subprocess module to run FFmpeg and concatenate the AAC files into one long, coherent file.
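The clean-then-concatenate step can be sketched roughly as follows. This is only an illustration, not the project's actual code: it swaps mutagen for a small standard-library function that removes ID3v2 tags at the start of a chunk, and it assumes FFmpeg's concat demuxer is used for joining. The function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def strip_leading_id3v2(data: bytes) -> bytes:
    """Return the chunk bytes with any ID3v2 tags at the very start removed."""
    while len(data) >= 10 and data[:3] == b"ID3":
        flags = data[5]
        # The tag size is a "synchsafe" integer: only 7 bits of each byte are used.
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        # 10-byte header + tag body + optional 10-byte footer (flag bit 0x10).
        total = 10 + size + (10 if flags & 0x10 else 0)
        data = data[total:]
    return data


def concat_list_text(chunk_paths: Sequence[str]) -> str:
    """Build the text of a list file for FFmpeg's concat demuxer.

    Assumes the paths contain no single quotes (they would need escaping).
    """
    return "".join(f"file '{path}'\n" for path in chunk_paths)


def ffmpeg_concat_command(list_file: str, output_file: str) -> List[str]:
    """Build an FFmpeg command that joins the listed chunks without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output_file]


def clean_and_concatenate(chunk_paths: Sequence[str], output_file: str) -> None:
    """Strip ID3v2 tags from each chunk in place, then run FFmpeg (must be on PATH)."""
    for path in chunk_paths:
        chunk = Path(path)
        chunk.write_bytes(strip_leading_id3v2(chunk.read_bytes()))
    list_file = Path("chunks.txt")  # hypothetical scratch file name
    list_file.write_text(concat_list_text(chunk_paths), encoding="utf-8")
    subprocess.run(ffmpeg_concat_command(str(list_file), output_file), check=True)
```

A call such as `clean_and_concatenate(["chunk_0.aac", "chunk_1.aac"], "space.aac")` would then produce a single file. With mutagen the stripping step would be done by the library instead, which also handles tag layouts this sketch ignores.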
Twitter Spaces are accompanied by two m3u8 playlists:
- Master Playlist: This playlist doesn't actually contain the direct links to the audio files (yet). Instead, it directs us to what I like to call the 'true master playlist'. This 'true master playlist' is the m3u8 that lists every chunk for the Twitter Space.
- Dynamic Playlist: Dynamic playlists function differently than master playlists do. Instead of directing us to a playlist containing every chunk in the Twitter Space, they only provide a few chunks at a time, hence 'dynamic'. We are still given a link that directs us to the playlist; however, the link changes dynamically as the Twitter Space progresses. This playlist gets terminated when the Space ends.
NOTE: Even after a Space ends, if it was created as an archived Space, we are still able to download it because the 'true master playlist' is preserved, regardless of archival state.
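To make the playlist idea a bit more concrete, here is a minimal sketch of pulling chunk URLs out of an m3u8 playlist's text. It relies only on the general m3u8 convention that lines starting with `#` are tags or comments and every other non-empty line is a URI. The sample playlist, the chunk names, and the `example.invalid` URL are made up for illustration and are not what Twitter actually serves.

```python
from typing import List
from urllib.parse import urljoin


def chunk_urls(playlist_text: str, playlist_url: str) -> List[str]:
    """Return the absolute URL of every entry listed in an m3u8 playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Skip blank lines and m3u8 tags/comments, which all start with '#'.
        if not line or line.startswith("#"):
            continue
        # Entries may be relative, so resolve them against the playlist's own URL.
        urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist content with two 3-second chunks.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:3
#EXTINF:3.000,
chunk_0.aac
#EXTINF:3.000,
chunk_1.aac
"""

if __name__ == "__main__":
    for url in chunk_urls(SAMPLE_PLAYLIST, "https://example.invalid/space/playlist.m3u8"):
        print(url)
```

For a dynamic playlist, the same function would have to be called repeatedly while the Space is live, since only a few chunks are listed at any one time.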
To actually get these playlists from a Twitter Space, we must initiate a series of API requests that revolve around the media_key field in the Space's JSON data. The JSON that is returned provides us with the following:
- dynamic url: A URL to the dynamic playlist
- no redirect dynamic url: I'm not too sure what the purpose of this is, but it seems to be the same as the dynamic playlist URL.
- session id: An ID for your current session
- chat token: A JWS token used to retrieve the chat from the Twitter Space
There is other information in the response, but this is the bulk of it. We replace 'dynamic' with 'master' in the dynamic URL to obtain the master playlist URL. We then perform a final request, using the partial URL provided in the master playlist, to retrieve the 'true master playlist'.
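The URL juggling described above might look something like the sketch below. The URLs and the master playlist content are invented placeholders (the real ones come from the API response), and the function names are hypothetical. It assumes that 'dynamic' appears in the path of the dynamic URL and that the master playlist lists the 'true master playlist' as its first non-tag line.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def master_url_from_dynamic(dynamic_url: str) -> str:
    """Swap 'dynamic' for 'master' in the URL path, leaving host and query untouched."""
    parts = urlsplit(dynamic_url)
    new_path = parts.path.replace("dynamic", "master")
    return urlunsplit(parts._replace(path=new_path))


def true_master_url(master_text: str, master_url: str) -> str:
    """Resolve the partial URL found in the master playlist against the master URL."""
    for line in master_text.splitlines():
        line = line.strip()
        # The first line that is not blank and not a '#' tag is taken as the partial URL.
        if line and not line.startswith("#"):
            return urljoin(master_url, line)
    raise ValueError("no playlist entry found in master playlist")


# Placeholder values for illustration only.
DYNAMIC_URL = "https://example.invalid/audio-space/dynamic_playlist.m3u8?type=live"
MASTER_TEXT = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=256000
/hls/abc/playlist_1.m3u8
"""

if __name__ == "__main__":
    master = master_url_from_dynamic(DYNAMIC_URL)
    print(master)
    print(true_master_url(MASTER_TEXT, master))
```

Fetching `master` and then the resolved URL would each be an ordinary HTTP GET. Those requests are left out here because the exact headers they need are not covered on this page.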