podscript is a tool to generate transcripts for podcasts (and other similar audio files), using LLMs and Speech-to-Text (STT) APIs.
> go install github.com/deepakjois/podscript@latest
> ~/go/bin/podscript --help
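The later examples call podscript without a path, which only works if the directory holding the binary is on PATH. A minimal sketch, assuming the default Go setup where go install places binaries in $HOME/go/bin (if GOBIN or GOPATH is set, the location will differ):

```shell
# Assumes the default install location for `go install`; adjust if GOBIN or GOPATH is set.
export PATH="$PATH:$HOME/go/bin"
```

Adding the same line to your shell profile makes the change persist across sessions.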
The configure subcommand prompts for API keys for supported services and writes them to $HOME/.podscript.toml.
> podscript configure
Alternatively, you can set keys in environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY, GROQ_API_KEY, DEEPGRAM_API_KEY.
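As a rough sketch of the environment-variable route, the snippet below exports the four variables named above and reports any that are still unset. The key values are placeholders and missing_keys is a hypothetical helper, not part of podscript. Which keys a given subcommand actually needs is not stated here, so treat the check as a convenience only.

```shell
# Placeholder values for illustration only; substitute your real keys.
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export GROQ_API_KEY="your-groq-key"
export DEEPGRAM_API_KEY="your-deepgram-key"

# missing_keys (hypothetical helper) prints each variable that is unset or empty.
missing_keys() {
  for name in OPENAI_API_KEY ANTHROPIC_API_KEY GROQ_API_KEY DEEPGRAM_API_KEY; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Prints nothing when all four variables are set.
missing_keys
```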
For podcasts on YouTube with autogenerated captions (e.g. Andrew Huberman and Cal Newport), use the ytt subcommand to download the captions from the YouTube video and feed them to an LLM to generate a clean transcript.
> podscript ytt https://www.youtube.com/watch?v=aO1-6X_f74M
It uses the gpt-4o model by default. Use the --model flag to set a different model. The following are supported:
- gpt-4o
- gpt-4o-mini
- claude-3-5-sonnet-20241022
- claude-3-5-haiku-20241022
- llama-3.3-70b-versatile
- llama-3.1-8b-instant
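One way to avoid typos in model names is to check them against the list above before calling podscript. This is only a sketch: is_supported_model and ytt_transcript are hypothetical helpers, the model list is copied from this README and may go stale, and placing the flags before the URL is an assumption about how the CLI parses its arguments.

```shell
# Models listed in this README as supported by the ytt subcommand.
SUPPORTED_MODELS="gpt-4o gpt-4o-mini claude-3-5-sonnet-20241022 claude-3-5-haiku-20241022 llama-3.3-70b-versatile llama-3.1-8b-instant"

# is_supported_model (hypothetical helper) succeeds if $1 is in the list above.
is_supported_model() {
  for m in $SUPPORTED_MODELS; do
    [ "$m" = "$1" ] && return 0
  done
  return 1
}

# ytt_transcript (hypothetical wrapper): $1 is the YouTube URL,
# $2 is the model name (defaults to gpt-4o, the documented default).
ytt_transcript() {
  url=$1
  model=${2:-gpt-4o}
  if ! is_supported_model "$model"; then
    echo "unsupported model: $model" >&2
    return 1
  fi
  # Writes the cleaned transcript to transcript.txt via the -o flag.
  podscript ytt --model "$model" -o transcript.txt "$url"
}
```

With the functions defined, a call might look like this:

> ytt_transcript https://www.youtube.com/watch?v=aO1-6X_f74M claude-3-5-haiku-20241022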
If the podcast (or other audio) is not available on YouTube, or you need higher quality transcripts, you can send the audio to a Speech-to-Text (STT) API.
podscript supports the following STT APIs:
- Deepgram (which as of Jan 2025 provides $200 in free signup credit!)
- Groq (which as of Jul 2024 is in beta and free to use within your rate limits)
- AssemblyAI (which as of Oct 2024 is free to use within your credit limits, with $50 in free credits on signup)
The Deepgram and AssemblyAI subcommands support --from-url for passing audio URLs and --from-file for passing audio files. Groq only supports audio files.
All the subcommands support the -o flag to write the output to a text file. Other options, such as setting the model or dumping the full API response, are provided where available.
> podscript deepgram --from-url https://audio.listennotes.com/e/p/d6cc86364eb540c1a30a1cac2b77b82c/
> podscript groq --file huberman.mp3
> podscript assembly-ai --from-url https://audio.listennotes.com/e/p/d6cc86364eb540c1a30a1cac2b77b82c/ -o transcript.txt
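Since the Deepgram and AssemblyAI subcommands take either a URL or a local file, a small wrapper can pick the right flag from the input. The sketch below is illustrative only: source_flag and transcribe are hypothetical helpers, and the default output name transcript.txt is an assumption. Groq is left out because its example above uses a different flag.

```shell
# source_flag (hypothetical helper) prints the input flag that matches the argument:
# --from-url for http(s) URLs, --from-file for everything else.
source_flag() {
  case $1 in
    http://*|https://*) echo "--from-url" ;;
    *) echo "--from-file" ;;
  esac
}

# transcribe (hypothetical wrapper): $1 is deepgram or assembly-ai,
# $2 is the audio URL or file, $3 is the optional output path.
transcribe() {
  podscript "$1" "$(source_flag "$2")" "$2" -o "${3:-transcript.txt}"
}
```

For example:

> transcribe deepgram huberman.mp3 huberman.txt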