podscript

podscript is a tool to generate transcripts for podcasts (and other similar audio files), using LLMs and Speech-to-Text (STT) APIs.

Install

> go install github.com/deepakjois/podscript@latest

> ~/go/bin/podscript --help
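The second command above calls the binary by its full path, while the later examples invoke podscript directly, which only works if the install directory is on your PATH. A minimal sketch, assuming the default Go install location of $HOME/go/bin (if GOBIN or GOPATH is set, the binary lands elsewhere):

```shell
# Assumption: go install placed the binary in the default $HOME/go/bin.
# Add that directory to PATH for the current shell session.
export PATH="$PATH:$HOME/go/bin"
```

To make this permanent, add the same line to your shell's startup file.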

Configure

The following command prompts you for API keys for the supported services and writes them to $HOME/.podscript.toml.

> podscript configure

Alternatively, you can set the keys in environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY, GROQ_API_KEY, DEEPGRAM_API_KEY.
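As a rough sketch of the environment-variable route, the snippet below exports two of the variables and then reports which of the four named above are visible in the current shell. The key values are placeholders, not real keys, and the check loop is only an illustration; it is not part of podscript.

```shell
# Placeholder values for illustration only; substitute your own keys.
export OPENAI_API_KEY="your-openai-key"
export DEEPGRAM_API_KEY="your-deepgram-key"

# Report which of the variables listed in this README are set.
missing=""
for name in OPENAI_API_KEY ANTHROPIC_API_KEY GROQ_API_KEY DEEPGRAM_API_KEY; do
  if [ -n "$(printenv "$name")" ]; then
    echo "$name is set"
  else
    echo "$name is not set"
    missing="$missing $name"
  fi
done
```

You only need keys for the services you plan to use.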

Transcript from YouTube videos

For podcasts on YouTube with autogenerated captions (e.g. Andrew Huberman and Cal Newport), use the ytt subcommand to download the captions from the YouTube video and pass them to an LLM, which generates a clean transcript.

> podscript ytt https://www.youtube.com/watch?v=aO1-6X_f74M

It uses the gpt-4o model by default. Use the --model flag to set a different model. The following are supported:

  • gpt-4o
  • gpt-4o-mini
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • llama-3.3-70b-versatile
  • llama-3.1-8b-instant
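To illustrate the --model flag, the sketch below prints a ytt command for two of the models listed above instead of running them. The ytt_cmd helper is hypothetical, and placing --model before the URL is an assumption; this README only states that the flag exists.

```shell
# Hypothetical helper: build (but do not run) a ytt command line.
# Assumption: --model may be given before the video URL.
ytt_cmd() {
  printf 'podscript ytt --model %s %s\n' "$1" "$2"
}

url="https://www.youtube.com/watch?v=aO1-6X_f74M"
for model in gpt-4o-mini claude-3-5-haiku-20241022; do
  ytt_cmd "$model" "$url"
done
```

Copy one of the printed lines into your shell to run it with the chosen model.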

Transcript from audio URLs and files

If the podcast (or other audio) is not available on YouTube, or you need higher quality transcripts, you can send the audio to a Speech-to-Text (STT) API.

podscript supports the following STT APIs:

  • Deepgram (as of Jan 2025, provides a $200 free signup credit)
  • Groq (as of Jul 2024, in beta and free to use within your rate limits)
  • AssemblyAI (as of Oct 2024, free to use within your credit limits, with $50 in free credits on signup)

Tip

You can find the audio download link for a podcast on ListenNotes under the More menu.

Example Usage

The Deepgram and AssemblyAI subcommands support --from-url for passing audio URLs and --from-file for passing audio files. Groq only supports audio files.

All the subcommands support the -o flag to write the output to a text file. Where available, other options let you set the model or dump the full API response.

Deepgram

> podscript deepgram --from-url https://audio.listennotes.com/e/p/d6cc86364eb540c1a30a1cac2b77b82c/

Groq

> podscript groq --file huberman.mp3

AssemblyAI

> podscript assembly-ai --from-url https://audio.listennotes.com/e/p/d6cc86364eb540c1a30a1cac2b77b82c/ -o transcript.txt
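The --from-file and -o flags described above can be combined to transcribe several local files in one go. The following is a minimal sketch, not a feature of podscript itself: transcribe_all is a hypothetical helper, the file names are made up, and the commands are only printed so you can review them first.

```shell
# Hypothetical helper: print one deepgram command per audio file.
# Remove the leading "echo" to actually invoke podscript.
transcribe_all() {
  for audio in "$@"; do
    out="${audio%.*}.txt"   # episode1.mp3 -> episode1.txt
    echo podscript deepgram --from-file "$audio" -o "$out"
  done
}

# Example file names; replace with your own audio files.
transcribe_all episode1.mp3 episode2.mp3
```

Each transcript is written next to its audio file, with the extension replaced by .txt.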

Feedback

Feel free to drop me a note on X or email me.

License

MIT
