WebVTT, Text and SRT support #6

Open
nodomain opened this issue Jan 14, 2023 · 2 comments

Comments

@nodomain
Contributor

Are there plans yet to support other output formats like WebVTT, plain text or SRT?

I am still digging through the solution and thinking about adding a converter, but I am not sure about the correct approach.

@eoinsha
Collaborator

eoinsha commented Jan 16, 2023

Yes, this is definitely something we have discussed and would help improve things like YouTube subtitles.
AWS Transcribe supports generation of WebVTT and SRT subtitles already. Of course, we would prefer to use Whisper segments instead of Transcribe for the subtitle text. The challenge here is that the timing granularity of Whisper output is not as fine as with Transcribe, so you end up with longish segments for each timestamp. This may not be desirable for many subtitle uses.

It may be possible to improve the merging algorithm so that it maps the Transcribe timings onto the Whisper output, but that does not seem trivial.

Perhaps the simplest thing initially is to just generate VTT or SRT from the Whisper output. This could be done in a separate Lambda function using the merged transcript output (after the Process Transcripts state).
It could use the JSON object located at processedTranscriptKey as its input.
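
As a rough illustration of that last idea, here is a minimal Python sketch of the conversion step only. It assumes (this is not confirmed anywhere in this thread) that the processed transcript contains a list of segments, each with `start` and `end` in seconds and a `text` string. The function names `format_timestamp`, `to_srt` and `to_vtt` are made up for the example, and reading the object at processedTranscriptKey inside a Lambda handler is left out.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (SRT uses ',', WebVTT uses '.')."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Build SRT text: numbered cues separated by blank lines.

    Assumed segment shape (hypothetical): {"start": float, "end": float, "text": str}
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[dict]) -> str:
    """Build WebVTT text: a WEBVTT header line followed by unnumbered cues."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        blocks.append(f"{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

Each Whisper segment becomes one cue here, so the longish segments mentioned above would carry over unchanged into the subtitles. Splitting them into shorter cues would need extra logic on top of this.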

If you are interested in contributing this feature, that would be very welcome, @nodomain. We are happy to review and support it, of course.

@nodomain
Contributor Author

Looking into it already :)
