Merge pull request #417 from apivideo/Add-video-transcript-feature
Add video transcript feature
szekelyzol authored Oct 9, 2024
2 parents 27da038 + 6417550 commit 49b9043
Showing 4 changed files with 240 additions and 7 deletions.
101 changes: 99 additions & 2 deletions openapi.yaml
@@ -271,6 +271,8 @@ paths:
playerId: pl45KFKdlddgk654dspkze
title: Maths video
description: An amazing video explaining the string theory
language: 'en'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -299,6 +301,8 @@ paths:
- videoId: vi4blUQJFrYWbaG44NChkH27
title: Video Title
description: A description for your video.
language: 'en'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -329,6 +333,8 @@ paths:
playerId: pl45KFKdlddgk654dspkze
title: My Video Title
description: A brief description of the video.
language: 'fr'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -564,6 +570,8 @@ paths:
videoId: vi4blUQJFrYWbaG44NChkH27
title: Maths video
description: An amazing video explaining the string theory
language: 'en'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -628,7 +636,7 @@ paths:
schema:
$ref: '#/components/schemas/bad-request'
examples:
response:
Attribute required:
value:
type: 'https://docs.api.video/reference/attribute-required'
title: This attribute is required.
@@ -647,6 +655,22 @@ paths:
- type: 'https://docs.api.video/reference/invalid-attribute'
title: This attribute must be an array.
name: metadata
Invalid language formatting:
description: This error occurs when the language tag you provided contains characters other than letters and dashes.
value:
type: https://docs.api.video/reference/invalid-attribute
title: An attribute is invalid.
status: 400
detail: The "language" attribute must contain only letters and dashes (for example "fr", "fr-BE").
name: language
Invalid language:
description: This error occurs when the language tag you provided does not match any supported language.
value:
type: https://docs.api.video/reference/invalid-attribute
title: An attribute is invalid.
status: 400
detail: The "language" attribute is not valid.
name: language
'429':
headers:
X-RateLimit-Limit:
@@ -2713,6 +2737,8 @@ paths:
playerId: pl45KFKdlddgk654dspkze
title: Maths video
description: An amazing video explaining string theory
language: 'en'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -3235,6 +3261,8 @@ paths:
playerId: pl45KFKdlddgk654dspkze
title: Maths video
description: An amazing video explaining the string theory
language: 'en'
languageOrigin: 'api'
public: false
panoramic: false
mp4Support: true
@@ -3280,7 +3308,7 @@ paths:
schema:
$ref: '#/components/schemas/bad-request'
examples:
response:
Invalid attribute:
value:
type: 'https://docs.api.video/reference/invalid-attribute'
title: This attribute must be a ISO-8601 date.
@@ -3296,6 +3324,22 @@ paths:
- type: 'https://docs.api.video/reference/invalid-attribute'
title: This attribute must be an array.
name: metadata
Invalid language formatting:
description: This error occurs when the language tag you provided contains characters other than letters and dashes.
value:
type: https://docs.api.video/reference/invalid-attribute
title: An attribute is invalid.
status: 400
detail: The "language" attribute must contain only letters and dashes (for example "fr", "fr-BE").
name: language
Invalid language:
description: This error occurs when the language tag you provided does not match any supported language.
value:
type: https://docs.api.video/reference/invalid-attribute
title: An attribute is invalid.
status: 400
detail: The "language" attribute is not valid.
name: language
'404':
headers:
X-RateLimit-Limit:
@@ -5675,6 +5719,7 @@ paths:
playerId: pl45KFKdlddgk654dspkze
title: Maths video
description: An amazing video explaining the string theory
language: 'en'
public: false
panoramic: false
tags:
@@ -14686,6 +14731,18 @@ components:
discarded:
type: boolean
description: Returns `true` for videos you discarded when you have the Video Restore feature enabled. Returns `false` for every other video.
language:
type: string
description: Returns the language of a video in [IETF language tag](https://en.wikipedia.org/wiki/IETF_language_tag) format. You can set the language during video creation via the API, otherwise it is detected automatically.
languageOrigin:
type: string
enum: [api, auto]
nullable: true
description: |-
Returns the origin of the last update on the video's `language` attribute.

- `api` means that the last update was requested from the API.
- `auto` means that the last update was done automatically by the API.
tags:
type: array
description: |
@@ -14730,6 +14787,8 @@ components:
videoId: vi4k0jvEUuaTdRAEjQ4Jfrgz
title: Maths video
description: An amazing video explaining the string theory
language: 'en'
languageOrigin: 'api'
tags:
- maths
- string theory
@@ -15323,6 +15382,8 @@ components:
panoramic: false
mp4Support: true
playerId: pl45KFKdlddgk654dspkze
language: en
transcript: true
tags:
- maths
- string theory
@@ -15386,6 +15447,23 @@ components:
$ref: '#/components/schemas/video-clip'
watermark:
$ref: '#/components/schemas/video-watermark'
language:
type: string
enum: [ar, ca, cs, da, de, el, en, es, fa, fi, fr, he, hi, hr, hu, it, ja, ko, ml, nl, nn, no, pl, pt, ru, sk, sl, te, tr, uk, ur, vi, zh]
example: fr
description: |-
Use this parameter to set the language of the video. When this parameter is set, the API creates a transcript of the video using the language you specify. You must use the [IETF language tag](https://en.wikipedia.org/wiki/IETF_language_tag) format.

`language` is a permanent attribute of the video. You can update it to another language using the [`PATCH /videos/{videoId}`](https://docs.api.video/reference/api/Videos#update-a-video-object) operation. This triggers the API to generate a new transcript using a different language.
transcript:
type: boolean
description: |-
Use this parameter to enable transcription.

- When `true`, the API generates a transcript for the video.
- The default value is `false`.
- If you define a video language using the `language` parameter, the API uses that language to transcribe the video. If you do not define a language, the API detects it based on the video.
- When the API generates a transcript, it will be available as a caption for the video.
required:
- title
video-upload-payload:
@@ -15467,11 +15545,30 @@ components:
description: 'A list (array) of dictionaries where each dictionary contains a key value pair that describes the video. As with tags, you must send the complete list of metadata you want as whatever you send here will overwrite the existing metadata for the video.'
items:
$ref: '#/components/schemas/metadata'
language:
type: string
enum: [ar, ca, cs, da, de, el, en, es, fa, fi, fr, he, hi, hr, hu, it, ja, ko, ml, nl, nn, no, pl, pt, ru, sk, sl, te, tr, uk, ur, vi, zh]
example: fr
description: |-
Use this parameter to set the language of the video. When this parameter is set, the API creates a transcript of the video using the language you specify. You must use the [IETF language tag](https://en.wikipedia.org/wiki/IETF_language_tag) format.

`language` is a permanent attribute of the video. You can update it to another language using the [`PATCH /videos/{videoId}`](https://docs.api.video/reference/api/Videos#update-a-video-object) operation. This triggers the API to generate a new transcript using a different language.
transcript:
type: boolean
description: |-
Use this parameter to enable transcription.

- When `true`, the API generates a transcript for the video.
- The default value is `false`.
- If you define a video language using the `language` parameter, the API uses that language to transcribe the video. If you do not define a language, the API detects it based on the video.
- When the API generates a transcript, it will be available as a caption for the video.
example:
playerId: pl45KFKdlddgk654dspkze
title: String theory
description: An amazing video explaining the string theory
public: false
language: 'en'
transcript: true
panoramic: false
mp4Support: true
tags:
1 change: 1 addition & 0 deletions vod/add-captions.md
@@ -732,5 +732,6 @@ WEBVTT

## Tutorials & Resources

- [How to create captions using api.video's AI-driven transcription feature](/vod/generate-transcripts)
- [How to programmatically add captions to your video](https://api.video/blog/tutorials/how-to-add-captions-to-your-videos/)
- [Free online VTT creator](https://www.vtt-creator.com/)
133 changes: 133 additions & 0 deletions vod/generate-transcripts.md
@@ -0,0 +1,133 @@
---
title: Generate transcripts for videos via the API
meta:
description: This page shows you how to enable automatic transcription for videos in multiple languages using the Videos endpoint.
---

# Generating video transcripts

api.video's AI-driven transcription feature can generate video transcripts using a single API call. You can now avoid manually uploading captions, as the generated transcripts are available as video captions in the standard WebVTT format.

Transcripts give your audience a seamless experience regardless of their language or location, and make your content more inclusive and accessible for deaf or hard-of-hearing users.

## How to generate transcripts

To enable transcription, set these **optional** parameters when you create a video object using a `POST` request to the [Create video object endpoint](/reference/api/Videos#create-a-video-object):

| Field | Type | Description |
|--------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `transcript` | `boolean` | When `true`, the API generates a transcript for the video. The default value is `false`. |
| `language` | `string` | A valid language identifier using [IETF language tags](https://en.wikipedia.org/wiki/IETF_language_tag). You can use primary subtags like `en` or `fr`.<br/><br/>When the value in your request does not match any supported language, the API returns an error. |

When you set `transcript` to `true`, the API generates a transcript for the video. If you also set the video `language`, the API transcribes the video in that language. If you do not specify a language, the API detects it automatically.

<Callout pad="2" type="info">

If you do not set the `language` parameter, the API analyzes the first 30 seconds of the video to determine the language. When this analysis does not produce a confident result, for example because of low-quality audio, the API does not generate a transcript.
</Callout>
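
As a minimal sketch of the request described above, the Python snippet below builds (but does not send) a `POST /videos` request with transcription enabled. The base URL `https://ws.api.video`, the bearer-token header, and the helper name `build_create_video_request` are assumptions made for illustration, so check them against your own account setup before relying on them.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL; replace it with the one for your environment.
API_BASE = "https://ws.api.video"


def build_create_video_request(
    token: str,
    title: str,
    transcript: bool = False,
    language: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST /videos request; `language` is omitted unless given."""
    payload = {"title": title, "transcript": transcript}
    if language is not None:
        # Ask the API to transcribe in this language instead of detecting it.
        payload["language"] = language
    return urllib.request.Request(
        url=f"{API_BASE}/videos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_video_request(
    "YOUR_ACCESS_TOKEN", "An awesome video", transcript=True
)
# Pass `request` to urllib.request.urlopen(...) to actually send it.
```

Leaving `language` out of the payload, rather than sending an empty value, keeps the automatic detection path available.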

To help you understand how a video's language was defined, the API returns the `languageOrigin` attribute in the response when you create a video object:

| Field | Type | Description |
|--------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `languageOrigin` | `string` | Returns the origin of the last update on the video's `language` attribute.<br/><br/>The possible values are: `api` and `auto`.<br/><br/>- `api` means that the last update was requested from the API.<br/>- `auto` means that the last update was done automatically by the API. |
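
One way to use `languageOrigin` on the client side is sketched below in Python. It assumes the video response has already been parsed into a dictionary; the helper name `describe_language_source` and the returned wording are hypothetical, not part of the API.

```python
from typing import Optional


def describe_language_source(video: dict) -> str:
    """Summarize how a video's language was set, from a parsed video response."""
    language: Optional[str] = video.get("language")
    origin: Optional[str] = video.get("languageOrigin")  # "api", "auto", or None
    if language is None or origin is None:
        return "language not set"
    if origin == "api":
        return f"{language} (set through an API request)"
    if origin == "auto":
        return f"{language} (detected automatically)"
    # Defensive branch for values this sketch does not know about.
    return f"{language} (unrecognized origin: {origin})"


print(describe_language_source({"language": "en", "languageOrigin": "api"}))
```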

## About the `language` parameter

When you set the `language` parameter, make sure that it matches the actual language used in the video. Your setting forces the API to transcribe in that language. A mismatched language setting, or a video with dialogue in multiple languages, can result in low-quality transcripts.

`language` is a permanent attribute of a video object. You can update it to another language using the [`PATCH /videos/{videoId}`](/reference/api/Videos#update-a-video-object) operation. This triggers the API to generate a new transcript in the new language.
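
The update can be sketched in Python as follows. The snippet only builds the `PATCH /videos/{videoId}` request and does not send it. The base URL, the bearer-token header, and the helper name `build_update_language_request` are assumptions for illustration; the video ID is a sample value.

```python
import json
import urllib.request

# Assumed base URL; replace it with the one for your environment.
API_BASE = "https://ws.api.video"


def build_update_language_request(
    token: str, video_id: str, language: str
) -> urllib.request.Request:
    """Build a PATCH /videos/{videoId} request that changes the video language."""
    return urllib.request.Request(
        url=f"{API_BASE}/videos/{video_id}",
        data=json.dumps({"language": language}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_update_language_request(
    "YOUR_ACCESS_TOKEN", "vi4blUQJFrYWbaG44NChkH27", "fr"
)
# Pass `request` to urllib.request.urlopen(...) to actually send it.
```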

When the API generates a transcript, it is available as a caption for your video. Your audience can select it during video playback. You can interact with captions generated through transcription using the [Captions endpoints](/reference/api/Captions).

## Examples

Transcribe a video and let the API detect the language, using `POST /videos`:

```json
{
  "title": "An awesome video",
  "transcript": true
}
```

Transcribe a video and force the source language to English, using `POST /videos`:

```json
{
  "title": "Yet another awesome video",
  "language": "en",
  "transcript": true
}
```

Generate a transcript for a previously uploaded video, using `PATCH /videos/{videoId}`:

```json
{
  "language": "en",
  "transcript": true
}
```

Set a default language for a new video, using `POST /videos`:

```json
{
  "title": "Another awesome video",
  "language": "en"
}
```

Set a default language for an existing video, using `PATCH /videos/{videoId}`:

```json
{
  "language": "fr"
}
```

## Supported languages

Transcription is currently available for these languages:

| Language | IETF Language Tag |
|----------|-------------------|
| Arabic | `ar` |
| Catalan | `ca` |
| Czech | `cs` |
| Danish | `da` |
| German | `de` |
| Greek | `el` |
| English | `en` |
| Spanish | `es` |
| Persian | `fa` |
| Finnish | `fi` |
| French | `fr` |
| Hebrew | `he` |
| Hindi | `hi` |
| Croatian | `hr` |
| Hungarian | `hu` |
| Italian | `it` |
| Japanese | `ja` |
| Korean | `ko` |
| Malayalam | `ml` |
| Dutch | `nl` |
| Norwegian Nynorsk | `nn` |
| Norwegian | `no` |
| Polish | `pl` |
| Portuguese | `pt` |
| Russian | `ru` |
| Slovak | `sk` |
| Slovenian | `sl` |
| Telugu | `te` |
| Turkish | `tr` |
| Ukrainian | `uk` |
| Urdu | `ur` |
| Vietnamese | `vi` |
| Chinese | `zh` |
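
If you want to catch an unsupported tag before sending a request, a client-side pre-check along these lines may help. This is a sketch, not the API's own validation: the set is copied from the table above, the letters-and-dashes rule is an assumption based on the API's `language` error message, and the helper name `check_language` is hypothetical.

```python
import re

# Tags copied from the supported languages table.
SUPPORTED_LANGUAGES = frozenset(
    "ar ca cs da de el en es fa fi fr he hi hr hu it ja ko ml nl nn no pl pt ru "
    "sk sl te tr uk ur vi zh".split()
)

# Assumed format rule: only letters and dashes, for example "fr" or "fr-BE".
_TAG_FORMAT = re.compile(r"[A-Za-z-]+")


def check_language(tag: str) -> str:
    """Return "ok", "invalid format", or "not listed" for a language tag."""
    if not _TAG_FORMAT.fullmatch(tag):
        return "invalid format"
    if tag not in SUPPORTED_LANGUAGES:
        return "not listed"
    return "ok"


for candidate in ("fr", "fr_BE", "xx"):
    print(candidate, "->", check_language(candidate))
```

The check is deliberately strict: a tag with a region subtag such as `fr-BE` passes the format rule but is reported as `not listed`, because only primary subtags appear in the table. Whether the API accepts such tags is not something this sketch can tell you.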

## Next steps

- Learn more about the WebVTT format and managing captions in the [Captions guide](/vod/add-captions)
- Check out other AI-driven features like video translation and AI summary [in our blog](https://api.video/blog/product-updates/ai-video-features/)
12 changes: 7 additions & 5 deletions vod/navigation.yaml
@@ -87,9 +87,11 @@

- heading: Annotations & branding
items:
- label: Adding watermarks
- label: Transcripts
href: /vod/generate-transcripts.md
- label: Captions
href: /vod/add-captions.md
- label: Watermarks
href: /vod/add-a-permanent-watermark.md
- label: Creating and managing chapters
href: /vod/creating-and-managing-chapters.md
- label: Adding captions
href: /vod/add-captions.md
- label: Chapters
href: /vod/creating-and-managing-chapters.md
