Captionator has experimental support for HTML5 video and audio tracks, designed both for assistive purposes and for enriching existing media.
Some use cases for additional audio and video tracks might include:
- Director's commentary
- Sign language picture-in-picture video
- Audio description for blind or vision-impaired users
- Alternate video angle (useful for concerts, etc.)
- Alternate video track for a conference, with the slides
- Video track showing storyboards (for movie special features)
Captionator allows you to make use of this additional media without writing infinity-million lines of code.
Caveat: This stuff is totally non-standard. I'll be pushing for an implementation similar to this one, and will adjust Captionator to mimic any standards which are drafted, but for the time being, I have to invent how this should work!
Rationale: It's important to provide a way of including and manipulating out-of-band media tracks (and in-band ones, but I can't do that with JS), including, but not limited to, the use cases mentioned above. I think that providing an implementation as close as possible to the current TextTrack draft is a good idea: much of the TextTrack implementation philosophy works for media tracks too, and keeping things consistent reduces confusion and the learning curve, and provides greater opportunities to integrate the two APIs later down the track. (No pun intended!)
##The long and short of it##
<track kind="audiodescription" src="audio/audiodescription-en.wav" type="audio/wav" srclang="en" label="English Descriptive Audio" />
<track kind="audiodescription" src="audio/audiodescription-ja.wav" type="audio/wav" srclang="ja" label="Japanese Descriptive Audio" />
<track kind="commentary" src="audio/directorscommentary-en.wav" type="audio/wav" srclang="en" label="Director's Commentary" />
<track kind="alternate" src="video/storyboards.ogv" type="video/ogg" srclang="en" label="Storyboards" />
<track kind="signlanguage" src="video/signlanguage.ogv" type="video/ogg" srclang="en" label="Sign Language" />
Essentially, Captionator provides media track support in much the same way it provides text track support - through the `<track>` element. Both the way it interprets the MediaTrack `<track>` elements and the API it provides to manipulate them are subtly different, though based around the same principles and ideas.
Rather than exposing these elements through the `.track` property on `HTMLVideoElement` objects, Captionator puts them in a very similar property called `mediaTracks`, which contains `MediaTrack` objects instead of `TextTrack` objects. The reasons for this are twofold:
- Because the MediaTrack implementation is non-standard, Captionator endeavours to keep it separate from the standards-based `TextTrack` API (while keeping its implementation as similar to the `TextTrack` API as possible).
- Developers may not necessarily want to see `MediaTrack` objects in the track list when they're expecting `TextTrack` objects. Keeping the two types separate makes it easier to loop through and parse each type of track. If you want to see `MediaTrack` objects, you just look in the `.mediaTracks` property instead.
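As a rough sketch of why the separation is handy, a helper that collects the media tracks of one kind might look like the following. It assumes only what this document describes (a `mediaTracks` collection whose entries expose `kind` and `language`); `findMediaTracks` is a hypothetical name, not part of Captionator.

```javascript
// Hypothetical helper (not part of Captionator): collect the MediaTrack
// objects of a given kind from an element's mediaTracks collection.
function findMediaTracks(video, kind) {
  // Assumption: mediaTracks is array-like. It may be absent if Captionator
  // has not processed this element, so fall back to an empty list.
  var tracks = Array.prototype.slice.call(video.mediaTracks || []);
  return tracks.filter(function (track) {
    return track.kind === kind;
  });
}

// Usage, in a page where Captionator has already run:
// var descriptions = findMediaTracks(myVideo, "audiodescription");
```

Because text tracks never appear in `mediaTracks`, the filter only has to look at `kind`.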
Many of the properties of these tracks are the same, though:
- `label` - String - describes the track (in plain human language)
- `language` - BCP47 language string which describes the track
- `kind` - Resource type (one of `audiodescription`, `commentary`, `alternate`, `signlanguage`)
- `mode` - the most important property (probably!) - determines whether Captionator will fetch and render the resource
- `readyState` - indicates whether the resource is loaded (one of NONE/0, LOADING/1, LOADED/2, or ERROR/3)
- `videoNode` - the `HTMLVideoElement` which the track relates to/extends. (Not in the WHATWG spec.)
The `MediaTrack` object has some extensions on this basic set:
- `mediaElement` - The audio or video element responsible for displaying the MediaTrack itself
- `type` - The MIME type of the element
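To make the property list a little more concrete, here is a small sketch that turns a track into a one-line summary, which can be handy when debugging. The property names and `readyState` values are the ones listed above; `describeMediaTrack` and `READY_STATE_NAMES` are hypothetical names invented for this example.

```javascript
// readyState values as listed above: NONE/0, LOADING/1, LOADED/2, ERROR/3.
var READY_STATE_NAMES = ["NONE", "LOADING", "LOADED", "ERROR"];

// Hypothetical helper (not part of Captionator): build a one-line summary
// of a MediaTrack-like object from the properties described above.
function describeMediaTrack(track) {
  var state = READY_STATE_NAMES[track.readyState] || "UNKNOWN";
  return track.label + " [" + track.kind + ", " + track.language + ", " +
    track.type + "] " + state;
}

// Usage, in a page where Captionator has already run:
// console.log(describeMediaTrack(myVideo.mediaTracks[0]));
```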
Captionator will automatically sync `MediaTrack` objects to the playback of your master media element, prioritising their display according to whether you have enabled them or not.
Essentially, enabling and disabling them works the same way as it does for regular `TextTrack` objects:
myVideo.mediaTracks[2].mode = 2; // SHOWING (Either video is visible or audio is audible)
myVideo.mediaTracks[2].mode = 1; // HIDDEN (Elements are not visible or audible)
myVideo.mediaTracks[2].mode = 0; // OFF
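Hard-coded indexes like `[2]` get brittle quickly, so in practice you might pick the track by kind and language instead. The following is a minimal sketch under the assumptions already stated in this document (array-like `mediaTracks`, the three mode values above); `showOnly` and the constant names are hypothetical and not part of Captionator.

```javascript
// Mode values as used in the example above.
var OFF = 0, HIDDEN = 1, SHOWING = 2;

// Hypothetical helper (not part of Captionator): show the first track that
// matches kind + language and switch off every other track of that kind.
// Returns the track that was shown, or null if nothing matched.
function showOnly(video, kind, language) {
  var shown = null;
  Array.prototype.forEach.call(video.mediaTracks || [], function (track) {
    if (track.kind !== kind) return; // leave tracks of other kinds untouched
    if (shown === null && track.language === language) {
      track.mode = SHOWING;
      shown = track;
    } else {
      track.mode = OFF;
    }
  });
  return shown;
}

// Usage, in a page where Captionator has already run:
// showOnly(myVideo, "audiodescription", "ja");
```

Switching the other tracks of the same kind off avoids, for example, two audio descriptions playing over each other.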
##Setting MediaTracks as Showing By Default##
For now, this can only be done in markup (I don't think there's any point in doing this at runtime in JavaScript, because there's no functional difference to just changing the track mode.) It's pretty straightforward - just add the boolean attribute `default`.
<track kind="commentary" src="audio/directorscommentary-en.wav" type="audio/wav" srclang="en" label="Director's Commentary" default />
Captionator implements a number of different track types in addition to the ones defined by the WHATWG spec. Below are the MediaTrack-specific extensions, what they are, and how you should use them.
- `audiodescription`

  Audio Only. Refers specifically to Audio Descriptions provided for the express purpose of explaining visual content in a video to people with vision impairments. Do not use this track for other forms of audio.

  Captionator will ensure `audiodescription` track audio is audible and synced to the playback state of the master element, but will not display an independent interface for controlling the audio (à la the `controls` attribute).

- `commentary`

  Audio Only. Use for additional commentary, which enriches the original video (such as a Director's commentary for a movie) but is not required for accessibility purposes. Do not use this track type to provide assistive features.

  Captionator will ensure `commentary` track audio is audible and synced to the playback state of the master element, but will not display an independent interface for controlling the audio (à la the `controls` attribute).

- `alternate`

  Audio & Video. Provide additional (alternate) audio and video tracks exclusively for enriching the user experience, not providing assistive features. Alternate angles and views, different soundtracks, etc. are all good uses of this track type. Here are some more examples:

  - Providing other camera angles for sporting events or concerts
  - Alternate soundtracks, or isolating individual instruments in a music video
  - Showing slides or other video material as part of a presentation, where the primary media element is a video/audio recording of the presenter
  - A video track showing storyboards timed to the movie playing in the master media element

  Remember that you should provide additional assistive tracks for any 'content enrichment' (I just made that up - but you know what I mean!) tracks you create.

  Be aware that Captionator will mute the audio of the master media element if an audio alternate track is selected to play. Multiple audio tracks may play at once, but it may sound awful! This does not apply to video - alternate video tracks will play without silencing the audio of their master media element. For this reason, you should avoid encoding audio into them.

  Captionator will display alternate video over the master element, obscuring its contents. Audio is audible and synced to the playback state of the master element, but no audio interface is displayed.

- `signlanguage`

  Video Only. A video of a person providing a sign-language simultaneous translation of the content playing in the master media element.

  Captionator will render this video as picture-in-picture, with the video taking up approximately a quarter of the available area in the master video. For this reason, you should ensure that even at small sizes, your `signlanguage` track is clear and free of visual distraction, and that the person signing takes up as much of the frame as possible (as long as you can still see all the gestures!)
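The audio/video rules above are easy to get wrong when writing markup by hand, so a quick sanity check can help. The sketch below encodes only the pairings described in this section; `kindAcceptsType` and `ALLOWED_MEDIA` are hypothetical names, and whether Captionator itself performs a check like this is not something this document states.

```javascript
// Media classes each track kind accepts, per the descriptions above.
var ALLOWED_MEDIA = {
  audiodescription: ["audio"],
  commentary: ["audio"],
  alternate: ["audio", "video"],
  signlanguage: ["video"]
};

// Hypothetical helper (not part of Captionator): does a MIME type such as
// "audio/wav" or "video/ogg" suit the given track kind?
function kindAcceptsType(kind, mimeType) {
  // hasOwnProperty guards against inherited keys such as "toString".
  var allowed = Object.prototype.hasOwnProperty.call(ALLOWED_MEDIA, kind)
    ? ALLOWED_MEDIA[kind]
    : [];
  // The part before the slash ("audio" or "video") is all we compare.
  var mediaClass = String(mimeType).split("/")[0].toLowerCase();
  return allowed.indexOf(mediaClass) !== -1;
}

// Usage:
// kindAcceptsType("signlanguage", "video/ogg");
// kindAcceptsType("commentary", "video/ogg");
```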