diff --git a/index.bs b/index.bs
index fad5a05..1d23843 100755
--- a/index.bs
+++ b/index.bs
@@ -79,118 +79,137 @@ Metadata Order: This version, !*, *
  url: https://www.iso.org/standard/66067.html; spec: HEIF; type: dfn;
- text: colr
+ text: aux_type
+ text: auxC
+ text: AuxiliaryTypeInfoBox
+ text: AuxiliaryTypeProperty
+ text: auxl
+ text: bits_per_channel
+ text: cdsc
+ text: cmex
+ text: cmin
+ text: derived image item
+ text: dimg
+ text: grid
+ text: image_height
+ text: image_width
+ text: imir
+ text: irot
+ text: ispe
+ text: layer_id
+ text: lsel
  text: mif1
  text: msf1
- text: pasp
+ text: ndwt
  text: pict
+ text: PixelInformationProperty
  text: pixi
- text: ispe
- text: lsel
- text: irot
- text: imir
- text: clap
- text: cclv
- text: clli
- text: mdcv
  text: reve
- text: amve
- text: ndwt
- text: cmin
- text: cmex
- text: dimg
- text: layer_id
- text: image_width
- text: image_height
  text: ster
+ text: thmb
  text: tmap
- text: derived image item
- text: aux_type
- text: AuxiliaryTypeInfoBox
- text: AuxiliaryTypeProperty
- text: bits_per_channel
- text: PixelInformationProperty
  url: https://www.iso.org/standard/68960.html; spec: ISOBMFF; type: dfn;
+ text: altr
+ text: amve
+ text: cclv
+ text: clap
+ text: clli
+ text: colour_type
+ text: ColourInformationBox
+ text: colr
  text: compatible_brands
+ text: ContentLightLevelBox
+ text: dinf
+ text: dref
  text: FileTypeBox
- text: major_brand
- text: SingleItemTypeReferenceBox
- text: SingleItemTypeReferenceBoxLarge
- text: ItemReferenceBox
- text: reference_count
+ text: free
  text: from_item_ID
- text: to_item_ID
- text: nclx
- text: sync
- text: iloc
- text: mdat
+ text: ftyp
+ text: full_range_flag
+ text: GroupsListBox
+ text: grpl
+ text: hdlr
  text: idat
- text: altr
+ text: iinf
+ text: iloc
+ text: infe
+ text: ipco
+ text: ipma
+ text: iprp
+ text: iref
+ text: ItemReferenceBox
+ text: major_brand
+ text: MasteringDisplayColourVolumeBox
  text: matrix_coefficients
- text: full_range_flag
- text: colour_type
+ text: mdat
+ text: mdcv
  text: meta
- text: free
+ text: nclx
+ text: pasp
+ text: pitm
+ text: reference_count
+ text: SingleItemTypeReferenceBox
+ text: SingleItemTypeReferenceBoxLarge
  text: skip
- text: ItemPropertyContainerBox
- text: MasteringDisplayColourVolumeBox
- text: ContentLightLevelBox
+ text: sync
+ text: to_item_ID
  url: https://www.iso.org/standard/74417.html; spec: MIAF; type: dfn;
+ text: edit-lists
+ text: grid-limit
+ text: matched-duration
  text: miaf
- text: primary image item
- text: MIAF image item
- text: MIAF image sequence
  text: MIAF auxiliary image item
  text: MIAF auxiliary image sequence
+ text: MIAF image item
+ text: MIAF image sequence
+ text: primary image item
  text: self-containment
- text: grid-limit
  text: single-track
- text: edit-lists
- text: matched-duration
  url: https://aomediacodec.github.io/av1-isobmff/; spec: AV1-ISOBMFF; type: dfn;
- text: AV1CodecConfigurationBox
  text: AV1 Sample
  text: AV1 Track
+ text: AV1CodecConfigurationBox
  url: https://aomediacodec.github.io/av1-spec/av1-spec.pdf; spec: AV1; type: dfn;
  text: AV1 bitstream
  text: AV1 Frame
- text: Sequence Header OBU
- text: Metadata OBU
- text: Temporal Unit
- text: Operating Point
+ text: choose_operating_point
+ text: color_range
+ text: FrameHeight
  text: Intra Frame
+ text: max_frame_height_minus1
+ text: max_frame_width_minus1
+ text: Metadata OBU
  text: mono_chrome
- text: color_range
- text: still_picture
- text: reduced_still_picture_header
+ text: Operating Point
  text: operating_points_cnt_minus_1
- text: choose_operating_point
- text: spatial_id
- text: seq_level_idx
- text: render_width_minus1
+ text: reduced_still_picture_header
  text: render_height_minus1
+ text: render_width_minus1
+ text: seq_level_idx
+ text: Sequence Header OBU
+ text: spatial_id
+ text: still_picture
+ text: Temporal Unit
  text: UpscaledWidth
- text: FrameHeight
- text: max_frame_width_minus1
- text: max_frame_height_minus1
[=AV1ItemConfigurationProperty=]
.[=AV1ItemConfigurationProperty=]
, it shall match the [=Sequence Header OBU=] in the [=AV1 Image Item Data=].[=AV1ItemConfigurationProperty=]
shall match those of the [=Sequence Header OBU=] in the [=AV1 Image Item Data=].[=AV1ItemConfigurationProperty=]
shall match the [=PixelInformationProperty=]
if present.[=MasteringDisplayColourVolumeBox=]
or [=ContentLightLevelBox=]
.
+ - [=AV1ItemConfigurationProperty=]
shall match the [=PixelInformationProperty=]
('[=pixi=]'
) if present.[=MasteringDisplayColourVolumeBox=]
('[=mdcv=]'
) or [=ContentLightLevelBox=]
('[=clli=]'
).
'[=ispe=]' property as defined in [[!HEIF]] apply. More specifically, for [[!AV1]] images, [=image_width=] and [=image_height=] shall respectively equal the values of [=UpscaledWidth=] and [=FrameHeight=]
'[=lsel=]' and [=OperatingPointSelectorProperty=] properties as follows:
- In the absence of a '[=lsel=]' property associated with the item, or if it is present and its [=layer_id=] value is set to 0xFFFF:
- - If no [=OperatingPointSelectorProperty=] is associated with the item, the '[=ispe=]' property shall document the dimensions of the last frame decoded when processing the operating point whose index is 0.
+ - If no [=OperatingPointSelectorProperty=] is associated with the item, the '[=ispe=]' property shall document the dimensions of the last frame decoded when processing the [=operating point=] whose index is 0.
- - If an [=OperatingPointSelectorProperty=] is associated with the item, the '[=ispe=]' property shall document the dimensions of the last frame decoded when processing the corresponding operating point.
+ - If an [=OperatingPointSelectorProperty=] is associated with the item, the '[=ispe=]' property shall document the dimensions of the last frame decoded when processing the corresponding [=operating point=].
- NOTE: The dimensions of possible intermediate output images might not match the ones given in the '[=ispe=]' property. If they display these intermediate images, renderers are expected to scale the output image to match the '[=ispe=]' property.
+ NOTE: The dimensions of possible intermediate output images might not match the ones given in the '[=ispe=]' property. If renderers display these intermediate images, they are expected to scale the output image to match the '[=ispe=]' property.
- If a '[=lsel=]' property is associated with an item and its [=layer_id=] is different from 0xFFFF, the '[=ispe=]' property documents the dimensions of the output frame produced by decoding the corresponding layer.
NOTE: The dimensions indicated in the '[=ispe=]' property might not match the values [=max_frame_width_minus1=]+1 and [=max_frame_height_minus1=]+1 indicated in the AV1 bitstream.
-NOTE: The values of [=render_width_minus1=] and [=render_height_minus1=] possibly present in the AV1 bistream are not exposed at the AVIF container level.
+NOTE: The values of [=render_width_minus1=] and [=render_height_minus1=] possibly present in the AV1 bitstream are not exposed at the [=/AVIF=] container level.
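
The '[=ispe=]' rules above can be sketched as a small decision function. This is an illustrative sketch only, not part of the specification: the function name `ispe_source`, its parameters, and the `(kind, index)` return convention are hypothetical and chosen for this example.

```python
from typing import Optional, Tuple

NO_LAYER_SELECTED = 0xFFFF  # reserved layer_id value named in the text


def ispe_source(lsel_layer_id: Optional[int],
                op_index: Optional[int]) -> Tuple[str, int]:
    """Return which decoded frame the 'ispe' property documents.

    lsel_layer_id: layer_id of an associated 'lsel' property, or None if absent.
    op_index: op_index of an associated OperatingPointSelectorProperty,
              or None if no such property is associated with the item.
    The (kind, index) tuple is a hypothetical convention for this sketch.
    """
    if lsel_layer_id is not None and lsel_layer_id != NO_LAYER_SELECTED:
        # A specific layer is selected: 'ispe' documents the dimensions of
        # the output frame produced by decoding that layer.
        return ("layer", lsel_layer_id)
    # No 'lsel', or layer_id == 0xFFFF: 'ispe' documents the last frame
    # decoded for the operating point (index 0 when no selector is present).
    return ("last_frame_of_operating_point", 0 if op_index is None else op_index)
```

For instance, an item with no '[=lsel=]' and no selector property maps to the last frame of operating point 0, while an item whose '[=lsel=]' has `layer_id` 1 maps to the output frame of layer 1 regardless of the selector.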
'[=clap=]'
) as define
'[=colr=]'
- - '[=pixi=]'
- - '[=pasp=]'
- - '[=irot=]'
- - '[=imir=]'
- - '[=clli=]'
- - '[=cclv=]'
- - '[=mdcv=]'
- - '[=amve=]'
- - '[=reve=]'
- - '[=ndwt=]'
- - '[=cmin=]'
- - '[=cmex=]'
-
-In general, it is recommended to use properties instead of [=Metadata OBUs=] in the [=AV1ItemConfigurationProperty=].
+In addition to the Image Properties defined in this document, [=AV1 image items=] may also be associated with item properties defined in other specifications such as [[!HEIF]] and [[!MIAF]]. Commonly used item properties can be found in [[#avif-required-boxes]] and [[#avif-required-boxes-additional]].
+
+In general, it is recommended to use item properties instead of [=Metadata OBUs=] in the [=AV1ItemConfigurationProperty=].
choose_operating_point(). AVIF defines the [=OperatingPointSelectorProperty=] to control this selection. In the absence of an [=OperatingPointSelectorProperty=] associated with an [=AV1 Image Item=], the AVIF renderer is free to process any [=Operating Point=] present in the [=AV1 Image Item Data=]. In particular, [=OperatingPointSelectorProperty=] should not be present
[=OperatingPointSelectorProperty=] is associated with an [=AV1 Image Item=], the [=op_index=] field indicates which [=Operating Point=] is expected to be processed for this item.
+[[!AV1]] delegates the selection of which [=Operating Point=] to process to the application, by means of a function called choose_operating_point(). [=/AVIF=] defines the [=OperatingPointSelectorProperty=] to control this selection. In the absence of an [=OperatingPointSelectorProperty=] associated with an [=AV1 Image Item=], the [=/AVIF=] renderer is free to process any [=Operating Point=] present in the [=AV1 Image Item Data=]. In particular, [=OperatingPointSelectorProperty=] should not be present
[=OperatingPointSelectorProperty=] is associated with an [=AV1 Image Item=], the [=op_index=] field indicates which [=Operating Point=] is expected to be processed for this item.
-NOTE: When an author wants to offer the ability to render multiple [=Operating Points=] from the same AV1 image (e.g. in the case of multi-view images), multiple [=AV1 Image Items=] can be created that share the same [=AV1 Image Item Data=] but have different [=OperatingPointSelectorProperty=]s.
+NOTE: When an author wants to offer the ability to render multiple [=Operating Points=] from the same AV1 image (e.g. in the case of multi-view images), multiple [=AV1 Image Items=] can be created that share the same [=AV1 Image Item Data=] but have different [=OperatingPointSelectorProperties=].
-[[!AV1]] expects the renderer to display only one frame within the selected [=Operating Point=], which should be the highest spatial layer that is both within the [=Operating Point=] and present within the temporal unit, but [[!AV1]] leaves the option for other applications to set their own policy about which frames are output, as defined in the general output process. AVIF sets a different policy, and defines how the '[=lsel=]' property (mandated by [[!HEIF]] for layered images) is used to control which layer is rendered. According to [[!HEIF]], the interpretation of the [=layer_id=] field in the '[=lsel=]' property is codec specific. In this specification, the value 0xFFFF is reserved for a special meaning. If a '[=lsel=]' property is associated with an [=AV1 Image Item=] but its [=layer_id=] value is set to 0xFFFF, the renderer is free to render either only the output image of the highest spatial layer, or to render all output images of all the intermediate layers and the highest spatial layer, resulting in a form of progressive decoding. If a '[=lsel=]' property is associated with an [=AV1 Image Item=] and the value of [=layer_id=] is not 0xFFFF, the renderer is expected to render only the output image for that layer.
+[[!AV1]] expects the renderer to display only one frame within the selected [=Operating Point=], which should be the highest spatial layer that is both within the [=Operating Point=] and present within the temporal unit, but [[!AV1]] leaves the option for other applications to set their own policy about which frames are output, as defined in the general output process. [=/AVIF=] sets a different policy, and defines how the '[=lsel=]' property (mandated by [[!HEIF]] for layered images) is used to control which layer is rendered. According to [[!HEIF]], the interpretation of the [=layer_id=] field in the '[=lsel=]' property is codec specific. In this specification, the value 0xFFFF is reserved for a special meaning. If a '[=lsel=]' property is associated with an [=AV1 Image Item=] but its [=layer_id=] value is set to 0xFFFF, the renderer is free to render either only the output image of the highest spatial layer, or to render all output images of all the intermediate layers and the highest spatial layer, resulting in a form of progressive decoding. If a '[=lsel=]' property is associated with an [=AV1 Image Item=] and the value of [=layer_id=] is not 0xFFFF, the renderer is expected to render only the output image for that layer.
NOTE: When such a progressive decoding of the layers within an [=Operating Point=] is not desired or when an author wants to expose each layer as a specific item, multiple [=AV1 Image Items=] sharing the same [=AV1 Image Item Data=] can be created and associated with different '[=lsel=]' properties, each with a different value of [=layer_id=].
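
The '[=lsel=]' rendering policy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a normative algorithm: the function `layers_to_render`, the `decoded_layers` list of spatial layer indices, and the `progressive` flag (standing in for the renderer's free choice) are all hypothetical names introduced here.

```python
from typing import List

PROGRESSIVE_ALLOWED = 0xFFFF  # layer_id value reserved by this specification


def layers_to_render(layer_id: int,
                     decoded_layers: List[int],
                     progressive: bool = False) -> List[int]:
    """Pick the layers whose output images a renderer would display.

    layer_id: value of the layer_id field of the associated 'lsel' property.
    decoded_layers: spatial layer indices available in the selected
                    operating point (assumed input for this sketch).
    progressive: the renderer's own choice when layer_id is 0xFFFF.
    """
    if layer_id != PROGRESSIVE_ALLOWED:
        # A specific layer is requested: render only that layer's output.
        return [layer_id]
    ordered = sorted(decoded_layers)
    # 0xFFFF: either every intermediate layer followed by the highest one
    # (progressive decoding), or only the highest spatial layer.
    return ordered if progressive else ordered[-1:]
```

The sketch does not check that a requested `layer_id` is actually present in `decoded_layers`; a real reader would need to handle that case.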
@@ -301,7 +307,7 @@ NOTE: When such a progressive decoding of the layers within an [=Operating Point
[=operating_points_cnt_minus_1=] inclusive.
[=operating_points_cnt_minus_1=] inclusive.
An AV1 Alpha Image Item (respectively an AV1 Alpha Image Sequence) is an [=AV1 Auxiliary Image Item=] (respectively an [=AV1 Auxiliary Image Sequence=]), and as defined in [[!MIAF]], with the [=aux_type=] field of the [=AuxiliaryTypeProperty=] (respectively [=AuxiliaryTypeInfoBox=]) set to urn:mpeg:mpegB:cicp:systems:auxiliary:alpha.
[=ColourInformationBox=] ('[=colr=]') should be omitted.
An AV1 Depth Image Item (respectively an AV1 Depth Image Sequence) is an [=AV1 Auxiliary Image Item=] (respectively an [=AV1 Auxiliary Image Sequence=]), and as defined in [[!MIAF]], with the [=aux_type=] field of the [=AuxiliaryTypeProperty=] (respectively [=AuxiliaryTypeInfoBox=]) set to urn:mpeg:mpegB:cicp:systems:auxiliary:depth.
'[=grid=]') as defined in [[!HEIF]] may be used in an [=AVIF file=].
+
'[=tmap=]') as defined in [[!HEIF]] may be used in an [=AVIF=] file. '[=tmap=]' image item should be grouped together by an '[=AVIF/altr=]' entity group as recommended in [[!HEIF]].
'[=tmap=]') as defined in [[!HEIF]] may be used in an [=AVIF file=]. '[=tmap=]' image item should be grouped together by an '[=altr=]' (see [[#altr-group]]) entity group as recommended in [[!HEIF]].
[=sato/reserved=] shall be ignored by readers.
-bit_depth determines the precision (from 8 to 64 bits, see Table 1) of the signed integer temporary variable supporting the intermediate results of the operations. It also determines the precision of the stack elements and the field size of the [=sato/constant=] fields. This intermediate precision shall be high enough so that all input sample values fit into that signed bit depth.
+bit_depth determines the precision (from 8 to 64 bits, see Table 1) of the signed integer temporary variable supporting the intermediate results of the operations. It also determines the precision of the stack elements and the field size of the [=sato/constant=] fields. This intermediate precision shall be high enough so that all input sample values fit into that signed bit depth.
-ftyp | +[=ftyp=] | - | -ISOBMFF | +[[!ISOBMFF]] | |||||
meta | +[=meta=] | 0 | -ISOBMFF | +[[!ISOBMFF]] | |||||
- | hdlr | +[=hdlr=] | 0 | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | pitm | +[=pitm=] | 0, 1 | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | iloc | +[=iloc=] | 0, 1, 2 | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | iinf | +[=iinf=] | 0, 1 | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | infe | +[=infe=] | 2, 3 | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | iprp | +[=iprp=] | - | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | ipco | +[=ipco=] | - | -ISOBMFF | +[[!ISOBMFF]] | ||||
- | av1C | +[=/av1C=] | - | -AVIF | +[=/AVIF=] | ||||
- | ispe | +[=ispe=] | 0 | -HEIF | +[[!HEIF]] | ||||
- | pixi | +[=pixi=] | 0 | -HEIF | +[[!HEIF]] | ||||
- | ipma | +[=ipma=] | 0, 1 | -ISOBMFF | +[[!ISOBMFF]] | ||||
mdat | +[=mdat=] | - | -ISOBMFF | +[[!ISOBMFF]] | The coded payload may be placed in '[=idat=]' rather than '[=mdat=]' , in which case '[=mdat=]' is not required. |
dinf | +[=dinf=] | - | -ISOBMFF | -meta | +[[!ISOBMFF]] | +[=meta=] | Used to indicate the location of the media information in a track | ||
- | dref | +[=dref=] | 0 | -ISOBMFF | +[[!ISOBMFF]] | ||||
iref | +[=iref=] | 0, 1 | -ISOBMFF | -meta | +[[!ISOBMFF]] | +[=meta=] | Used to indicate directional relationships between images or metadata | ||
- | auxl | +[=auxl=] | - | -HEIF | +[[!HEIF]] | Used when an image is auxiliary to another image | |||
- | thmb | +[=thmb=] | - | -HEIF | +[[!HEIF]] | Used when an image is a thumbnail of another image | |||
- | dimg | +[=dimg=] | - | -HEIF | +[[!HEIF]] | - | Used when an image is derived from another image | +Used when an image is [[#derived-images|derived from another image]] | |
- | prem | +[=prem=] | - | -HEIF | +[[!HEIF]] | - | Used when when an alpha image contains premultiplied color values from another image | +Used when the color values in an image have been premultiplied with alpha values | |
- | cdsc | +[=cdsc=] | - | -HEIF | +[[!HEIF]] | Used to link metadata with an image | |||
idat | +[=idat=] | - | -ISOBMFF | -meta | +[[!ISOBMFF]] | +[=meta=] | Used to store derived image definitions | ||
[=AVIF/grpl=] | +[[#groups|grpl]] | - | -ISOBMFF | -meta | +[[!ISOBMFF]] | +[=meta=] | Used to indicate that multiple images are semantically grouped | ||
- | [=AVIF/altr=] | +[[#altr-group|altr]] | 0 | -ISOBMFF | +[[!ISOBMFF]] | - | Used when images in a group are alternative to each other | +Used when images in a group are alternatives to each other | |
- | [=AVIF/ster=] | +[[#ster-group|ster]] | 0 | -HEIF | +[[!HEIF]] | Used when images in a group form a stereo pair | |||
pasp | +[=pasp=] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal pixel aspect ratio. If present, shall indicate a pixel aspect ratio of 1:1 | ||
colr | +[=colr=] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal information such as color primaries. | ||
auxC | +[=auxC=] | 0 | -HEIF | -ipco | +[[!HEIF]] | +[=ipco=] | Used to signal the type of an auxiliary image (e.g. alpha, depth). | ||
clap | +[[#clean-aperture-property|clap]] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal cropping applied to an image | ||
irot | +[=irot=] | - | -HEIF | -ipco | +[[!HEIF]] | +[=ipco=] | Used to signal a rotation applied to an image | ||
imir | +[=imir=] | - | -HEIF | -ipco | +[[!HEIF]] | +[=ipco=] | Used to signal a mirroring applied to an image | ||
clli | +[=clli=] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal HDR light level information for an image | ||
cclv | +[=cclv=] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal HDR color volume for an image | ||
mdcv | +[=mdcv=] | - | -ISOBMFF | -ipco | +[[!ISOBMFF]] | +[=ipco=] | Used to signal HDR mastering information for an image | ||
a1op | +[=amve=] | - | -AVIF | -ipco | -Used to configure rendering of a multiple operating points image | +[[!ISOBMFF]] | +[=ipco=] | +Used to signal the nominal ambient viewing environment for the display of the content | |
lsel | +[=reve=] | ++ | 0 | +[[!HEIF]] | +[=ipco=] | +Used to signal the viewing environment in which the image was mastered | +|||
[=ndwt=] | ++ | 0 | +[[!HEIF]] | +[=ipco=] | +Used to signal the nominal diffuse white luminance of the content | +||||
[=a1op=] | ++ | - | +[=/AVIF=] | +[=ipco=] | +Used to configure which operating point to select when there are multiple choices | +||||
[=lsel=] | - | -HEIF | -ipco | +[[!HEIF]] | +[=ipco=] | Used to configure rendering of a multilayered image | |||
a1lx | +[=a1lx=] | - | -AVIF | -ipco | +[=/AVIF=] | +[=ipco=] | Used to assist reader in parsing a multilayered image | ||
[=cmin=] | ++ | 0 | +[[!HEIF]] | +[=ipco=] | +Used to signal the camera intrinsic matrix | +||||
[=cmex=] | ++ | 0 | +[[!HEIF]] | +[=ipco=] | +Used to signal the camera extrinsic matrix | +
'[=AVIF/altr=]' group with the [=Sample Transform Derived Image Item=], the first input image item is also a backward-compatible 8-bit regular coded image item that can be used by readers that do not support [=Sample Transform Derived Image Items=] or do not need extra precision.
+NOTE: If the first input image item is the [=primary image item=] and is enclosed in an '[=altr=]' group (see [[#altr-group]]) with the [=Sample Transform Derived Image Item=], the first input image item is also a backward-compatible 8-bit regular coded image item that can be used by readers that do not support [=Sample Transform Derived Image Items=] or do not need extra precision.
NOTE: The second input image item loses its meaning of least significant part if any of the most significant bits changes, so the first input image item has to be losslessly encoded. The second input image item supports reasonable loss during encoding.
@@ -1277,7 +1328,7 @@ Consider the following:
This is equivalent to the following postfix notation (parentheses for clarity):
-
+
This is equivalent to the following infix notation:
@@ -1285,7 +1336,7 @@ This is equivalent to the following infix notation:
Each output sample is equal to the sum of a sample of the first input image item shifted to the left by 4 and of a sample of the second input image item offset by -128. This can be viewed as a bit depth extension of the first input image item by the second input image item which contains the residuals to correct the precision loss of the first input image item.
-NOTE: If the first input image item is the [=primary image item=] and is enclosed in an '[=AVIF/altr=]' group with the derived image item, the first input image item is also a backward-compatible 12-bit regular coded image item that can be used by decoding contexts that do not support [=Sample Transform Derived Image Items=] or do not need extra precision.
+NOTE: If the first input image item is the [=primary image item=] and is enclosed in an '[=altr=]' group (see [[#altr-group]]) with the derived image item, the first input image item is also a backward-compatible 12-bit regular coded image item that can be used by decoding contexts that do not support [=Sample Transform Derived Image Items=] or do not need extra precision.
NOTE: The first input image item supports reasonable loss during encoding because the second input image item "overlaps" by 4 bits to correct the loss. The second input image item supports reasonable loss during encoding.
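
The per-sample arithmetic of the example above (first input shifted left by 4, plus second input offset by -128) can be sketched in a few lines. This is an illustration under assumptions, not the normative evaluation procedure: the function name `extend_bit_depth` is hypothetical, the second input is assumed to be 8-bit (suggested by the -128 offset), and the clamp to an unsigned 16-bit output range is an assumption made for this sketch.

```python
def extend_bit_depth(first: int, second: int) -> int:
    """Combine one sample of each input image item into one output sample.

    first: sample of the first input image item (12-bit in the example).
    second: sample of the second input image item holding the residuals
            (assumed 8-bit here, centered on 128).
    """
    # Shift the first sample left by 4 bits and add the re-centered residual.
    value = (first << 4) + (second - 128)
    # Assumption for this sketch: clamp to the unsigned 16-bit output range.
    return max(0, min(value, 0xFFFF))


# A residual of 128 leaves the shifted sample unchanged.
assert extend_bit_depth(0x123, 128) == 0x1230
```

Because the residual spans 8 bits while the shift is only 4, the upper 4 bits of the residual overlap the lower bits of the first input, which is what allows the second input to correct encoding loss in the first.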