Fix #821, updated version of Expanded Loudspeaker Layout (#831)

* Fix #821, updated version of Expanded Loudspeaker Layout
* Follow reviewer's suggestion
  Co-authored-by: Felicia Lim <[email protected]>
* Follow reviewer's suggestion
  Co-authored-by: Felicia Lim <[email protected]>
* Follow reviewer's suggestion
  Co-authored-by: Felicia Lim <[email protected]>
* Follow reviewer's suggestion
  Co-authored-by: Felicia Lim <[email protected]>
* Follow reviewer's suggestion
* Fix incorrect sentence

---------

Co-authored-by: Felicia Lim <[email protected]>
AOMediaCodec · Jun 17, 2024 · 1feceaf
1 parent 331a1cc
Showing 1 changed file with 54 additions and 5 deletions.
diff --git a/index.bs b/index.bs
@@ -889,6 +889,8 @@ class ChannelAudioLayerConfig(i) {
     unsigned int (2) reserved;
     signed int (16) output_gain(i);
   }
+  if (i == 1 && [=loudspeaker_layout=] == 15)
+    unsigned int (8) expanded_loudspeaker_layout;
 }
 ```
 
@@ -902,11 +904,11 @@ class ChannelAudioLayerConfig(i) {
 
 <dfn noexport>loudspeaker_layout</dfn> indicates the channel layout to be reconstructed from the precedent [=Channel Group=]s and current [=Channel Group=]. If parsers do not recognize a [=loudspeaker_layout=] for a particular layer, they SHOULD skip the [=channel_audio_layer_config=] for that layer and all subsequent layers.
 
-In this version of the specification, [=loudspeaker_layout=] indicates one of the 10 channel layouts listed below.
+In this version of the specification, [=loudspeaker_layout=] indicates one of the channel layouts listed below.
 
 <table class="def">
 <tr>
-  <th><code>loudspeaker_layout</code></th><th>Channel Layout</th><th>Loudspeaker Location Ordering</th><th>Reference</th>
+  <th>loudspeaker_layout</th><th>Channel Layout</th><th>Loudspeaker Location Ordering</th><th>Reference</th>
 </tr>
 <tr>
   <td>0000</td><td>Mono</td><td>C</td><td></td>
@@ -939,8 +941,12 @@ In this version of the specification, [=loudspeaker_layout=] indicates one of th
   <td>1001</td><td>Binaural</td><td>L/R</td><td></td>
 </tr>
 <tr>
-  <td>others</td><td>Reserved</td><td></td><td></td>
+  <td>1010 ~ 1110</td><td>Reserved</td><td></td><td></td>
 </tr>
+<tr>
+  <td>1111</td><td>Expanded channel layouts</td><td></td><td>Loudspeaker configurations defined in the [=expanded_loudspeaker_layout=] field</td>
+</tr>
+
 </table>
 
 Where C: Center, L: Left, R: Right, Ls: Left Surround, Lss: Left Side Surround, Rs: Right Surround, Rss: Right Side Surround, Lrs: Left Rear Surround, Rrs: Right Rear Surround, Ltf: Left Top Front, Rtf: Right Top Front, Ltr: Left Top Rear, Rtr: Right Top Rear, Ltb: Left Top Back, Rtb: Right Top Back, LFE: Low-Frequency Effects
@@ -953,6 +959,8 @@ For a given input [=3D audio signal=] with [=audio_element_type=] = CHANNEL_BASE
 
 NOTE: This specification allows down-mixing mechanisms (e.g., as specified in [[#iamfgeneration-scalablechannelaudio-downmixmechanism]]) to drop the height channel if the output layout has no height channels. An example is down-mixing from 7.1.4ch to Mono, Stereo, 5.1ch or 7.1ch. Therefore, given an input [=3D audio signal=] with height channels, an encoder may generate a set of scalable audio channel groups with layouts that do not have height channels.
 
+For a given input [=3D audio signal=] with an expanded channel layout defined in [=expanded_loudspeaker_layout=], [=num_layers=] SHALL be set to 1 (i.e., it is a non-scalable channel audio element). It is RECOMMENDED to use such an [=Audio Element=] as an auxiliary [=Audio Element=] to be mixed with a primary [=Audio Element=] (e.g., TOA or 7.1.4ch) within a [=Mix Presentation=]. If parsers encounter [=loudspeaker_layout=] = 15 in any layer other than the first layer, they SHOULD skip the [=channel_audio_layer_config=] for that layer and all subsequent layers.
+
 <dfn noexport>output_gain_is_present_flag</dfn> indicates if the output_gain information fields for the [=Channel Group=] are present.
 - 0: No output_gain information fields for the [=Channel Group=] are present.
 - 1: output_gain information fields for the [=Channel Group=] are present. In this case, [=output_gain_flags=] and [=output_gain=] fields are present.
@@ -973,7 +981,6 @@ The order of the [=Audio Substream=]s in each [=Channel Group=] is specified in
 
 <dfn noexport>output_gain_flags</dfn> indicates the channels which [=output_gain=] is applied to. If a bit is set to 1, [=output_gain=] SHALL be applied to the channel. Otherwise, [=output_gain=] SHALL NOT be applied to the channel.
 
-
 <pre class = "def">
 Bit position : Channel Name
     b5(MSB)  : Left channel (L1, L2, L3)
@@ -987,6 +994,44 @@ Bit position : Channel Name
 
 <dfn noexport>output_gain</dfn> indicates the gain value to be applied to the mixed channels which are indicated by [=output_gain_flags=], where each mixed channel is generated by down-mixing two or more input channels. It is computed as \(20 \times \log_{10}(f)\), where \(f\) is the factor by which to scale the mixed channels. It is stored as a 16-bit, signed, two’s complement fixed-point value with 8 fractional bits (i.e., Q7.8)([[Q-Format]]).
 
+<dfn noexport>expanded_loudspeaker_layout</dfn> indicates the expanded channel layout to be reconstructed from the [=Channel Group=]. This field SHALL only be present when [=num_layers=] = 1 and [=loudspeaker_layout=] is set to 15. Parsers SHOULD ignore [=Audio Element OBU=]s with an [=expanded_loudspeaker_layout=] that they do not recognize.
+
+In this version of the specification, [=expanded_loudspeaker_layout=] indicates one of the expanded channel layouts listed below.
+
+<table class="def">
+<tr>
+  <th>expanded_loudspeaker_layout</th><th>Expanded Channel Layout</th><th>Loudspeaker Location Ordering</th><th>Reference</th>
+</tr>
+<tr>
+  <td>0</td><td>LFE</td><td>LFE</td><td>The low-frequency effects subset (LFE) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>1</td><td>Stereo-S</td><td>Ls/Rs</td><td>The surround subset (Ls/Rs) of 5.1.4ch</td>
+</tr>
+<tr>
+  <td>2</td><td>Stereo-SS</td><td>Lss/Rss</td><td>The side surround subset (Lss/Rss) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>3</td><td>Stereo-RS</td><td>Lrs/Rrs</td><td>The rear surround subset (Lrs/Rrs) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>4</td><td>Stereo-TF</td><td>Ltf/Rtf</td><td>The top front subset (Ltf/Rtf) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>5</td><td>Stereo-TB</td><td>Ltb/Rtb</td><td>The top back subset (Ltb/Rtb) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>6</td><td>Top-4.0ch</td><td>Ltf/Rtf/Ltb/Rtb</td><td>The top 4 channels (Ltf/Rtf/Ltb/Rtb) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>7</td><td>3.0ch</td><td>L/C/R</td><td>The front 3 channels (L/C/R) of 7.1.4ch</td>
+</tr>
+<tr>
+  <td>8 ~ 255</td><td>Reserved</td><td></td><td></td>
+</tr>
+
+</table>
+
 ### Scalable Channel Group and Layout ### {#scalalechannelaudio-channelgroupandlayout}
 
 When an [=Audio Element=] is composed of \(G(r)\) number of [=Audio Substream=]s, its scalable channel audio representation is layered into \(r\) [=num_layers=] of [=Channel Group=]s.
@@ -2359,7 +2404,11 @@ In this section, for a given x.y.z layout, the next highest layout x'.y'.z' mean
 This section defines the renderer to use, given a channel-based [=Audio Element=] and a loudspeaker playback layout.
 
 - The input layout (x.y.z) of the IA renderer is set as follows:
-    - If [=num_layers=] = 1, use the [=loudspeaker_layout=] of the [=Audio Element=].
+    - If [=num_layers=] = 1, 
+        - If [=loudspeaker_layout=] < 10, use the [=loudspeaker_layout=] of the [=Audio Element=].
+        - Else if [=loudspeaker_layout=] = 15, 
+            - If [=expanded_loudspeaker_layout=] = 1, use 5.1.4ch with empty channels everywhere other than the corresponding loudspeaker locations.
+            - Else, use 7.1.4ch with empty channels everywhere other than the corresponding loudspeaker locations.
     - Else, if the [=Audio Element=] has a [=loudspeaker_layout=] that matches the playback layout, use that matching [=loudspeaker_layout=].
     - Else, use the next highest available layout from all available [=loudspeaker_layout=]s.
 - The output layout of the IA renderer is set to the playback layout (X.Y.Z).
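
For reviewers, the rules added in this change can be sketched in a few lines of Python. This is only an illustration, not part of the specification or of any reference decoder: the names `EXPANDED_LAYOUTS`, `has_expanded_layout_field` and `renderer_input_layout`, and the tuple return shape, are hypothetical. The table contents and the 5.1.4ch / 7.1.4ch choice are copied from the hunks above, and only the [=num_layers=] = 1 branch is covered.

```python
from typing import Optional, Tuple

# expanded_loudspeaker_layout -> (layout name, loudspeaker location ordering),
# copied from the table added in this change. Values 8..255 are reserved.
EXPANDED_LAYOUTS = {
    0: ("LFE", ("LFE",)),
    1: ("Stereo-S", ("Ls", "Rs")),
    2: ("Stereo-SS", ("Lss", "Rss")),
    3: ("Stereo-RS", ("Lrs", "Rrs")),
    4: ("Stereo-TF", ("Ltf", "Rtf")),
    5: ("Stereo-TB", ("Ltb", "Rtb")),
    6: ("Top-4.0ch", ("Ltf", "Rtf", "Ltb", "Rtb")),
    7: ("3.0ch", ("L", "C", "R")),
}

EXPANDED_LAYOUT_ESCAPE = 15  # loudspeaker_layout value 1111


def has_expanded_layout_field(layer_index: int, loudspeaker_layout: int) -> bool:
    """Mirror the syntax condition: i == 1 && loudspeaker_layout == 15."""
    return layer_index == 1 and loudspeaker_layout == EXPANDED_LAYOUT_ESCAPE


def renderer_input_layout(
    loudspeaker_layout: int,
    expanded_loudspeaker_layout: Optional[int] = None,
) -> Optional[Tuple[str, Optional[Tuple[str, ...]]]]:
    """Pick the renderer input layout for the num_layers == 1 case.

    Returns (base_layout, active_channels), or None when the value is
    reserved or unrecognized and the element should be skipped.
    """
    if loudspeaker_layout < 10:
        # Use the Audio Element's own loudspeaker_layout unchanged.
        return ("loudspeaker_layout", None)
    if loudspeaker_layout != EXPANDED_LAYOUT_ESCAPE:
        return None  # 10..14 are reserved
    entry = EXPANDED_LAYOUTS.get(expanded_loudspeaker_layout)
    if entry is None:
        return None  # reserved or missing expanded layout
    _name, channels = entry
    # Stereo-S (Ls/Rs) is a subset of 5.1.4ch; every other entry is a
    # subset of 7.1.4ch. All remaining channels are treated as empty.
    base = "5.1.4ch" if expanded_loudspeaker_layout == 1 else "7.1.4ch"
    return (base, channels)
```

For example, `renderer_input_layout(15, 1)` selects 5.1.4ch with only Ls/Rs populated, while `renderer_input_layout(15, 6)` selects 7.1.4ch with only the four top channels populated.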