MPEG-H decoder parameters
The MPEG-H decoder parameters can be set with the function `mpeghdecoder_setParam()`. Currently, all of these parameters are related to MPEG-D DRC.
In MPEG-H Audio, the MPEG-D Dynamic Range Control (DRC) standard provides tools for loudness and dynamic range control.
MPEG-D DRC is defined in ISO/IEC 23003-4:2020. Its integration into MPEG-H 3D Audio is defined in Section 6 of ISO/IEC 23008-3:2022.
For more information on MPEG-D DRC and MPEG-H Audio, refer to the MPEG-D DRC White Paper and to Section VI of the MPEG-H Audio IEEE Paper.
Inside the MPEG-H bitstream, loudness metadata is transported by the syntax element `mpegh3daLoudnessInfoSet()`, which is part of the static codec configuration, i.e. it is constant throughout a content item. The loudness metadata usually consists of at least one of the following loudness values:
- program loudness
- anchor loudness
- expert loudness
It can additionally contain album loudness values.
To normalize the loudness across content items, the MPEG-H decoder applies a gain to the audio output such that the output loudness matches the target loudness. This gain is derived as the difference between the target loudness and the content loudness:
loudness normalization gain = (target loudness in LKFS [1]) - (content loudness in LKFS)
The target loudness is controlled by the parameter `MPEGH_DEC_PARAM_TARGET_REFERENCE_LEVEL` as described below.
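As a worked instance of the gain formula, the following minimal sketch computes the normalization gain for given loudness values. The function name `loudness_normalization_gain_db` is hypothetical and not part of the decoder API; the decoder performs this computation internally.

```c
/* Hypothetical helper, not part of the decoder API.
 * Returns the loudness normalization gain in dB as the difference between
 * the target loudness and the content loudness, both given in LKFS. */
double loudness_normalization_gain_db(double target_lkfs, double content_lkfs)
{
    return target_lkfs - content_lkfs;
}
```

For example, content with a program loudness of -31 LKFS played back at a target loudness of -24 LKFS yields a gain of +7 dB, while content at -16 LKFS yields -8 dB.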
Metadata for Dynamic Range Control (DRC) is transported in two parts inside the MPEG-H bitstream:
- The `uniDrcGain()` syntax element contains the dynamic DRC gain sequences, which consist of gain values that change over time.
- The `mpegh3daUniDrcConfig()` syntax element contains the static DRC configuration. It consists of so-called DRC sets, which define which DRC gain sequences are applied to the audio signal in which playback scenario. DRC sets can also specify that only a scaled version of a DRC gain sequence is applied to the signal.
There are three types of DRC sets, which are simultaneously applied [2]:
- For DRC: These DRC sets usually reduce the dynamic range by attenuating loud signal segments, and amplifying soft signal segments. They are selected by the parameter `MPEGH_DEC_PARAM_EFFECT_TYPE`.
- For ducking: These DRC sets adapt the audio signal to specific playback scenarios. For instance, they attenuate background sounds during voice activity for improved intelligibility. They can also include preset-dependent, time-variant loudness leveling gains. DRC sets for ducking are usually selected dependent on the selected group preset. They are independent of the requested DRC effect type.
- For fading: These DRC sets provide fade-in and fade-out transitions for the songs of a gapless album if these are played back in arbitrary order or in a different context. The application of fading gains can be selected by the parameter `MPEGH_DEC_PARAM_ALBUM_MODE`.
MPEGH_DEC_PARAM_TARGET_REFERENCE_LEVEL
The target reference level is the target loudness to which the decoder normalizes the audio output.
The value is given as an integer value and is calculated as follows:
value = -4 * (target reference level in LKFS)
The value must be in the range between 40 and 127, representing the range of -10 to -31.75 LKFS.
Example values:
value | target loudness | application |
---|---|---|
124 | -31 LKFS | for audio/video receivers (AVR) or other devices allowing audio playback with high dynamic range |
96 | -24 LKFS | for TV sets or equivalent devices (default) |
64 | -16 LKFS | for mobile devices where the dynamic range of audio playback is restricted |
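The conversion from a target loudness to the integer parameter value can be sketched as follows. The function name `target_level_to_param` is hypothetical, and rounding to the nearest integer step is an assumption, since the text above does not specify a rounding rule.

```c
/* Hypothetical helper, not part of the decoder API.
 * Converts a target reference level in LKFS to the integer value expected by
 * MPEGH_DEC_PARAM_TARGET_REFERENCE_LEVEL: value = -4 * (level in LKFS).
 * Returns -1 if the result falls outside the documented range 40..127. */
int target_level_to_param(double target_lkfs)
{
    double scaled = -4.0 * target_lkfs;

    if (scaled < 40.0 || scaled > 127.0) {
        return -1; /* outside -10 .. -31.75 LKFS */
    }
    /* Assumption: round to the nearest 0.25 LKFS step. */
    return (int)(scaled + 0.5);
}
```

With this sketch, -24 LKFS maps to 96 and -31 LKFS maps to 124, matching the example table. The resulting value would then be passed to `mpeghdecoder_setParam()` together with `MPEGH_DEC_PARAM_TARGET_REFERENCE_LEVEL`.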
MPEGH_DEC_PARAM_EFFECT_TYPE
The DRC effect type controls the selection of a DRC set.
The supported indices are listed in the following table:
value | DRC effect type | short name | description |
---|---|---|---|
-1 | DRC off | Off | Disables DRC. Ducking and Fading gains, as well as loudness normalization, are still active |
0 | None (default) | None | Disables DRC, except for the case that DRC processing is necessary to prevent signal clipping |
1 | Late night | Night | For quiet environments: listening at a low level to avoid disturbing others |
2 | Noisy environment | Noisy | Optimized to get the best experience in noisy environments, for instance by amplifying soft sections |
3 | Limited playback range | Limited | Reduced dynamic range to improve quality on playback devices with limited dynamic range |
4 | Low playback level | LowLevel | Listening at a low playback level |
5 | Dialog enhancement | Dialog | The main effect is a more prominent dialogue within the content |
6 | General compression | General | For enabling MPEG-D DRC without particular DRC effect type request |
If there is no DRC set with the selected DRC effect type available in the DRC configuration, the most appropriate DRC set is automatically selected instead.
MPEGH_DEC_PARAM_BOOST_FACTOR
The DRC boost factor is a scaling factor that is applied to amplification DRC gains, i.e. to gains that are greater than 0 dB.
The value is given as an integer value and is calculated as follows:
value = 127 * (boost factor)
The value must be in the range between 0 and 127, representing the range of the factor of 0.0 (i.e. don't apply) to 1.0 (i.e. full application of amplification DRC gains).
The default value is 127 (full application of amplification DRC gains).
MPEGH_DEC_PARAM_ATTENUATION_FACTOR
The DRC attenuation factor is a scaling factor that is applied to attenuation DRC gains, i.e. to gains that are less than 0 dB.
The value is given as an integer value and is calculated as follows:
value = 127 * (attenuation factor)
The value must be in the range between 0 and 127, representing the range of the factor of 0.0 (i.e. don't apply) to 1.0 (i.e. full application of attenuation DRC gains).
The default value is 127 (full application of attenuation DRC gains).
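The boost and attenuation factors use the same mapping from a factor between 0.0 and 1.0 to an integer value, which the following minimal sketch illustrates. The function name `drc_factor_to_param` is hypothetical, and rounding to the nearest integer is an assumption not stated in the text above.

```c
/* Hypothetical helper, not part of the decoder API.
 * Converts a DRC scaling factor in the range 0.0..1.0 to the integer value
 * expected by MPEGH_DEC_PARAM_BOOST_FACTOR or
 * MPEGH_DEC_PARAM_ATTENUATION_FACTOR: value = 127 * factor.
 * Returns -1 if the factor is outside 0.0..1.0. */
int drc_factor_to_param(double factor)
{
    if (factor < 0.0 || factor > 1.0) {
        return -1;
    }
    /* Assumption: round to the nearest integer step. */
    return (int)(127.0 * factor + 0.5);
}
```

A factor of 1.0 gives the default value 127, and a factor of 0.0 gives 0, which disables the corresponding amplification or attenuation gains.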
MPEGH_DEC_PARAM_ALBUM_MODE
The album mode parameter should be set to enabled if songs of an album are played back in the original order of the album. The album mode parameter controls both the application of DRC sets for fading, and the usage of album loudness for normalization:
value | album mode | album loudness | fading gains |
---|---|---|---|
0 | disabled (default) | - | apply fading gains, if present |
1 | enabled | use album loudness, if present | - |
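The table above can be read as a small decision rule. The sketch below encodes it directly; the type `AlbumModeBehavior` and the function `album_mode_behavior` are hypothetical names for illustration, and the sketch does not model the interaction with DRC sets for ducking.

```c
/* Hypothetical types and helper, not part of the decoder API. */
typedef struct {
    int use_album_loudness; /* 1 if album loudness is used for normalization */
    int apply_fading_gains; /* 1 if DRC sets for fading are applied */
} AlbumModeBehavior;

/* album_mode: value of MPEGH_DEC_PARAM_ALBUM_MODE (0 = disabled, 1 = enabled).
 * has_album_loudness / has_fading_set: whether the content carries them. */
AlbumModeBehavior album_mode_behavior(int album_mode,
                                      int has_album_loudness,
                                      int has_fading_set)
{
    AlbumModeBehavior b;

    /* Enabled: use album loudness if present, no fading gains. */
    b.use_album_loudness = (album_mode == 1) && has_album_loudness;
    /* Disabled: apply fading gains if present, no album loudness. */
    b.apply_fading_gains = (album_mode == 0) && has_fading_set;
    return b;
}
```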
According to the ANSI/CTA-2075 standard, the default values of the target loudness and DRC effect type parameters should be set depending on the available transducer SPL range, which usually corresponds to the playback device class. The DRC effect types offered to an end user for selection should be limited to a subset of all possible values, as described in the following table:
device class | SPL range (CTA-2075) | target loudness (LKFS) | default DRC effect type | selectable DRC effect types |
---|---|---|---|---|
AVR | large | -31 | General | Off, Night, Noisy, General |
TV | medium | -24 | General | Off, Night, Noisy, General |
mobile | small | -16 | Limited | Limited, Noisy |
IMPORTANT |
---|
Before deploying an MPEG-H decoder, please change the default MPEG-H decoder parameters according to this section. |
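One way to apply the device-class recommendations is a small lookup that returns the integer parameter values for each class. This is a sketch under stated assumptions: `DeviceClass`, `DrcDefaults` and `drc_defaults_for` are hypothetical names, and the numeric values are taken from the tables in this section (target loudness encoded as -4 times the LKFS value, General = 6, Limited = 3).

```c
/* Hypothetical types and helper, not part of the decoder API. */
typedef enum { DEVICE_AVR, DEVICE_TV, DEVICE_MOBILE } DeviceClass;

typedef struct {
    int target_reference_level; /* for MPEGH_DEC_PARAM_TARGET_REFERENCE_LEVEL */
    int effect_type;            /* for MPEGH_DEC_PARAM_EFFECT_TYPE */
} DrcDefaults;

DrcDefaults drc_defaults_for(DeviceClass device)
{
    DrcDefaults d;

    switch (device) {
    case DEVICE_AVR:    /* large SPL range: -31 LKFS, General */
        d.target_reference_level = 124;
        d.effect_type = 6;
        break;
    case DEVICE_MOBILE: /* small SPL range: -16 LKFS, Limited */
        d.target_reference_level = 64;
        d.effect_type = 3;
        break;
    case DEVICE_TV:     /* medium SPL range: -24 LKFS, General */
    default:
        d.target_reference_level = 96;
        d.effect_type = 6;
        break;
    }
    return d;
}
```

The two returned values would each be passed to `mpeghdecoder_setParam()` with the corresponding parameter before decoding starts.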
All MPEG-H decoder parameters related to Loudness and Dynamic Range Control can also be set via MHAS packets of type `PACTYP_LOUDNESS_DRC`. These MHAS packets can be inserted by the separate UI Manager module. In this case, the parameter values set via MHAS packets take precedence over the values set via the `mpeghdecoder_setParam()` API function.
1: Note that LKFS and LUFS are equivalent units for loudness measurement. LKFS is specified in ITU-R BS.1770. LUFS is specified in EBU R 128. This document uses LKFS.
2: A DRC set for either ducking or fading is applied simultaneously with the DRC set for DRC. If both a DRC set for ducking and a DRC set for fading are selected, the DRC set for fading is ignored.