diff --git a/index.bs b/index.bs index 7a2e55bd..ce557f18 100644 --- a/index.bs +++ b/index.bs @@ -932,7 +932,7 @@ In this version of the specification, [=loudspeaker_layout=] indicates one of th 00115.1.2chL/C/R/Ls/Rs/Ltf/Rtf/LFE[=Loudspeaker configuration for Sound System C (2+5+0)=] of [[!ITU-2051-3]] - 01005.1.4chL/C/R/Ls/Rs/Ltf/Rtf/Ltr/Rtr/LFE[=Loudspeaker configuration for Sound System D (4+5+0)=] of [[!ITU-2051-3]] + 01005.1.4chL/C/R/Ls/Rs/Ltf/Rtf/Ltr/Rtr/LFE[=Loudspeaker configuration for Sound System D (4+5+0)=] of [[!ITU-2051-3]] 01017.1chL/C/R/Lss/Rss/Lrs/Rrs/LFE[=Loudspeaker configuration for Sound System I (0+7+0)=] of [[!ITU-2051-3]] @@ -968,8 +968,6 @@ For a given input [=3D audio signal=] with [=audio_element_type=] = CHANNEL_BASE NOTE: This specification allows down-mixing mechanisms (e.g., as specified in [[#iamfgeneration-scalablechannelaudio-downmixmechanism]]) to drop the height channel if the output layout has no height channels. An example is down-mixing from 7.1.4ch to Mono, Stereo, 5.1ch or 7.1ch. Therefore, given an input [=3D audio signal=] with height channels, an encoder may generate a set of scalable audio channel groups with layouts that do not have height channels. -For a given input [=3D audio signal=] with an expanded channel layout defined in [=expanded_loudspeaker_layout=], [=num_layers=] SHALL be set to 1 (i.e., it is a non-scalable channel audio element). It is RECOMMENDED to use such an [=Audio Element=] as an auxiliary [=Audio Element=] to be mixed with a primary [=Audio Element=] (e.g., TOA or 7.1.4ch) within a [=Mix Presentation=]. If parsers encounter a [=loudspeaker_layout=] = 15 for any layer other than the first layer, they SHOULD skip the [=channel_audio_layer_config=] for that layer and all subsequent layers. - output_gain_is_present_flag indicates if the output_gain information fields for the [=Channel Group=] are present. - 0: No output_gain information fields for the [=Channel Group=] are present. 
- 1: output_gain information fields for the [=Channel Group=] are present. In this case, [=output_gain_flags=] and [=output_gain=] fields are present. @@ -983,8 +981,8 @@ For a given input [=3D audio signal=] with an expanded channel layout defined in coupled_substream_count specifies the number of referenced [=Audio Substream=]s, each of which is coded as coupled stereo channels. Each pair of [=Coupled stereo channels|coupled stereo channels=] in the same [=Channel Group=] SHALL be coded in stereo mode to generate one single coded [=Audio Substream=], also referred to as a coupled substream. Each [=Non-coupled channels|non-coupled channel=] in the same [=Channel Group=] SHALL be coded in mono mode to generate one single coded [=Audio Substream=], also known as a non-coupled substream. -- Coupled stereo channels: L/R, Ls/Rs, Lss/Rss, Lrs/Rrs, Ltf/Rtf, Ltb/Rtb -- Non-coupled channels: C, LFE, L +- Coupled stereo channels: L/R, Ls/Rs, Lss/Rss, Lrs/Rrs, Ltf/Rtf, Ltb/Rtb, FLc/FRc, FL/FR, SiL/SiR, BL/BR, TpFL/TpFR, TpSiL/TpSiR, TpBL/TpBR +- Non-coupled channels: C, LFE, L, FC, LFE1 The order of the [=Audio Substream=]s in each [=Channel Group=] is specified in [[#scalablechannelaudio-orderingofaudiosubstreamidentifiers]]. 
@@ -1012,35 +1010,58 @@ In this version of the specification, [=expanded_loudspeaker_layout=] indicates expanded_loudspeaker_layoutExpanded Channel LayoutLoudspeaker Location OrderingReference - 0LFELFEThe low-frequency effects subset (LFE) of 7.1.4ch + 0LFELFEThe low-frequency effects subset (LFE) of [=7.1.4ch=] + + + 1Stereo-SLs/RsThe surround subset (Ls/Rs) of [=5.1.4ch=] + + + 2Stereo-SSLss/RssThe side surround subset (Lss/Rss) of [=7.1.4ch=] + + + 3Stereo-RSLrs/RrsThe rear surround subset (Lrs/Rrs) of [=7.1.4ch=] - 1Stereo-SLs/RsThe surround subset (Ls/Rs) of 5.1.4ch + 4Stereo-TFLtf/RtfThe top front subset (Ltf/Rtf) of [=7.1.4ch=] - 2Stereo-SSLss/RssThe side surround subset (Lss/Rss) of 7.1.4ch + 5Stereo-TBLtb/RtbThe top back subset (Ltb/Rtb) of [=7.1.4ch=] - 3Stereo-RSLrs/RrsThe rear surround subset (Lrs/Rrs) of 7.1.4ch + 6Top-4chLtf/Rtf/Ltb/RtbThe top 4 channels (Ltf/Rtf/Ltb/Rtb) of [=7.1.4ch=] - 4Stereo-TFLtf/RtfThe top front subset (Ltf/Rtf) of 7.1.4ch + 73.0chL/C/RThe front 3 channels (L/C/R) of [=7.1.4ch=] - 5Stereo-TBLtb/RtbThe top back subset (Ltb/Rtb) of 7.1.4ch + 89.1.6ch[=Loudspeaker location ordering of 9.1.6ch=]The subset of [=Loudspeaker configuration for Sound System H (9+10+3)=] of [[!ITU-2051-3]] - 6Top-4.0chLtf/Rtf/Ltb/RtbThe top 4 channels (Ltf/Rtf/Ltb/Rtb) of 7.1.4ch + 9Stereo-FFL/FRThe front subset (FL/FR) of [=9.1.6ch=] - 73.0chL/C/RThe front 3 channels (L/C/R) of 7.1.4ch + 10Stereo-SiSiL/SiRThe side subset (SiL/SiR) of [=9.1.6ch=] - 8 ~ 255Reserved + 11Stereo-TpSiTpSiL/TpSiRThe top side subset (TpSiL/TpSiR) of [=9.1.6ch=] + + + 12Top-6chTpFL/TpFR/TpSiL/TpSiR/TpBL/TpBRThe top 6 channels (TpFL/TpFR/TpSiL/TpSiR/TpBL/TpBR) of [=9.1.6ch=] + + + 13 ~ 255Reserved +Loudspeaker location ordering of 9.1.6ch: FLc/FC/FRc/FL/FR/SiL/SiR/BL/BR/TpFL/TpFR/TpSiL/TpSiR/TpBL/TpBR/LFE1 + +Where FLc: Front Left Centre, FC: Front Centre, FRc: Front Right Centre, FL: Front Left, FR: Front Right, SiL: Side Left, SiR: Side Right, BL: Back Left, BR: Back Right, 
TpFL: Top Front Left, TpFR: Top Front Right, TpSiL: Top Side Left, TpSiR: Top Side Right, TpBL: Top Back Left, TpBR: Top Back Right, LFE1: Low-Frequency Effects-1 + +For a given input [=3D audio signal=] with an expanded channel layout defined in [=expanded_loudspeaker_layout=], [=num_layers=] SHALL be set to 1 (i.e., it is a non-scalable channel audio element). Except for a [=9.1.6ch=] [=Audio Element=], it is RECOMMENDED to use such an [=Audio Element=] as an auxiliary [=Audio Element=] to be mixed with a primary [=Audio Element=] (e.g., TOA or 7.1.4ch) within a [=Mix Presentation=]. If parsers encounter a [=loudspeaker_layout=] = 15 for any layer other than the first layer, they SHOULD skip the [=channel_audio_layer_config=] for that layer and all subsequent layers. + +The following channel layouts may be indicated using an existing [=loudspeaker_layout=] or [=expanded_loudspeaker_layout=]. The stereo pair FLc/FRc is indicated using Stereo (L/R), the stereo pair BL/BR is indicated using Stereo-RS (Lrs/Rrs), the stereo pair TpFL/TpFR is indicated using Stereo-TF (Ltf/Rtf), the stereo pair TpBL/TpBR is indicated using Stereo-TB (Ltb/Rtb), and FLc/FC/FRc is indicated using 3.0ch (L/C/R). + ### Scalable Channel Group and Layout ### {#scalalechannelaudio-channelgroupandlayout} When an [=Audio Element=] is composed of \(G(r)\) number of [=Audio Substream=]s, its scalable channel audio representation is layered into \(r\) [=num_layers=] of [=Channel Group=]s. @@ -1121,7 +1142,7 @@ The order of the [=Audio Substream=]s in each [=Channel Group=] (i.e., the seman - The [=coupled substream=]s for the surround channels come first and are followed by the [=coupled substream=]s for the top channels. - The [=coupled substream=]s for the front channels come first and are followed by the [=coupled substream=]s for the side, rear and back channels. - The [=coupled substream=]s for the side channels come first and are followed by the [=coupled substream=]s for the rear channels. 
-- The Center channel comes first and is followed by the LFE channel, and then the L channel. +- The Center (or Front Centre) channel comes first and is followed by the LFE (or LFE1) channel, and then the L channel. ### Ambisonics Config Syntax and Semantics ### {#syntax-ambisonics-config} @@ -1257,7 +1278,7 @@ class MixPresentationOBU() { loudness is an instance of the [=LoudnessInfo()=] class, which provides the loudness information for this sub-mix's [=Rendered Mix Presentation=], measured on the layout provided by [=loudness_layout=]. -The layout specified in [=loudness_layout=] SHOULD NOT be higher than the highest layout among the layouts provided by the [=Audio Element=]s. In other words, rendering from an [=Audio Element=] with the highest layout to the [=loudness_layout=] SHOULD NOT require an up-mix. The exception is when the [=Audio Element=] is a zero-order Ambisonics or Mono channel; they MAY be rendered to Stereo. In this exception case, the [=loudness_layout=] for a zero-order Ambisonics or Mono channel [=Audio Element=] SHOULD NOT be higher than Stereo. +The layout specified in [=loudness_layout=] SHOULD NOT be higher than the highest layout among the layouts provided by the [=Audio Element=]s. In other words, rendering from an [=Audio Element=] with the highest layout to the [=loudness_layout=] SHOULD NOT require an up-mix. In the case of a CHANNEL_BASED [=Audio Element=] with an expanded channel layout (i.e., [=loudspeaker_layout=] = 15), the [=Audio Element=] is considered to be providing the reference layout that it is a subset of. The exception is when the [=Audio Element=] is a zero-order Ambisonics or Mono channel; they MAY be rendered to Stereo. In this exception case, the [=loudness_layout=] for a zero-order Ambisonics or Mono channel [=Audio Element=] SHOULD NOT be higher than Stereo. 
Each sub-mix SHALL include [=loudness=] for Stereo (i.e., a [=loudness_layout=] with the [=sound_system=] field = [=Loudspeaker configuration for Sound System A (0+2+0)=]). - If a sub-mix's [=Rendered Mix Presentation=] is Mono, its [=loudness=] for Stereo SHOULD be measured on the Stereo signal generated using the equations: @@ -1417,7 +1438,7 @@ layout_type : Layout type - A value of 3 indicates that the layout is binaural. -sound_system specifies one of the sound systems A to J as specified in [[!ITU-2051-3]], 7.1.2ch or 3.1.2ch. +sound_system specifies one of the sound systems A to J as specified in [[!ITU-2051-3]], 7.1.2ch, 3.1.2ch, Mono, or 9.1.6ch. - 0: It indicates [=Loudspeaker configuration for Sound System A (0+2+0)=] - 1: It indicates [=Loudspeaker configuration for Sound System B (0+5+0)=] @@ -1432,7 +1453,8 @@ layout_type : Layout type - 10: It indicates the same loudspeaker configuration as [=loudspeaker_layout=] = 0110 (i.e., 7.1.2ch) - 11: It indicates the same loudspeaker configuration as [=loudspeaker_layout=] = 1000 (i.e., 3.1.2ch) - 12: It indicates Mono - - 13 ~ 15: Reserved + - 13: It indicates the same loudspeaker configuration as [=expanded_loudspeaker_layout=] = 8 (i.e., 9.1.6ch) + - 14 ~ 15: Reserved When a value for [=layout_type=] or [=sound_system=] is not supported, parsers SHOULD ignore this [=Layout()=] and any associated [=LoudnessInfo()=]. @@ -2418,12 +2440,15 @@ In this section, for a given x.y.z layout, the next highest layout x'.y'.z' mean This section defines the renderer to use, given a channel-based [=Audio Element=] and a loudspeaker playback layout. +22.2ch represents the [=Loudspeaker configuration for Sound System H (9+10+3)=]. + - The input layout (x.y.z) of the IA renderer is set as follows: - If [=num_layers=] = 1, - If [=loudspeaker_layout=] < 10, use the [=loudspeaker_layout=] of the [=Audio Element=]. 
- Else if [=loudspeaker_layout=] = 15, - If [=expanded_loudspeaker_layout=] = 1, use 5.1.4ch with empty channels everywhere other than the corresponding loudspeaker locations. - - Else, use 7.1.4ch with empty channels everywhere other than the corresponding loudspeaker locations. + - Else if [=expanded_loudspeaker_layout=] < 8, use 7.1.4ch with empty channels everywhere other than the corresponding loudspeaker locations. + - Else, use [=22.2ch=] with empty channels everywhere other than the corresponding loudspeaker locations, except for LFE2, which is copied from LFE1. - Else, if the [=Audio Element=] has a [=loudspeaker_layout=] that matches the playback layout, use that matching [=loudspeaker_layout=]. - Else, use the next highest available layout from all available [=loudspeaker_layout=]s. - The output layout of the IA renderer is set to the playback layout (X.Y.Z). @@ -2438,13 +2463,18 @@ This section defines the renderer to use, given a channel-based [=Audio Element= ##### Rendering Without Demixing Info ##### {#processing-mixpresentation-rendering-m2l-withoutdemixinfo} - If the playback layout is neither 3.1.2ch nor 7.1.2ch, - If the playback layout complies with the loudspeaker layouts supported by [[!ITU-2051-3]], the EAR Direct Speakers renderer ([[ITU-2127-0]]) can be used, for example. + - Else if the playback layout is 9.1.6ch, + - If the input layout is [=22.2ch=], the down-mix matrix specified in [[#processing-downmixmatrix-static]] can be used, for example. + - Else, the EAR Direct Speakers renderer ([[ITU-2127-0]]) can be used, for example, to first render the input audio to [=22.2ch=], then copying LFE1 to LFE2, and then down-mixing from [=22.2ch=] to [=9.1.6ch=] by using the down-mix matrix specified in [[#processing-downmixmatrix-static]]. - Else, an implementation-specific renderer can be used, for example. 
- Else if the playback layout is 7.1.2ch, - The EAR Direct Speakers renderer ([[ITU-2127-0]]) can be used, for example, to first render the input audio to 7.1.4ch, followed by down-mixing from 7.1.4ch to 7.1.2ch. The height channels of 7.1.4ch are down-mixed to the height channels of 7.1.2ch as follows: \[ \text{Ltf2} = \text{Ltf4} + 0.707 \times \text{Ltb} \] \[ \text{Rtf2} = \text{Rtf4} + 0.707 \times \text{Rtb} \] - Else if the playback layout is 3.1.2ch, - - If the input layout has height channels, the static down-mix matrices specified in [[#processing-downmixmatrix-static]] are used. + - If the input layout has height channels, + - If the input layout is [=22.2ch=], the EAR Direct Speakers renderer ([[ITU-2127-0]]) can be used, for example, to first render the input audio to 7.1.4ch, followed by down-mixing from 7.1.4ch to 3.1.2ch by using the down-mix matrix specified in [[#processing-downmixmatrix-static]]. + - Else, the static down-mix matrices specified in [[#processing-downmixmatrix-static]] are used. - Else if the surround channels (x) of the input layout > 3, the static down-mix matrices specified in [[#processing-downmixmatrix-static]] after inserting empty height channels into the input audio are used. - Else, empty channels are padded to the input audio relevant to the input layout to make 3.1.2ch. In that case, Mono is regarded as a center channel. @@ -2466,6 +2496,7 @@ This section provides guidelines about the renderer to use, given a scene-based - The output layout of the IA renderer is set to the playback layout. - The IA renderer used can be selected according to the following rules: - If the playback layout complies with the loudspeaker layouts supported by [[!ITU-2051-3]], the EAR HOA renderer ([[ITU-2127-0]]) can be used. 
+ - Else, if the playback layout is 9.1.6ch, the EAR HOA renderer ([[ITU-2127-0]]) can be used, for example, to first render the input audio to [=22.2ch=], followed by down-mixing from [=22.2ch=] to [=9.1.6ch=] by using the down-mix matrix specified in [[#processing-downmixmatrix-static]]. - Else, if there is an implementation-specific renderer, use it. - Else, the EAR HOA renderer can be used to render to the next highest [[!ITU-2051-3]] layout compared to the playback layout, and then down-mix using an implementation-specific renderer or use the static down-mix matrices specified in [[#processing-downmixmatrix-static]] if available. @@ -2603,7 +2634,7 @@ This specification includes preferred dynamic down-mixing matrices generated by ### Static Down-mix Matrix ### {#processing-downmixmatrix-static} -This section provides includes preferred static down-mix matrices to render to 3.1.2ch from 5.1.2ch, 5.1.4ch, 7.1.2ch, and 7.1.4ch. +This section provides preferred static down-mix matrices to render to 3.1.2ch from 5.1.2ch, 5.1.4ch, 7.1.2ch, and 7.1.4ch, and to 9.1.6ch from 22.2ch. Implementations can use a limiter defined in [[#processing-post-limiter]] to preserve the energy of audio signals instead of using normalization factors. @@ -2751,6 +2782,76 @@ The 3.1.2ch down-mix matrix for 7.1.4ch is given below, where \(p = 0.707\). \end{bmatrix} \] +The 9.1.6ch down-mix matrix for 22.2ch is given below, where \(p = 0.707\) and \(q = 0.5\). This down-mix matrix is generated based on Section 8.1 and Table 16 of [[!ITU-2127-0]]. 
+
+\[
+\begin{bmatrix}
+  \text{FLc} \\
+  \text{FC} \\
+  \text{FRc} \\
+  \text{FL} \\
+  \text{FR} \\
+  \text{SiL} \\
+  \text{SiR} \\
+  \text{BL} \\
+  \text{BR} \\
+  \text{TpFL} \\
+  \text{TpFR} \\
+  \text{TpSiL} \\
+  \text{TpSiR} \\
+  \text{TpBL} \\
+  \text{TpBR} \\
+  \text{LFE1}
+\end{bmatrix}
+=
+\begin{bmatrix}
+  1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
+  0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
+  0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
+  0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & q & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & q & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & q & p & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & q & p & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p & 0 & 0 & 0 & 0 & 0 & 0 & 0 & p \\
+\end{bmatrix}
+\times
+\begin{bmatrix}
+  \text{FLc} \\
+  \text{FC} \\
+  \text{FRc} \\
+  \text{FL} \\
+  \text{FR} \\
+  \text{SiL} \\
+  \text{SiR} \\
+  \text{BL} \\
+  \text{BR} \\
+  \text{TpFL} \\
+  \text{TpFR} \\
+  \text{TpSiL} \\
+  \text{TpSiR} \\
+  \text{TpBL} \\
+  \text{TpBR} \\
+  \text{LFE1} \\
+  \text{BC} \\
+  \text{TpFC} \\
+  \text{TpC} \\
+  \text{TpBC} \\
+  \text{BtFL} \\
+  \text{BtFC} \\
+  \text{BtFR} \\
+  \text{LFE2}
+\end{bmatrix}
+\]
+
+Where BC: Back Centre, TpFC: Top Front Centre, TpC: Top Centre, TpBC: Top Back Centre, BtFL: Bottom Front Left, BtFC: Bottom Front Centre, BtFR: Bottom Front Right, LFE2: Low-Frequency Effects-2
 
 # Convention # {#convention}
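
Review note on the last hunk: the 22.2ch to 9.1.6ch matrix is an identity on the 16 shared channels plus a handful of extra terms, which is easier to check in sparse form. The following is a minimal sketch, assuming the channel order and the \(p = 0.707\), \(q = 0.5\) gains shown in the matrix above. The names `OUT_CH`, `IN_CH`, `EXTRA`, `build_matrix`, and `downmix` are hypothetical and are not part of the specification or of any IAMF library.

```python
# Gains as stated next to the matrix in the hunk above.
P = 0.707
Q = 0.5

# 9.1.6ch output order (rows of the matrix).
OUT_CH = ["FLc", "FC", "FRc", "FL", "FR", "SiL", "SiR", "BL", "BR",
          "TpFL", "TpFR", "TpSiL", "TpSiR", "TpBL", "TpBR", "LFE1"]

# 22.2ch input order (columns): the 16 shared channels, then the 8 extras.
IN_CH = OUT_CH + ["BC", "TpFC", "TpC", "TpBC", "BtFL", "BtFC", "BtFR", "LFE2"]

# Non-identity terms per output channel, transcribed from the matrix.
# An entry keyed by the output channel itself overrides the unit diagonal.
EXTRA = {
    "FLc": {"BtFL": 1.0},
    "FC": {"TpFC": 1.0, "BtFC": 1.0},
    "FRc": {"BtFR": 1.0},
    "BL": {"BC": P},
    "BR": {"BC": P},
    "TpFL": {"TpC": Q},
    "TpFR": {"TpC": Q},
    "TpBL": {"TpC": Q, "TpBC": P},
    "TpBR": {"TpC": Q, "TpBC": P},
    "LFE1": {"LFE1": P, "LFE2": P},
}


def build_matrix():
    """Return the 16x24 down-mix matrix as a list of rows."""
    rows = []
    for out in OUT_CH:
        gains = {out: 1.0}                 # unit diagonal by default
        gains.update(EXTRA.get(out, {}))   # add or override listed terms
        rows.append([gains.get(src, 0.0) for src in IN_CH])
    return rows


def downmix(matrix, samples):
    """Apply the matrix to one 24-channel sample frame in IN_CH order."""
    if len(samples) != len(IN_CH):
        raise ValueError("expected one sample per 22.2ch input channel")
    return [sum(g * s for g, s in zip(row, samples)) for row in matrix]
```

Feeding a unit impulse on a single input channel, for example TpC, through `downmix` gives a quick check that the energy lands only on the output channels the matrix row intends.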