
Add algorithms to the audio spec #23

Draft: youennf wants to merge 2 commits into main

Conversation

youennf (Contributor) commented on Oct 15, 2024

As discussed at TPAC, this PR introduces algorithms that define the behavior of AudioSession: in particular, how state is handled and how it interacts with existing audio sources/sinks and with Media Session.


Preview | Diff

index.bs Outdated
* A <dfn data-lt="default type" for="audio session">default type</dfn>, which is used to compute the [=audio session=] [=audio session/type=] in the case of "{{AudioSessionType/auto}}".
* An <dfn>audible flag</dfn>, which is `true` if the element is playing or recording audio and `false` otherwise.

An audio session [=audio session/element=] is an <dfn>audible element</dfn> if its [=audible flag=] is `true`.

Review comment:

I believe we need to clarify how to set the audible flag to true. Does this flag accurately reflect the actual audio output? We haven't yet discussed how to bind a media element (or other audible sources) to an audio session, which makes it somewhat confusing to define what constitutes an audible session.

Perhaps we should start by defining the criteria for determining if a session is audible and outline which types of elements can be used as sources.

youennf (Contributor, Author) replied:

This is defined later on for each audio source/sink, in the Audio source and sink integration section.
Three audio elements are defined: media element, AudioContext, and microphone MediaStreamTrack.

Review comment:

Regarding audio sources, I believe it would be clearer to briefly mention them here and link to the section below. This approach will help readers understand the audible behavior associated with the actual source, rather than just the AudioSession element.

For the audible flag, I recommend creating a separate section to explain it comprehensively. Even in Section 6, "Audio Source and Sink Integration," the definition of "audible" is not adequately addressed.

For example, if a non-muted media element is playing a silent audio track, or has its audio track volume set to a near-inaudible level, should this element be considered audible?
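
To make the question concrete, here is a hypothetical audibility predicate for a media element (TypeScript; the specific checks and the volume threshold are illustrative assumptions, not spec behavior):

```ts
// Hypothetical sketch: one possible audibility check for a media element.
// Whether a silent-but-playing track counts is exactly the open question
// above; treating near-zero volume as inaudible is an assumption here.
function isMediaElementAudible(el: HTMLMediaElement): boolean {
  const hasSource = el.srcObject !== null || el.currentSrc !== "";
  return hasSource && !el.paused && !el.muted && el.volume > 0.001;
}
```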

Additionally, how should we define "audible" for non-media elements, such as AudioContext and Web Speech?

If you agree with this approach, we could file another issue and create a separate PR for it.

youennf (Contributor, Author) replied:

I filed #27 and #28 to keep track of this.

index.bs Outdated
A [=top-level browsing context=] is said to have <dfn lt="audio focus" for="audio session">audio focus</dfn> if its [=selected audio session=] is not `null` and its state is {{AudioSessionState/active}}.

<div class=note>
User agents can decide whether to allow several [=top-level browsing contexts=] to have [=audio focus=], or to enforce that only a single [=top-level browsing context=] has [=audio focus=] at any given time.</div>
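
Transcribed into code, the quoted definition amounts to the following check (a sketch; `AudioSessionLike` is a stand-in shape, not the spec's IDL):

```ts
// A top-level browsing context has audio focus iff its selected audio
// session is non-null and in the "active" state.
interface AudioSessionLike {
  state: "active" | "interrupted" | "inactive";
}

function hasAudioFocus(selected: AudioSessionLike | null): boolean {
  return selected !== null && selected.state === "active";
}
```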

Review comment:

Could you provide an example of a situation where multiple contexts having audio focus is necessary? Is it premature to mention audio focus at this stage? It might be more effective to include a full paragraph defining what audio focus is; perhaps we should explain it thoroughly after introducing the different types.

For instance, does an audio session with the ambient type qualify as having audio focus, given that it can play and mix with other types?

Additionally, should we consider adding an attribute to the AudioSession to indicate whether it is a selected audio session?

youennf (Contributor, Author) replied:

On desktop, UAs tend to allow multiple tabs to render audio, similar to several native applications playing audio at the same time (without being ambient).

> For instance, does an audio session with the ambient type qualify as having audio focus, given that it can play and mix with other types?

It does not qualify.

> should we consider adding an attribute to the AudioSession to indicate whether it is a selected audio session?

We could consider it, but I am a bit reluctant to do this until we have a good use case.

Review comment:

Thanks for the clarification. I think it would be worth providing some examples here, but we could do that in follow-up PRs.

youennf (Contributor, Author) replied:

I filed #29.

index.bs Outdated
@@ -40,7 +72,33 @@ The {{AudioSession}} is the main interface for this API, which is accessed throu
};
</pre>

## Audio session states ## {#audio-session-types}
To create an {{AudioSession}} object in |realm|, run the following steps:

Review comment:

How about moving "Extensions to the Navigator Interface" to this section to improve the clarity of the specification? First, we could establish that the AudioSession is a singleton per window, followed by the steps for creating an AudioSession.
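
A sketch of the per-window singleton behavior this would establish up front (the `declare global` block is only there so the snippet type-checks; it is not the spec's IDL):

```ts
// navigator.audioSession always returns the same object per window.
declare global {
  interface Navigator {
    readonly audioSession: EventTarget;
  }
}

console.assert(
  navigator.audioSession === navigator.audioSession,
  "one AudioSession per window"
);

export {};
```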

Additionally, I would like to understand the purpose of isTypeBeingApplied. For instance, if we first set the type to A and then immediately change it to B, the current specification suggests that B won't be applied, because isTypeBeingApplied will still be true for A if the queued task hasn't executed yet. What is the rationale behind this mechanism? Shouldn't we allow a new type to be set on an AudioSession at any time?

youennf (Contributor, Author) replied:

I am fine with reshuffling the spec a bit to make it clearer. I'd like to minimize the rebasing of this PR, though.

Wrt isTypeBeingApplied, the principle is that we only update the audio session type once per event loop.
The AudioSession object's .type can be changed many times, but only one task will run to update its audio session type, using the last type set on the AudioSession object.
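
A minimal TypeScript sketch of this coalescing behavior, using `queueMicrotask` as a stand-in for the spec's queued task; the slot names mirror the PR text, but the implementation details are illustrative:

```ts
type AudioSessionType =
  | "auto" | "playback" | "transient" | "transient-solo"
  | "ambient" | "play-and-record";

class AudioSessionSketch {
  private pendingType: AudioSessionType = "auto";
  private isTypeBeingApplied = false;

  set type(value: AudioSessionType) {
    this.pendingType = value;            // remember the latest value
    if (this.isTypeBeingApplied) return; // an update task is already queued
    this.isTypeBeingApplied = true;
    queueMicrotask(() => {
      this.isTypeBeingApplied = false;
      this.apply(this.pendingType);      // last write wins
    });
  }

  private apply(type: AudioSessionType) {
    console.log("updating audio session type to", type);
  }
}

// Setting the type twice in the same turn runs a single update task:
const session = new AudioSessionSketch();
session.type = "playback";
session.type = "ambient"; // => one "updating audio session type to ambient"
```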

Review comment:

> I am fine with reshuffling the spec a bit to make it clearer. I'd like to minimize the rebasing of this PR, though.

Sounds good, thanks!

> Wrt isTypeBeingApplied, the principle is that we only update the audio session type once per event loop.

It sounds similar to updating an MSE SourceBuffer? If so, should we add an `updating` attribute to indicate that the update is still ongoing? In addition, is there a particular reason we only want to update the type once per event loop?
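
For reference, the MSE pattern alluded to here (assuming `sourceBuffer` and `chunk` exist in scope):

```ts
// SourceBuffer exposes an `updating` flag and fires `updateend`
// once the asynchronous operation settles.
declare const sourceBuffer: SourceBuffer;
declare const chunk: ArrayBuffer;

sourceBuffer.appendBuffer(chunk);
console.log(sourceBuffer.updating); // true while the append is in flight
sourceBuffer.addEventListener("updateend", () => {
  // safe to issue the next appendBuffer() here
});
```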

youennf (Contributor, Author) replied:

Let's keep this one open until we discuss the exact algorithms.

youennf (Contributor, Author) commented on Oct 18, 2024

Converting to draft until the other sections are upstreamed.
