Web Audio API: RenderCapacity API #843
How are you defining "load" (as exposed in this API)?
Audio systems, when rendering an audio stream, typically work with a synchronous audio callback (called in the spec a system-level audio callback). This callback, called continuously during the lifetime of the audio stream, returns to the system with audio samples, which are then handed off to the rest of the OS. This audio might be post-processed and is usually output on an audio output device, such as headphones or speakers. Let t be the time spent executing the callback for one render quantum, and d the duration of audio that quantum represents once played out.

The load for this render quantum is:

load = t / d
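For example (numbers illustrative): with a render quantum of 128 frames at a 48 kHz sample rate, d ≈ 128 / 48000 ≈ 2.67 ms; if rendering that quantum takes t = 1.3 ms, the load is roughly 1.3 / 2.67 ≈ 0.49.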
In a nominal scenario, the load is below 1.0: it took less time to render the audio than it takes to play it out. In an overload scenario (called an under-run in audio programming jargon), the load can be greater than 1.0. At this point, the user can be expected to hear audio dropouts: discontinuities in the audio output that are very noticeable.

Because the time it takes to render the audio is usually directly controllable by authors (for example, by reducing the quality of the parts of the audio processing graph that are less essential to the application), authors would like to be able to observe this load. A real-life example that could benefit from this new API is the excellent https://learningsynths.ableton.com/: if you open the menu by clicking the icon on the top left (on desktop) and scroll down the panel, you can see that the render quality is controllable. Similarly, it's not uncommon for digital audio workstations and other professional audio software to display a load indicator in their user interface, to warn the user that there's too much processing for the system in its current configuration.

In the Web Audio API spec, this is defined in the section Rendering an audio graph.
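As an illustration of the adaptation pattern above, here is a sketch based on the explainer's proposed AudioRenderCapacity shape; `renderCapacity`, `start()`, the `update` event, and `averageLoad` come from the explainer draft and may change, and `reduceGraphQuality` / `restoreGraphQuality` are hypothetical app-defined functions:

```js
// Sketch only: API names follow the explainer draft and may change.
const audioContext = new AudioContext();
const capacity = audioContext.renderCapacity;

capacity.addEventListener('update', (e) => {
  // e.averageLoad: load averaged over the update interval.
  if (e.averageLoad > 0.8) {
    reduceGraphQuality();   // hypothetical: e.g. fewer voices, cheaper effects
  } else if (e.averageLoad < 0.4) {
    restoreGraphQuality();  // hypothetical
  }
});

// Aggregate and report once per second.
capacity.start({ updateInterval: 1 });
```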
This, Compute Pressure, and the worker QoS proposal all seem to be somewhat connected in terms of serving this kind of compute-time guarantee need (or lack thereof) - would it make sense to distill some common patterns out of this for consistency?
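For reference, the Compute Pressure proposal (as of this writing) exposes coarse CPU-pressure buckets through an observer pattern, roughly like the sketch below; treat the exact names and options as unstable:

```js
// Sketch of the Compute Pressure observer shape (proposal still evolving).
const observer = new PressureObserver((records) => {
  for (const record of records) {
    // record.state is a coarse bucket: "nominal", "fair", "serious", "critical".
    console.log(`${record.source} pressure: ${record.state}`);
  }
});
await observer.observe('cpu', { sampleInterval: 1000 }); // milliseconds
```

A RenderCapacity design aligned with this shape would presumably share the observer/record pattern and the coarse-bucket reporting.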
Sorry for the long delay. We discussed this during our F2F, and having some level of consistency/interoperability between this proposal and Compute Pressure would be a better architectural direction. (Setting aside QoS and how to make that proposal consistent, as it appears to be at a much earlier stage.) Some questions for you:

With all of these questions, we think the use cases are valid, so no questions there.
There is a way to estimate render capacity that works today: capture timestamps before and after processing on the audio thread. This method isn't without challenges: timestamps can only be captured with 1 ms precision (thanks, Spectre), and the audio chunk rate may beat with the 1 kHz timestamp clock. So aggregation/estimation requires a period on the order of 1 second to stabilise, although this could probably be improved with more sophisticated timestamp processing. See it in action: https://bungee.parabolaresearch.com/bungee-web-demo.

There may be a further challenge with adapting processing complexity according to render capacity. Occasionally something (browser or OS) seems to detect a lightly used thread and either move it to an efficiency core or lower its clock. So, paradoxically, faster render code can sometimes result in increased render capacity. This is a "denominator problem" that needs more study.

Simple sample below (simpler averaging than the link above):

```js
class NoiseGenerator extends AudioWorkletProcessor {
  constructor() {
    super();
    // Accumulated milliseconds spent inside process() (active)
    // and between calls to process() (idle).
    this.active = this.idle = 0;
  }
  process(inputs, outputs, parameters) {
    const start = Date.now();
    if (this.idle) {
      // Close the idle interval opened at the end of the previous call,
      // then report the fraction of wall-clock time spent rendering.
      this.idle += start;
      console.log("Render capacity: " +
          100 * this.active / (this.active + this.idle + 1e-10) + "%");
    }
    this.active -= start;
    // Generate some noise.
    for (let channel = 0; channel < outputs[0].length; ++channel)
      for (let i = 0; i < outputs[0][channel].length; ++i)
        outputs[0][channel][i] = Math.random() * 2 - 1;
    const finish = Date.now();
    this.active += finish; // active now includes (finish - start) for this call
    this.idle -= finish;   // open an idle interval until the next call
    return true;
  }
}

registerProcessor('noise-generator', NoiseGenerator);
```
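For completeness, a minimal main-thread sketch for loading and connecting the processor above; the module filename is hypothetical:

```js
const audioContext = new AudioContext();
// 'noise-generator.js' is a hypothetical filename containing the processor above.
await audioContext.audioWorklet.addModule('noise-generator.js');
const node = new AudioWorkletNode(audioContext, 'noise-generator');
node.connect(audioContext.destination);
```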
Hello there! We looked at this today during a breakout. Other TAG members will comment with other components of the review, but we had some questions wrt API design.

We need to better understand how this API fits into the general use cases where it will be used. Currently, the explainer includes a snippet of code showing this in isolation, where it is modifying parameters in the abstract. What is the scope of starting and stopping this kind of monitoring for the use cases listed? Are authors expected to listen continuously, or to sample short periods of time (because monitoring is expensive)? If they are expected to listen continuously, then what is the purpose of the start() and stop() methods?

We were also unsure what the update interval does exactly. Does it essentially throttle the event so you can never get more than one event per that period? Does it set the period over which the load is aggregated? Both? Can you get multiple events within one interval?

Lastly, as a very minor point, …

We were also wondering how this relates to #939?
An addendum on security: we think that the general approach to managing side-channel risk is acceptable. Overall, fewer buckets would be preferable; at most 10, though preferably 5. Though surveys indicate that some sites would be unhappy with fewer than 10 buckets, there is an opportunity to revise the number of buckets over time based on feedback from use. Increasing the number of buckets should be feasible without affecting site compatibility, so starting with a more private default is the conservative option. Increasing resolution carries a small risk in that change events are likely to occur more often (see the API design feedback). More detail is ultimately necessary to understand the design.
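To make the bucket idea concrete, here is a minimal sketch (our illustration, not spec text) of how a user agent might quantize a raw load value into N buckets before exposing it:

```js
// Illustrative only: quantize a raw load in [0, 1] into `buckets` steps
// to reduce side-channel resolution. Not from the spec.
function quantizeLoad(rawLoad, buckets = 5) {
  const clamped = Math.min(Math.max(rawLoad, 0), 1);
  return Math.round(clamped * buckets) / buckets;
}

quantizeLoad(0.73, 5);  // -> 0.8
quantizeLoad(0.73, 10); // -> 0.7
```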
This is somewhat on pause for now at the Audio WG level; implementors aren't exactly sure how to ship this.
I'm requesting a TAG review of the RenderCapacity API.

Generally, the Web Audio renderer's performance is affected by the machine's speed and the computational load of the audio graph. However, the Web Audio API does not expose a way to monitor this computational load, leaving developers with no way to detect glitches, which are an essential part of the UX in audio applications. Providing developers with a "glitch indicator" is becoming more important as audio applications grow larger and more complex. (Developers have been asking for this feature since 2018.)
Further details:
We'd prefer the TAG provide feedback as (please delete all but the desired option):
💬 leave review feedback as a comment in this issue and @-notify @hoch @padenot