API proposal to preemptively determine audio thread overload (Render Capacity) #2444
I think this is really hard to specify in any kind of meaningful way, let alone a meaningful interoperable way. It's really hard to know if something is overloaded until after the fact, which is too late for your use case. This is further complicated by the fact that the audio graph can be quite dynamic. So you might have a nice graph working at 1% CPU. Then you suddenly connect a very complex subgraph that would want to use 200% CPU. Is it ok to signal overloaded after the fact? |
Here's something we can do: I believe the metaphor of a DAW's CPU bar is somewhat inappropriate, because a CPU gauge on a GUI is not an exposed programmable API. What's being asked for here is a JS API that can be abused programmatically for many different reasons, so we should tread carefully. In general, I support this idea if we can pass the security/privacy review. |
Thinking about it some more, I think the following idea could be a (relatively) accurate approximation, although this is an implementation detail. Since the audio rendering thread is a single thread (for the most part), and there are platform-specific APIs to determine which CPU core a given thread is running on, we can compute the average load of this CPU core (e.g. over the last second or so) and determine whether it is above a certain threshold. Say this is done on a one-second poll check by default. |
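A rough sketch of this polling idea, assuming a hypothetical `getAudioThreadCoreLoad()` hook standing in for the platform-specific core-load query (no such web API exists); the threshold and interval are illustrative:

```js
const OVERLOAD_THRESHOLD = 0.9; // treat >90% core load as "overloaded"
const POLL_INTERVAL_MS = 1000;  // one-second poll, as suggested above

function pollOverload(getAudioThreadCoreLoad, onChange) {
  let overloaded = false;
  setInterval(() => {
    // Hypothetical hook: average load (0..1), over the last second, of the
    // CPU core currently running the audio rendering thread.
    const load = getAudioThreadCoreLoad();
    const nowOverloaded = load > OVERLOAD_THRESHOLD;
    if (nowOverloaded !== overloaded) {
      overloaded = nowOverloaded;
      onChange(overloaded); // e.g. dispatch an "overloadchange"-style event
    }
  }, POLL_INTERVAL_MS);
}
```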
Interesting insights. This is primarily the reason I proposed a simple true/false overload detect (instead of a load percentage), which should provide a far smaller surface for fingerprinting, especially if user agents add a bit of random jitter. |
Additionally, this API -- by definition -- cannot always hard-protect against audio thread overloading and glitching; there are way too many factors for that (such as a heavy background process starting on the same machine, bad app response to audio thread overloading, or even the immediate addition of a large sub-graph as you mentioned). This API would merely help avoid glitching in many (perhaps most) common cases, and help apps provide measures to help the user actively avoid glitching. For this reason, I agree the API naming should be considered carefully. Perhaps it would be better to call it:

```js
audioContext.isOverloading(); // 'true' or 'false'

audioContext.addEventListener("overloadchange", function (e) {
  e.isOverloading; // 'true' or 'false'
});
```
I think a binary isOverloaded will almost never work, because the overload point won't match the app developer's expectations. A coarse CPU percentage like @hoch proposes at least allows the developer to decide what overload means. Which is not to say I support this idea of overload :-) I'd much rather not do this and let each browser integrate Web Audio into the browser's developer tools. Then we can provide much more information, including things like overload, but also how much CPU each node in the graph is using, or mark out the heaviest subgraph, or whatever else might be useful for analyzing your application. And no spec change is required, which means these can be iterated much faster. |
How would this solve the problem of audio thread overload detection on the end-user's machine? The threshold of hitting max load depends greatly on device capabilities and "ambient" CPU load (e.g. background tasks), and is practically impossible to know or even approximate in advance (i.e. a stronger machine will be able to handle more content than what is determined on a dev machine, while a weaker one will handle less). This is not a question of optimizing the audio graph at development time (that's pretty feasible to do already), but a question of how much audio processing the user can do before hitting max load, on the user's machine. |
Yeah, it doesn't really help. It helps you as a developer to know what's using up CPU so you can make intelligent trade-offs, but it doesn't help you know if a particular machine is having problems. I just don't know what that metric would be, or how to make it meaningful across all the different situations that cause audio glitches. |
I agree that it would be really valuable to have some kind of API to expose this information, whether it's @hoch's coarse CPU use or @JohnWeisz's callback. After-the-fact reporting could even be fine, I think, for our use cases. Better integration with dev tools would also be really great, but an API would additionally allow dynamically reconfiguring/modifying the graph as we approach the resource limits on a given client. |
To clarify, our use case would be simply displaying a warning notification about the user approaching maximum audio thread capacity -- as mentioned before, it's a DAW with no artificial limitation on the amount of content, and it would be great if the user were given an indication of audio overload other than already-glitching audio. This, in essence, is similar to the CPU-use meter often present in comparable desktop software, but for the audio thread only. |
Here is an alternative that you can implement now, without waiting for the spec to get updated: create an offline context with an appropriately heavy graph, and time how long it takes to render. There's your CPU usage meter. You could probably do this for each DAW component to get an idea of the expense of adding the component. You won't get the feedback that a built-in CPU meter would give, but it's perhaps good enough. |
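A minimal sketch of this offline-render benchmark; the oscillator bank is just a placeholder for "an appropriately heavy graph":

```js
// Render a deliberately heavy graph offline and time it. The rendered
// duration divided by the wall-clock time gives a crude headroom factor:
// e.g. 10 means the graph renders ~10x faster than real time.
async function benchmarkAudioHeadroom() {
  const seconds = 1;
  const ctx = new OfflineAudioContext(2, 44100 * seconds, 44100);
  const gain = new GainNode(ctx, { gain: 0 }); // output doesn't matter, only cost
  gain.connect(ctx.destination);
  for (let i = 0; i < 100; i++) { // placeholder "heavy" graph
    const osc = new OscillatorNode(ctx);
    osc.connect(gain);
    osc.start();
  }
  const t0 = performance.now();
  await ctx.startRendering();
  const elapsedSeconds = (performance.now() - t0) / 1000;
  return seconds / elapsedSeconds;
}

benchmarkAudioHeadroom().then((h) => console.log('headroom:', h.toFixed(1)));
```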
One thing to bear in mind with such an API is that it is generally the single render quantum that takes the most CPU that is most likely to cause a break in the output. I'd imagine an API similar to the one proposed by hoch in https://github.com/WebAudio/web-audio-api/issues/1595#issuecomment-386123070, but reporting a maximum ratio over some time period. |
The problem with this approach is that it lags behind significantly, and rapidly allocating OfflineAudioContexts has a significant performance overhead. Worse, it will run on a different thread than the real-time audio thread (i.e. AudioContext), and will merely give a generic hint at what the CPU usage can be on some core (which may or may not be the same as the one used by AudioContext). Currently, the easiest but still relatively accurate approach is monitoring the load of each CPU core (this requires above-web-standard privileges, such as Electron), and considering the highest load as the CPU load. This often results in false positives, but if the audio thread is indeed approaching overload, it will detect it. |
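For reference, a sketch of that Electron-side approach, using Node's os.cpus() (available in Electron's main process, not on the open web) and taking the busiest core as the load estimate:

```js
const os = require('os');

// Snapshot cumulative per-core CPU times.
function snapshot() {
  return os.cpus().map(({ times }) => ({
    idle: times.idle,
    total: times.user + times.nice + times.sys + times.idle + times.irq,
  }));
}

// Highest per-core load (0..1) between two snapshots.
function maxCoreLoad(prev, next) {
  let max = 0;
  for (let i = 0; i < prev.length; i++) {
    const idle = next[i].idle - prev[i].idle;
    const total = next[i].total - prev[i].total;
    if (total > 0) max = Math.max(max, 1 - idle / total);
  }
  return max;
}

let prev = snapshot();
setInterval(() => {
  const next = snapshot();
  console.log('max core load:', maxCoreLoad(prev, next).toFixed(2));
  prev = next;
}, 1000);
```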
@JohnWeisz what is the priority for your application, live or rendering? If it's rendering, we could ask for an audioContext method to modify the sampleRate live, with no need to have a measure. And maybe this can be done. |
Do you mean changing the sample rate of the context after it's been created? Switching the sample rate can cause glitches from the switch, and you still have to resample the audio if the hardware doesn't support the new rate. The resampling introduces additional delay and CPU usage. That might not be what the user wants, and it's kind of hard to know what the additional delay and CPU would be. |
@rtoy, ah... and is there no way to reduce the context quality after it's created? I was thinking that reducing the sampleRate was something like reducing the resolution of a video. |
Currently no, there's no way. And yes, reducing the sample rate is kind of like reducing the video resolution. But we still have to resample to the HW rate, and it's not clear what the right tradeoff should be. High quality output (with high CPU usage and possibly significant additional latency)? Low quality output (with lower CPU usage and possibly less additional latency)? Something in between? It's hard to know what the user wants when just switching sample rates. This is one reason Chrome has not yet implemented the AudioContext sampleRate option; we don't know what the right tradeoff should be, but Chrome generally prefers high quality when resampling is needed. |
@mr21 It's both live and rendering (you can play your project in real time, or render to an audio file), but AFAIK, rendering cannot glitch ever, as it will simply render slower if available CPU power is insufficient. This is only applicable to live playback, where there is only a limited time-frame to process a render quantum. |
@JohnWeisz, yes, that's what I thought. |
In case the binary isOverloading is too coarse, a simple three-state overload value could work:

```js
audioContext.getOverloadState(); // "none" | "moderate" | "critical"

audioContext.addEventListener("overloadchange", (e) => {
  e.overloadState; // "none" | "moderate" | "critical"
});
```

Where the states correspond to increasing load thresholds. While the exact numbers can vary from device to device, this is something we found considerably accurate for detecting overload in advance (in our Electron-based distribution, which can query CPU load accurately). |
I like @hoch's idea: just return a CPU percentage, rounded to, say, the nearest 10% or 20%, or something coarse but not too coarse. Then we don't have to make random assumptions about the meaning of "none", "moderate", or "critical". |
So now developers need to handle unpredictable differences between devices/browsers. The sniffing part will be really ugly. I did not use the term "CPU percentage" for this specific reason. As I suggested above, the ratio between the render timing budget and the time spent on rendering is quite clear, and there's no room for creative interpretation. IMO the sensible form for this attribute is:

```webidl
enum RenderPerformanceCategory {
  "none",
  "moderate",
  "critical"
};

partial interface AudioContext : BaseAudioContext {
  // renderPerformance = timeSpentOnRenderQuantum / callbackInterval
  readonly attribute RenderPerformanceCategory renderPerformance;
};
```

Concerns: This technically gives you an estimate of the performance of the real-time priority thread provided by the OS. That is why I believe the more coarse the better. Even with 3 categories, you can infer 3 classes of CPU power across all of your visitors. This is why the web platform does not have any API like this.

```js
const context = new AudioContext();
const mutedGain = new GainNode(context, { gain: 0.0 });
mutedGain.connect(context.destination);

let oscCounter = 0;
function harassAndCheck() {
  const osc = new OscillatorNode(context);
  osc.connect(mutedGain);
  osc.start();
  oscCounter++;
  if (context.renderPerformance !== 'critical') {
    setTimeout(harassAndCheck, 10); // keep piling on oscillators
  } else {
    console.log('I can run ' + oscCounter + ' oscillators!');
  }
}
harassAndCheck();
```

If you want a more fine-grained estimation, gain nodes can be exploited instead of oscillators. If you want to detect it fast, go with convolvers. So the coarseness of categories can be extrapolated by using various features in the Web Audio API. I am not trying to discourage the effort or anything here, but simply want to check all the boxes until we reach the conclusion. Personally I want to have this feature in the API. |
I believe the main question here is whether the snippet you demonstrated can expose more information than what is already available. Unless I'm mistaken, AudioWorklet code runs on the real-time audio rendering thread. So again, unless I'm mistaken, running a benchmark inside the audio worklet processor should give significantly more accuracy than what you could ever get with the proposed renderPerformance attribute. |
Can you elaborate? I am curious about what can be done with the AudioWorklet. If that's possible, we can actually build a prototype with the AW first. On the other hand, the AW might have some privacy concerns we have not explored.
This is not true for Chrome; the AudioWorklet requires the JS engine on the rendering thread, so it actually degrades the thread priority when it is activated by loading a worklet module. |
ahhh... indeed, this could increase our fingerprint on the web by measuring the audio card's performance :/ |
@mr21 |
AudioWorklets should not run on a real-time priority thread. Chrome's worklet definitely doesn't. |
I don't quite expect AudioWorklet to be that useful for determining CPU load in practice (perhaps the best we can do is measure how much looping is needed before glitching out), but I was under the impression it was useful to benchmark the hardware itself (in this case, the core running the real-time audio thread).
If I understand right, this seems to be relevant (469639 referenced within is non-public, so I can only guess). Then I was indeed mistaken (but ouch, this also means an audio worklet processor has a permanent potential stability impact just by being loaded?). In this case, discard my previous assumptions: AudioWorklet is clearly not useful for benchmarking a real-time priority thread (according to the linked issue, only a "display-priority" one), and might not be considerably better than a simple scripted benchmark on the main thread. I should've studied the implementation more in depth. |
I thought the game audio engine would want to monitor the current render capacity, so it can dynamically control the application load. Still, thanks for the feedback! |
We definitely do, but since we use a WebWorker/SAB/AudioWorklet architecture, we immediately detect underruns at the start of the AudioWorklet callback simply by inspecting the number of audio frames produced by the WebWorker since the last callback. No need for an event-based approach in this case. |
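A sketch of that kind of underrun check; the SharedArrayBuffer layout and names here are illustrative, not taken from the actual application:

```js
// In the AudioWorkletProcessor: the Web Worker producer advances a frame
// counter in a SharedArrayBuffer; if fewer than one render quantum's worth
// of frames arrived since the last callback, we just hit an underrun.
class UnderrunDetectingProcessor extends AudioWorkletProcessor {
  constructor(options) {
    super();
    // Index 0: total frames produced by the worker so far (hypothetical layout).
    this.produced = new Int32Array(options.processorOptions.counterSab);
    this.consumed = 0;
  }
  process(inputs, outputs) {
    const available = Atomics.load(this.produced, 0) - this.consumed;
    if (available < 128) {
      this.port.postMessage({ underrun: true, available });
    }
    this.consumed += 128; // one render quantum
    // ... copy frames from the shared ring buffer into `outputs` here ...
    return true;
  }
}
registerProcessor('underrun-detecting-processor', UnderrunDetectingProcessor);
```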
Ah, I see. That's actually a good point. I understand that you can implement the glitch detection within AudioWorkletProcessor - but would it be useful to have a built-in detection/monitoring feature on AWGS? |
@hoch perhaps you can clarify a detail here. Given the proposed underrunRatio: what is the rounding strategy?
Consider the default parameters; as you outline, that'd be 375 render quanta per second. If there is just a single underrun, the ratio would be 1/375. What would the resulting reported ratio float be? I would assume 0, i.e. that single underrun would go undetected. Assuming small ratios get rounded to zero, the counter to that would be to calculate the update interval from the sample rate (and, soon, the render quantum size). As an alternative to this, why not just report the total number of render quanta with underruns, as well as the total number of render quanta for the given interval? |
Good point! We need to develop a rounding strategy to avoid the situation. A clear distinction between zero and non-zero would solve this issue. For example, 0 would be 0, but 1/375 would be 0.01. Some details need to be fleshed out, so thanks for raising this question!
The bucketed approach helps us lower the privacy entropy, and reporting the exact number of underruns is something we want to avoid. My goal is to keep the privacy entropy as low as possible while keeping the value useful to developers.
Unfortunately, that is sort of already exposed via AudioContext.baseLatency. That number is very platform-specific, so it adds more bits of fingerprinting information. I don't think we want to duplicate the info through this new feature. |
This looks pretty good to me! Like @ulph, I was also wondering about the way the reported ratio is rounded. This makes me wonder if there's simply a better term than "ratio" for a value quantized this way. A few nitpicks if this exact language ends up being used: |
Fixed nits. Thanks!
underrunRatio (where N is the number of render quanta per interval period and u is the number of underruns per interval period) is 0 when u = 0, and otherwise u/N rounded up to the nearest hundredth, so any non-zero underrun count reports at least 0.01.
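In code, that rounding would presumably look like this (a sketch; the function name is illustrative):

```js
function roundUnderrunRatio(u, N) {
  if (u === 0) return 0;                  // exact zero stays zero
  return Math.ceil((u / N) * 100) / 100;  // non-zero rounds up to a hundredth
}

roundUnderrunRatio(0, 375);  // 0
roundUnderrunRatio(1, 375);  // 0.01 -- a single underrun is never masked
roundUnderrunRatio(40, 375); // 0.11
```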
I will think about whether we need the same treatment for the average and max capacity; feel free to chime in if you have any suggestions. |
@hoch This looks great! Just out of curiosity, is the lower bound on the updateInterval 1s, or can it be set even lower? |
@andrewmacp 1 is a placeholder value, but I think it's a reasonable default. How much lower do we want to go, and why? Another example nearby would be the Compute Pressure API, which also has a rate-limited event. |
@hoch I was mostly just curious here. I think the underrunRatio should give us enough of what we need, but I was wondering if we could simply set the updateInterval to a value that would result in <=100 render quanta per update, in order to back out a specific number of underruns. But then that would defeat the purpose of using a ratio to begin with, so I was wondering whether the plan was to limit the updateInterval to a 1s minimum like in the Compute Pressure API. |
@hoch noted, your suggested rounding scheme would work.
doesn't hurt to be extra clear, I suppose - but the "ceil to hundredth" would cover the 0 < x < 0.01 case as well?
I don't think the rounding is as problematic for the capacity ones - the issue with the underrun ratio was that (repeated) single underruns can be quite detrimental, esp. considering the case of occasional single glitches occurring every few seconds. Some ideas for capacity though: does it make sense to throw some more statistical numbers in there, like min and stdev? That would give some more indication of the distribution of the capacity measurements during the measurement window. (Personally I would have pondered a histogram, but I suspect that's not everyone's cup of tea.) |
Re: @andrewmacp
I believe 1s is a sensible default. I don't have a strong opinion on its lower boundary as long as it's reasonably coarse (e.g. 0.5 seconds or higher). This is up for discussion. Re: @ulph
Thanks! That's better.
IMO those (min, stddev, histogram) are in the scope of "good-to-have". I suggest we deliver the first cut with the most essential data, and extend it later when it's absolutely needed. |
The actual problem shouldn't be CPU overload as such; it should be processing not completing within the cycle time - no matter what the reason is, such overruns are a problem. |
Yes. This is basically how the "averageCapacity" is calculated. This capacity might be related to the CPU usage to a certain degree, but they clearly have different meanings. |
This is something we can already do right now, using audio worklets. The problem is that you only get to know when you're already overloading - there's no way to see how much headroom you still have before overloading would occur. For us this is critical; we want to get a solid understanding of how much headroom for additional processing we have on the client's machine. I suppose the better way is to measure the raw execution time for the rendering of one audio block. Since the sample rate and block size are known, this should give a direct "load percentage". |
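A sketch of that measurement inside an AudioWorkletProcessor; Date.now() is used because high-resolution timers are generally unavailable in the worklet scope, so the coarse estimate is only meaningful averaged over many quanta:

```js
class LoadMeterProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.busyMs = 0;
    this.quanta = 0;
  }
  process(inputs, outputs) {
    const start = Date.now();
    // ... the actual rendering work happens here ...
    this.busyMs += Date.now() - start;
    if (++this.quanta === 375) { // exactly 1 s worth of 128-frame quanta at 48 kHz
      const budgetMs = this.quanta * (128 / sampleRate) * 1000;
      this.port.postMessage({ load: this.busyMs / budgetMs }); // ~1.0 = no headroom
      this.busyMs = 0;
      this.quanta = 0;
    }
    return true;
  }
}
registerProcessor('load-meter-processor', LoadMeterProcessor);
```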
Just a note here that, in addition to @hoch's proposal here, #2413 can be of interest to you if you want to record time yourself. The current status is that it's going to be added (see my last message there). The feature in this issue is still nice for a more global load metric including native nodes, which can be significant in terms of load (HRTF panning, Convolver, but also an accumulation of cheaper nodes for a complex processing graph, etc.). During development (meaning it doesn't eliminate the need for the two things just mentioned), both Firefox and Chrome have profiling capabilities that allow drilling down and getting the execution time of each node: see https://web.dev/profiling-web-audio-apps-in-chrome/ and https://blog.paul.cx/post/profiling-firefox-real-time-media-workloads/ for Chrome and Firefox, respectively. |
The Web Audio API defines audio production apps - wave editors, digital audio workstations, and the like - as a supported use case. And it's indeed quite adequate at the task!
One common trait of these applications is the capability of accepting virtually unlimited user content, which will essentially always result in hitting the limit of audio processing capabilities on a given machine at some point - i.e. there is so much audio content that the audio thread simply cannot keep up with processing everything and becomes overloaded, no matter how well it is optimized.
I believe this is widely and well understood in the audio production industry, and usually the solution to help the user avoid overloading is displaying a warning indicator of some kind, letting the user know the audio processing thread is about to be overloaded and audio glitching will occur (so the user knows they should go easy on adding more content).
Note: in native apps (mostly C++ based), this is most commonly implemented as a CPU load meter (for the audio thread's core), which you can keep your eye on to know how far you are from the limit, roughly.
Currently, the Web Audio API does not expose a comparable API to facilitate monitoring audio processing load, or overload.
It's possible to figure it out, mainly in special cases with above-web-standard privileges (such as an Electron app). However, this is quite difficult to get right (even from native C++ side) without implementations taking a spec-defined standard into consideration.
I'd like to propose a small set of light, straightforward, low-privacy-implication API additions to enable this:
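Presumably something along these lines (a sketch reconstructed from the isOverloading() method and overloadchange event discussed in the comments above; the exact names in the original post may have differed):

```js
audioContext.isOverloading(); // true or false

audioContext.addEventListener("overloadchange", (e) => {
  e.isOverloading; // true or false
});
```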
For obvious reasons, this is an extension to the AudioContext interface, not BaseAudioContext, as overload detection is not applicable to OfflineAudioContext processing. Having a dedicated event for the same purpose avoids the need for a polling check.
It is up to implementations to decide how exactly to determine whether the audio thread is considered overloaded, optionally taking the AudioContext latencyHint setting into consideration.
This would enable Web Audio API-based apps to let the user know about high audio thread load, and to display possible options or hints at steps the user can take to avoid audio glitching.
Privacy implications
Exposing this information should have little to no privacy implications, as (1) it is rarely clear why exactly the audio thread is overloaded (it could be due to low device capabilities, or high CPU use by other processes), and (2) it does not provide a more accurate way to determine device capabilities than what is already possible with a simple scripted benchmark.