Move audio mixing out of AudioStreamPlayer nodes #2299

Closed
ellenhp opened this issue Feb 16, 2021 · 4 comments
ellenhp commented Feb 16, 2021

This is one big proposal because the proposed features are tightly coupled. Also, if it's accepted I'm willing to implement it myself, so it shouldn't take resources away from other efforts beyond review and discussion.

Describe the project you are working on

Over the past year or so: XR games, audio games, and generally just fixing audio bugs in Godot

Describe the problem or limitation you are having in your project

  1. Audio players (2D and 3D) pop if they're still playing when queue_free() is called. static is emitted when freeing an AudioStreamPlayer or resetting its stream property godot#25087 This can technically be worked around in a script: add a dummy node to the tree, reparent any audio players you wish to remove to that node (preserving their global transform), stop them, then queue them to be freed. While it can be mitigated, pop-free removal should be the default behavior; I don't think anyone who queue_frees an audio player wants a pop. Related: Audible click/pop when audio bus changes godot#21312

  2. Resampling quality in Godot varies from mediocre to poor. WAVs use linear interpolation, while other streams use cubic interpolation. Bitrate-like artifact when playing low frequency sounds godot#23544 (and duplicates)
    This is mitigated for Oggs and MP3s, at the cost of performance, by Implement a new resampling algorithm in AudioStreamPlaybackResampled godot#46086. A change like this would probably not be appropriate for WAVs, though, since a huge number of WAVs could be playing at once (resampling is done on a per-player basis and must therefore be extremely performant).

  3. There have been many bugs attributed to data races in the audio system, including two that are currently open. Some people have also reported NaNs in the audio buffer in one of them. Audio clipping / static / interference on rapid intervals of sound (fixed in master) godot#22016 Audio stops working after scene reload. godot#28110
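For context on point 2, the quality gap comes down to the interpolation kernel. The sketch below shows the two kernels involved: 2-point linear (what WAVs get today) and a 4-point Catmull-Rom cubic, in the spirit of what other streams get. This is illustrative only, not Godot's actual resampler code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// 2-point linear interpolation between neighbouring samples, as WAVs use today.
// `pos` is a fractional sample position into `s`.
static float linear_sample(const std::vector<float> &s, float pos) {
    int i = (int)pos;
    float t = pos - (float)i;
    return s[i] + t * (s[i + 1] - s[i]);
}

// 4-point Catmull-Rom cubic interpolation. Needs one sample of context on
// each side of the interval, so `pos` must satisfy 1 <= pos < s.size() - 2.
static float cubic_sample(const std::vector<float> &s, float pos) {
    int i = (int)pos;
    float t = pos - (float)i;
    float p0 = s[i - 1], p1 = s[i], p2 = s[i + 1], p3 = s[i + 2];
    return p1 + 0.5f * t * (p2 - p0 +
           t * (2.0f * p0 - 5.0f * p1 + 4.0f * p2 - p3 +
           t * (3.0f * (p1 - p2) + p3 - p0)));
}
```

The linear kernel only ever sees two samples, which is what produces the "bitrate-like" staircase artifacts on low-frequency content; the cubic kernel fits a smooth curve through four.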

Describe the feature / enhancement and how it helps to overcome the problem or limitation

  1. Stereo panning code (including SPCAP) will be moved into the AudioServer or a helper class. Instead of mixing audio directly into the AudioServer's buffers, AudioStreamPlayer* nodes will mix into an intermediate buffer provided by the AudioServer. The AudioServer will then hand that buffer to its chosen panning method (SPCAP for now) to mix it into the audio bus's buffer. Decoupling the mixing logic from the node playing the audio this way makes fadeouts easy to accomplish even when the playing node is queue_freed.

  2. Resampling will be batched. Currently, a project playing 5 WAV files with a 48kHz sample rate on a system with a 44.1kHz audio driver performs a linear resampling step 5 times, once for each WAV audio stream. That cost scales with the number of sources playing, which is probably why it uses linear interpolation rather than something with better quality. If AudioStreamPlayer nodes mix into an intermediate buffer provided by the AudioServer, there's no need to resample 5 times: the AudioServer could batch all 5 players operating at 48kHz into one intermediate buffer, resample to 44.1kHz once, and mix the resampled buffer into the audio bus's buffer. This lets Godot default to a higher-quality resampling method and moves all resampling logic out of the AudioStream classes, where it is currently duplicated in several places. I glossed over some details here, especially around per-node pitch_scale, but none of that is unsolvable; it would just force the AudioServer to deal with a pitch-scaled node separately.

  3. Since most of the brains of the audio mixing will be moved out of the stream player nodes and into the AudioServer (or some other class directly controlled by the AudioServer), this gives us a unique opportunity to ensure that the data the nodes pass in about how to play back their audio is handled in a lock-free, thread-safe way.
    I talked with @fire about this and he suggested creating a resource that contains all the information the mixing system needs about each individual player node (its transform, audio volume, emission angle, doppler data, etc.). I'm calling this a "PlaybackParams" resource in my head. I looked into this a bit, and I think each node could create a new instance of a PlaybackParams resource with up-to-date data every frame and atomically swap in its RID, so that the audio thread never has to wait for a copy to finish. Partitioning the data this way completely prevents data races.
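The atomic-swap idea in point 3 can be sketched as follows. All names here (the PlaybackParams fields, ParamSlot) are hypothetical, and a raw pointer stands in for the RID; the real design would hand the old RID back to the server for deferred cleanup rather than leaking or deleting it on the audio thread:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical snapshot of everything the mixer needs from one player node.
struct PlaybackParams {
    float volume_linear;
    float pan; // -1.0 (left) .. 1.0 (right)
};

// The game thread publishes a fresh immutable snapshot each frame; the audio
// thread only ever loads the current pointer, so it never blocks on a copy.
class ParamSlot {
    std::atomic<const PlaybackParams *> current{nullptr};

public:
    // Game thread: swap in new params, returning the previous snapshot so the
    // caller can free it once the audio thread can no longer be reading it.
    const PlaybackParams *publish(const PlaybackParams *p) {
        return current.exchange(p, std::memory_order_acq_rel);
    }

    // Audio thread: grab the latest snapshot without locking.
    const PlaybackParams *acquire() const {
        return current.load(std::memory_order_acquire);
    }
};
```

Because each frame's snapshot is immutable once published, the audio thread never observes a half-written update; the only shared mutable state is the single atomic pointer.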

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

I don't think a mockup or diagram would be helpful for this proposal, but if the details above aren't sufficient to satisfy this section, let me know and I'll build a proof-of-concept PR. I did write some sloppy pseudocode showing how batching could occur. It has plenty of problems: no consideration of multiple audio buses, no resampler state kept between mixes, no fadeout/fadein, and no surround sound. But it shows the basic algorithm: group streams by sample rate, then group again by the way the sound should be decoded from 2D/3D space to stereo.

// AudioServer.h
map<AudioStreamPlayback, RID> playing_streams_and_playback_params;
PanningMethod panning_method;
AudioFrame intermediate_buffer_main[buffer_size];
AudioFrame intermediate_buffer_secondary[buffer_size];

// AudioServer.cpp
void mix() {
    // First, group the playing streams by their effective sample rate.
    map<uint32_t, vector<AudioStreamPlayback>> streams_by_sample_rate;
    for stream in playing_streams_and_playback_params.keys() {
        streams_by_sample_rate[stream.get_effective_sample_rate()].push_back(stream);
    }
    for sample_rate in streams_by_sample_rate.keys() {
        memset(intermediate_buffer_main, 0, sizeof(intermediate_buffer_main));
        // Within one sample rate, group again by panning parameters.
        map<PanningParams, vector<AudioStreamPlayback>> streams_of_current_sample_rate_by_playback_params;
        for stream in streams_by_sample_rate[sample_rate] {
            streams_of_current_sample_rate_by_playback_params[stream.get_playback_params()].push_back(stream);
        }
        for panning_params in streams_of_current_sample_rate_by_playback_params.keys() {
            memset(intermediate_buffer_secondary, 0, sizeof(intermediate_buffer_secondary));
            for stream in streams_of_current_sample_rate_by_playback_params[panning_params] {
                // Care should be taken to ensure that Godot doesn't overrun the
                // intermediate buffer by mixing too many samples, because that
                // would likely cause an exploitable heap corruption.
                stream.mix_without_resampling(intermediate_buffer_secondary, delta_t);
            }
            // Pan this group's intermediate_buffer_secondary into the main intermediate buffer.
            panning_method.mix_to_stereo(panning_params, intermediate_buffer_secondary, intermediate_buffer_main);
        }
        resample_and_add_to_bus_buffer(intermediate_buffer_main, audio_bus_buffer);
    }
}
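The resample_and_add_to_bus_buffer step at the end could look something like the sketch below. It uses linear interpolation purely for brevity (the whole point of batching is that a better kernel becomes affordable), works on a mono float buffer rather than AudioFrames, and keeps no fractional-phase state between mix blocks. The function and parameter names are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Resample a mono intermediate buffer from src_rate to dst_rate and
// accumulate (+=) the result into the bus buffer. A real implementation
// would carry the fractional read position across calls so consecutive
// mix blocks line up seamlessly.
static void resample_and_add(const std::vector<float> &src, int src_rate,
                             std::vector<float> &bus, int dst_rate) {
    double step = (double)src_rate / (double)dst_rate;
    double pos = 0.0;
    for (size_t i = 0; i < bus.size(); i++) {
        size_t j = (size_t)pos;
        if (j + 1 >= src.size()) {
            break; // Don't read past the intermediate buffer.
        }
        double t = pos - (double)j;
        bus[i] += (float)((1.0 - t) * src[j] + t * src[j + 1]);
        pos += step;
    }
}
```

Since this runs once per (sample rate, bus) pair instead of once per playing stream, its cost no longer scales with the number of sources.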

If this enhancement will not be used often, can it be worked around with a few lines of script?

Issue 1 mentioned above could be mitigated with script, but issue 2 can't be solved without taking a serious performance hit.

Is there a reason why this should be core and not an add-on in the asset library?

The audio system is currently very difficult or impossible to extend without changes in core.

Calinou commented Feb 16, 2021

See also #2298.

Calinou changed the title from "Move audio mixing out of AudioStreamPlayer nodes and overhaul audio resampling." to "Move audio mixing out of AudioStreamPlayer nodes and overhaul audio resampling" Feb 17, 2021
reduz commented Feb 17, 2021

I have been thinking about it; it makes sense on many points. Here is my feedback:

  • Stereo and positional audio I would leave as-is; if the only place that uses it is AudioStreamPlayer3D, then it makes no sense to move it out of there.

  • Resampling I would leave as-is; the way it works now won't be too expensive, and it can discern between 1) streams that generate audio at the right output frequency, 2) streams that resample via a ring buffer (Ogg/MP3/etc.), and 3) streams that resample on the fly by interpolating directly from the source (WAV), as each serves its purpose.

  • Registering streams in AudioServer and asking them to mix in a buffer does make a lot of sense, since the following situations could be solved:

    • Properly moving from one bus (or multiple buses) to another, doing the right fades.
    • When streams are un-registered or stopped, the AudioServer can ask them (outside of the audio thread) to prepare an extra mix block, which will be used to fade out at the next mix callback, and hence avoid clicks/pops.

    The new API to register streams would also need to be told which buses to send to, and at what amount.
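The click-free stop described in the second bullet amounts to applying a linear fade-to-zero ramp over one extra pre-mixed block. A minimal sketch of that ramp, with a hypothetical helper name (not Godot's actual API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Apply a linear fade-out over one mono mix block: gain ramps from 1.0 at
// the first sample down to exactly 0.0 at the last, so the waveform ends at
// silence instead of being cut off mid-cycle (which is what causes the pop).
static void apply_fade_out(std::vector<float> &block) {
    const size_t n = block.size();
    if (n == 0) {
        return;
    }
    if (n == 1) {
        block[0] = 0.0f;
        return;
    }
    for (size_t i = 0; i < n; i++) {
        float gain = 1.0f - (float)i / (float)(n - 1);
        block[i] *= gain;
    }
}
```

The key scheduling detail from the comment above is that this extra block is mixed *outside* the audio thread when the stream is stopped or un-registered, so the audio thread's next mix callback only has to play it back.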

ellenhp commented Feb 17, 2021

I can work with this! Makes sense. I'm glad you like the idea of the audio server taking a more active role in mixing and doing fades itself since I think that's a good architectural change. Keeping resampling as-is makes sense as long as there are no performance concerns for switching to cubic or better everywhere. I'll hack away at this and see what I come up with. Exciting times!

ellenhp commented Aug 27, 2021

This proposal has been implemented.

@ellenhp ellenhp closed this as completed Aug 27, 2021
@Calinou Calinou added this to the 4.0 milestone Aug 27, 2021