Support for chunked analysis #42

Merged: 2 commits into sdroege:master on Mar 15, 2021

Conversation

@rawler (Contributor) commented on Mar 8, 2021

This PR enables an application to do "chunked" analysis of long streams. When analyzing multiple hours of audio, splitting the work into chunks allows multiple cores to be used, accelerating both the decoding and the analysis itself.

This used to take a reference to a `Samples`-implementing structure. That
structure typically contains references itself, so a `&impl Samples` is
double indirection anyway.
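
Purely to illustrate the indirection argument in this commit message (the trait body and function names below are made up for the example, not crate code):

```rust
// Hypothetical illustration of the change described above. `Samples` is the
// crate-internal trait the commit message refers to; its body and the
// `process_*` functions are invented for this sketch.
trait Samples {
    fn frames(&self) -> usize;
}

// Before: `&impl Samples` is a reference to a type that itself typically
// only wraps borrowed sample data, i.e. double indirection.
fn process_by_ref(src: &impl Samples) -> usize {
    src.frames()
}

// After: take the cheap `Samples` value directly.
fn process_by_value(src: impl Samples) -> usize {
    src.frames()
}
```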
(Review comments on src/ebur128.rs, resolved.)
@rawler (Contributor, Author) commented on Mar 10, 2021

Sorry for missing `loudness_global_multiple` in the first iteration of this. It turns out that all that's required for bit-exact parallel analysis is the seeding, to solve initialization of the filter states.

@rawler (Contributor, Author) commented on Mar 10, 2021

If we are willing to add a (feature-gated) dependency on rayon, I could perhaps add a "ParallelAnalyzer" implementation that expects a stream of `impl Samples` and, through rayon, automatically chunks and analyzes the stream in parallel. I'm going to write such an implementation for our application anyway. Let me know if that's desirable.
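
As a purely illustrative sketch (nothing like this exists in the crate, and it is not rawler's actual code), such a feature-gated helper could be shaped roughly as below. `EbuR128::new` and `add_frames_f32` are existing crate API; `ParallelAnalyzer`, its fields, and the chunking policy are assumptions:

```rust
// Illustrative only: no `ParallelAnalyzer` exists in this crate.
use ebur128::{EbuR128, Error, Mode};

#[cfg(feature = "rayon")]
pub struct ParallelAnalyzer {
    channels: u32,
    rate: u32,
    mode: Mode,
    /// Number of interleaved frames handed to each worker.
    chunk_frames: usize,
}

#[cfg(feature = "rayon")]
impl ParallelAnalyzer {
    /// Splits an interleaved buffer into chunks, analyzes each chunk on
    /// rayon's thread pool, and returns the per-chunk states for the caller
    /// to combine (e.g. via `loudness_global_multiple`).
    pub fn analyze(&self, interleaved: &[f32]) -> Result<Vec<EbuR128>, Error> {
        use rayon::prelude::*;

        let chunk_len = self.chunk_frames * self.channels as usize;
        interleaved
            .par_chunks(chunk_len)
            .map(|chunk| -> Result<EbuR128, Error> {
                let mut state = EbuR128::new(self.channels, self.rate, self.mode)?;
                state.add_frames_f32(chunk)?;
                Ok(state)
            })
            .collect()
        // Note: this naive split skips the filter seeding discussed later in
        // the thread, so it is not bit-exact with a single-pass analysis.
    }
}
```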

@sdroege (Owner) commented on Mar 10, 2021

That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

(Review comments on src/ebur128.rs, src/filter.rs and src/true_peak.rs, all resolved.)
@sdroege (Owner) commented on Mar 14, 2021

Thanks, looks mostly good to me :)

> That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

I would still be interested in this though!

@rawler (Contributor, Author) commented on Mar 14, 2021

> That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

Sorry, I spent some days over the last week building this, only to realize that decoding the audio itself is ~40% of the CPU time spent in our application. So analyzing chunks in parallel didn't really improve performance more than simply decoding in a separate thread does. If we want to chunk, we're going to have to create separate decoders + analyzers for each chunk, and that's not something I can see how to build in a generic way.

FWIW, though: what I built and then dropped was roughly `EbuR128::analyze_stream<E>(self, samples: impl Iterator<Item = Result<impl AsRef<[f32]>, E>>) -> Result<AnalysisResult, E>`. I could clean up and bring the implementation here if you want to write tests and maintain it, but my guess is that others will hit the same "decoding" bottleneck that I did, and this would bring more pain than gain.
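
For illustration only, the dropped signature above could be written out roughly as the free function below; it never landed in the crate, and the hypothetical `AnalysisResult` return type is simplified here to just hand back the analyzer state:

```rust
// Illustrative only: rawler's dropped `analyze_stream` idea as a free
// function over the existing `EbuR128` / `add_frames_f32` API.
use ebur128::{EbuR128, Error};

fn analyze_stream<B, E>(
    mut state: EbuR128,
    samples: impl Iterator<Item = Result<B, E>>,
) -> Result<EbuR128, E>
where
    B: AsRef<[f32]>,
    E: From<Error>,
{
    for buf in samples {
        // Propagate decoder errors (E) directly, analyzer errors via From.
        let buf = buf?;
        state.add_frames_f32(buf.as_ref()).map_err(E::from)?;
    }
    Ok(state)
}
```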

Chunked analysis has slightly different requirements for true-peak and
loudness.

True peak is computed with a 12/24-tap polyphase FIR filter, so it only
needs the 12 previous frames to be "seeded" to reach state equivalence at a
given point.

Loudness is analyzed in blocks of 400ms of audio, but the signal is first
filtered through an IIR filter. In theory the IIR filter has no bounded
re-synchronisation requirement (its impulse response is infinite), but in
practice it seems to reach a stable state within less than 100ms of audio.

Chunked analysis can thus be supported as follows:

1. Split the input into chunks, with 300ms of overlap and an additional
   100ms of samples to prime the filters.
2. For each chunk:
   - set up a new instance of `EbuR128`
   - prime it with the first 100ms of samples (except for the first chunk)
   - analyze the rest
3. Fold the analyzer instances from each chunk together via `try_merge`.
4. Read the result from the final, folded analyzer (see the sketch below).
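
A minimal sketch of this recipe, under stated assumptions: `EbuR128::new`, `add_frames_f32` and `loudness_global` are existing crate API, while the `seed_frames_f32` priming call and the exact `try_merge` signature are placeholders for whatever this PR finally exposes; the chunks are assumed to already include the overlap and priming samples from step 1.

```rust
// Hedged sketch of the steps above; `seed_frames_f32` and the shown
// `try_merge` signature are placeholders based on this commit message.
use ebur128::{EbuR128, Error, Mode};

const CHANNELS: u32 = 2;
const RATE: u32 = 48_000;
// 100 ms of priming frames, as described in step 2.
const PRIME_FRAMES: usize = RATE as usize / 10;

/// `chunks` are assumed pre-split as in step 1, i.e. every chunk after the
/// first starts with 100 ms of interleaved f32 priming samples.
fn analyze_chunked(chunks: &[Vec<f32>]) -> Result<f64, Error> {
    let prime_len = PRIME_FRAMES * CHANNELS as usize;
    let mut merged: Option<EbuR128> = None;

    for (i, chunk) in chunks.iter().enumerate() {
        // Step 2: a fresh analyzer per chunk.
        let mut state = EbuR128::new(CHANNELS, RATE, Mode::I)?;

        if i == 0 {
            state.add_frames_f32(chunk)?;
        } else {
            // Prime the filters with the first 100 ms, then analyze the rest.
            state.seed_frames_f32(&chunk[..prime_len])?;
            state.add_frames_f32(&chunk[prime_len..])?;
        }

        // Step 3: fold the per-chunk analyzers together.
        merged = Some(match merged {
            None => state,
            Some(mut acc) => {
                acc.try_merge(state)?;
                acc
            }
        });
    }

    // Step 4: read the result from the folded analyzer.
    merged.expect("at least one chunk").loudness_global()
}
```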
@sdroege (Owner) commented on Mar 15, 2021

I guess for that to be useful you need a really cheap compression algorithm (say, MPEG-1 Layer 2, or even cheaper like ADPCM or WavPack) or to work directly on uncompressed audio. I can see how decoding a more sophisticated codec is strictly more expensive than measuring the loudness, even when the resampling for true-peak detection is needed.

I'm not sure; it seems like a nice API to have, but it also doesn't seem useful in the majority of cases. Maybe you could provide it as an example rather than as API, and we just put it into the examples subdirectory? :)

@sdroege merged commit 6fec0be into sdroege:master on Mar 15, 2021