Support for chunked analysis #42

Merged: 2 commits into sdroege:master on Mar 15, 2021

Conversation

@rawler (Contributor) commented on Mar 8, 2021

This PR enables an application to do "chunked" analysis of long streams. When analyzing multiple hours of audio, splitting the work into chunks allows multiple cores to be used, accelerating both the decoding and the analysis itself.

This used to take a reference to a `Samples`-implementing structure. That
structure typically contains references itself, so a `&impl Samples` is
double indirection anyway.
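
Purely to illustrate the indirection argument in this commit message (the trait body and function names below are made up for the example, not crate code):

```rust
// Hypothetical illustration of the change described above. `Samples` is the
// crate-internal trait the commit message refers to; its body and the
// `process_*` functions are invented for this sketch.
trait Samples {
    fn frames(&self) -> usize;
}

// Before: `&impl Samples` is a reference to a type that itself typically
// only wraps borrowed sample data, i.e. double indirection.
fn process_by_ref(src: &impl Samples) -> usize {
    src.frames()
}

// After: take the cheap `Samples` value directly.
fn process_by_value(src: impl Samples) -> usize {
    src.frames()
}
```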
(Review comments on src/ebur128.rs, resolved.)
@rawler (Contributor, Author) commented on Mar 10, 2021

Sorry for missing `loudness_global_multiple` in the first iteration of this. It turns out that all that's required for bit-exact parallel analysis is the seeding, to solve initialization of the filter states.

@rawler (Contributor, Author) commented on Mar 10, 2021

If we are willing to add a (feature-gated) dependency on rayon, I could perhaps add a "ParallelAnalyzer" implementation that expects a stream of `impl Samples` and, through rayon, automatically chunks and analyzes the stream in parallel. I'm going to write such an implementation for our application anyway. Let me know if that's desirable.
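
As a purely illustrative sketch (nothing like this exists in the crate, and it is not rawler's actual code), such a feature-gated helper could be shaped roughly as below. `EbuR128::new` and `add_frames_f32` are existing crate API; `ParallelAnalyzer`, its fields, and the chunking policy are assumptions:

```rust
// Illustrative only: no `ParallelAnalyzer` exists in this crate.
use ebur128::{EbuR128, Error, Mode};

#[cfg(feature = "rayon")]
pub struct ParallelAnalyzer {
    channels: u32,
    rate: u32,
    mode: Mode,
    /// Number of interleaved frames handed to each worker.
    chunk_frames: usize,
}

#[cfg(feature = "rayon")]
impl ParallelAnalyzer {
    /// Splits an interleaved buffer into chunks, analyzes each chunk on
    /// rayon's thread pool, and returns the per-chunk states for the caller
    /// to combine (e.g. via `loudness_global_multiple`).
    pub fn analyze(&self, interleaved: &[f32]) -> Result<Vec<EbuR128>, Error> {
        use rayon::prelude::*;

        let chunk_len = self.chunk_frames * self.channels as usize;
        interleaved
            .par_chunks(chunk_len)
            .map(|chunk| -> Result<EbuR128, Error> {
                let mut state = EbuR128::new(self.channels, self.rate, self.mode)?;
                state.add_frames_f32(chunk)?;
                Ok(state)
            })
            .collect()
        // Note: this naive split skips the filter seeding discussed later in
        // the thread, so it is not bit-exact with a single-pass analysis.
    }
}
```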

@sdroege (Owner) commented on Mar 10, 2021

That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

(Review comments on src/ebur128.rs, src/filter.rs and src/true_peak.rs, all resolved.)
@sdroege (Owner) commented on Mar 14, 2021

Thanks, looks mostly good to me :)

> That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

I would still be interested in this though!

@rawler (Contributor, Author) commented on Mar 14, 2021

> That sounds interesting, but I'm not sure how that would look API-wise. What did you have in mind for the API?

Sorry, I spent some days over the last week building this, only to realize that decoding the audio itself is ~40% of the CPU time spent in our application. So analyzing chunks in parallel didn't really improve performance more than simply decoding in a separate thread does. If we want to chunk, we're going to have to create separate decoders + analyzers for each chunk, and that's not something I can see how to build in a generic way.

FWIW, though: what I built and then dropped was roughly `EbuR128::analyze_stream<E>(self, samples: impl Iterator<Item = Result<impl AsRef<[f32]>, E>>) -> Result<AnalysisResult, E>`. I could clean up and bring the implementation here if you want to write tests and maintain it, but my guess is that others will hit the same "decoding" bottleneck that I did, and this would bring more pain than gain.
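
For illustration only, the dropped signature above could be written out roughly as the free function below; it never landed in the crate, and the hypothetical `AnalysisResult` return type is simplified here to just hand back the analyzer state:

```rust
// Illustrative only: rawler's dropped `analyze_stream` idea as a free
// function over the existing `EbuR128` / `add_frames_f32` API.
use ebur128::{EbuR128, Error};

fn analyze_stream<B, E>(
    mut state: EbuR128,
    samples: impl Iterator<Item = Result<B, E>>,
) -> Result<EbuR128, E>
where
    B: AsRef<[f32]>,
    E: From<Error>,
{
    for buf in samples {
        // Propagate decoder errors (E) directly, analyzer errors via From.
        let buf = buf?;
        state.add_frames_f32(buf.as_ref()).map_err(E::from)?;
    }
    Ok(state)
}
```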

Chunked analysis has slightly different requirements for true-peak and
loudness.

True peak is computed with a 12/24-tap polyphase FIR filter, so it only
needs the 12 previous frames to be "seeded" to reach state equivalence at a
given point.

Loudness is analyzed in blocks of 400ms of audio, but the signal is first
filtered through an IIR filter. In theory the IIR filter has no bounded
re-synchronisation requirement (its impulse response is infinite), but in
practice it seems to reach a stable state within less than 100ms of audio.

Chunked analysis can thus be supported as follows:

1. Split the input into chunks, with 300ms of overlap and an additional
   100ms of samples to prime the filters.
2. For each chunk:
   - set up a new instance of `EbuR128`
   - prime it with the first 100ms of samples (except for the first chunk)
   - analyze the rest
3. Fold the analyzer instances from each chunk together via `try_merge`.
4. Read the result from the final, folded analyzer (see the sketch below).
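
A minimal sketch of this recipe, under stated assumptions: `EbuR128::new`, `add_frames_f32` and `loudness_global` are existing crate API, while the `seed_frames_f32` priming call and the exact `try_merge` signature are placeholders for whatever this PR finally exposes; the chunks are assumed to already include the overlap and priming samples from step 1.

```rust
// Hedged sketch of the steps above; `seed_frames_f32` and the shown
// `try_merge` signature are placeholders based on this commit message.
use ebur128::{EbuR128, Error, Mode};

const CHANNELS: u32 = 2;
const RATE: u32 = 48_000;
// 100 ms of priming frames, as described in step 2.
const PRIME_FRAMES: usize = RATE as usize / 10;

/// `chunks` are assumed pre-split as in step 1, i.e. every chunk after the
/// first starts with 100 ms of interleaved f32 priming samples.
fn analyze_chunked(chunks: &[Vec<f32>]) -> Result<f64, Error> {
    let prime_len = PRIME_FRAMES * CHANNELS as usize;
    let mut merged: Option<EbuR128> = None;

    for (i, chunk) in chunks.iter().enumerate() {
        // Step 2: a fresh analyzer per chunk.
        let mut state = EbuR128::new(CHANNELS, RATE, Mode::I)?;

        if i == 0 {
            state.add_frames_f32(chunk)?;
        } else {
            // Prime the filters with the first 100 ms, then analyze the rest.
            state.seed_frames_f32(&chunk[..prime_len])?;
            state.add_frames_f32(&chunk[prime_len..])?;
        }

        // Step 3: fold the per-chunk analyzers together.
        merged = Some(match merged {
            None => state,
            Some(mut acc) => {
                acc.try_merge(state)?;
                acc
            }
        });
    }

    // Step 4: read the result from the folded analyzer.
    merged.expect("at least one chunk").loudness_global()
}
```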
@sdroege (Owner) commented on Mar 15, 2021

I guess for that to be useful you need a really cheap compression algorithm (say, MPEG-1 Layer 2, or even cheaper like ADPCM or WavPack) or to work directly on uncompressed audio. I can see how decoding a more sophisticated codec is strictly more expensive than measuring the loudness, even when the resampling for true-peak detection is needed.

I'm not sure; it seems like a nice API to have, but it also doesn't seem useful in the majority of cases. Maybe you could provide it as an example rather than as API, and we just put it into the examples subdirectory? :)

@sdroege merged commit 6fec0be into sdroege:master on Mar 15, 2021