Do not allocate memory during WaveNet processing #49

Open
mikeoliphant opened this issue Jun 5, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@mikeoliphant
Contributor

Currently, the WaveNet model processing code resizes vectors and matrices based on the audio buffer size during processing. Resizing can allocate memory, which is not safe for real-time operation. Instead, all sizing operations should be done outside the processing loop.

In most cases, the current behavior should not cause significant problems. If there is a fixed audio buffer size, the resize operations should only happen once. A fixed buffer size is not guaranteed, however: DAWs will sometimes vary the block size.

@daleonov
Contributor

daleonov commented Jun 9, 2023

Agreed. There should be some kind of prepareBuffers() method. There's always some method that is called by the DAW every time the maximum block size or sample rate changes (prepareToPlay() in JUCE, and there was something similar in iPlug), so it happens at least once when the session starts, and it happens outside the audio thread. It's the only safe place to resize the buffers.
Also, like Mike mentioned, block size can vary. A good example is when you have a looped selection in the DAW: the last block in that loop is always smaller than usual, and then the next one is back to normal size. The catch is that there's usually no prepareToPlay() callback from the DAW in that case, so you have to size down, and then size back up, with no memory allocation.
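
A minimal sketch of that pattern, using only the standard library. The class name, `process()` signature, and placeholder gain are assumptions for illustration and are not NAM's actual API; only the `prepareBuffers()` name comes from the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical processor; names are illustrative, not taken from NAM.
class BlockProcessor {
public:
  // Call off the audio thread, e.g. from the host's prepare callback.
  // This is the only method that allocates.
  void prepareBuffers(std::size_t maxBlockSize) {
    scratch_.assign(maxBlockSize, 0.0f);
  }

  // Call on the audio thread. Only the first numFrames entries of the
  // scratch buffer are used, so a short block followed by a full-size block
  // needs no resize. If the host exceeds the prepared maximum, this returns
  // false rather than allocating.
  bool process(const float* in, float* out, std::size_t numFrames) {
    if (numFrames > scratch_.size()) return false;
    for (std::size_t i = 0; i < numFrames; ++i) {
      scratch_[i] = 0.5f * in[i];  // placeholder DSP: a fixed gain
    }
    std::copy_n(scratch_.begin(), numFrames, out);
    return true;
  }

  // Exposed so a caller can check that the storage never moves.
  const float* scratchData() const { return scratch_.data(); }

private:
  std::vector<float> scratch_;
};
```

Calling `prepareBuffers(512)` once and then `process()` with 512, 100, and 512 frames leaves `scratchData()` unchanged, which is the "size down, then size back up" case with no reallocation.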

@mikeoliphant
Contributor Author

Yes - typically this would be handled by allocating for the max size, but then only processing the number of samples you are given.

Eigen has a way to specify the maximum matrix size, but unfortunately it is at compile time:

https://eigen.tuxfamily.org/dox/classEigen_1_1Matrix.html

It should be possible to create matrices/vectors at max size, and then just do block operations on them at the current given size.
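
To make the idea concrete without depending on Eigen, here is a rough standard-library sketch. `MaxSizedMatrix` and `accumulateProduct` are hypothetical names invented for this example. In Eigen itself, the "current size" view would presumably be a block expression such as `leftCols(n)` on a matrix allocated at the maximum size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major float matrix whose storage is allocated once, for maxCols
// columns. Hypothetical helper for illustration; not Eigen, not NAM code.
struct MaxSizedMatrix {
  std::size_t rows;
  std::size_t maxCols;
  std::vector<float> data;

  MaxSizedMatrix(std::size_t r, std::size_t maxC)
      : rows(r), maxCols(maxC), data(r * maxC, 0.0f) {}

  float& at(std::size_t r, std::size_t c) { return data[c * rows + r]; }
  float at(std::size_t r, std::size_t c) const { return data[c * rows + r]; }
};

// Computes out[:, 0..numCols) += weights * in[:, 0..numCols), writing
// straight into out's existing storage. No temporary matrix is created and
// nothing is resized. Returns false on a shape mismatch or if numCols
// exceeds the preallocated maximum.
bool accumulateProduct(const MaxSizedMatrix& weights, const MaxSizedMatrix& in,
                       MaxSizedMatrix& out, std::size_t numCols) {
  if (numCols > in.maxCols || numCols > out.maxCols) return false;
  if (weights.maxCols != in.rows || weights.rows != out.rows) return false;
  for (std::size_t c = 0; c < numCols; ++c) {
    for (std::size_t r = 0; r < out.rows; ++r) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < in.rows; ++k) {
        acc += weights.at(r, k) * in.at(k, c);
      }
      out.at(r, c) += acc;
    }
  }
  return true;
}
```

Here `numCols` plays the role of the current block size: columns beyond it are left untouched, and the storage pointer stays the same across calls with different block sizes.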

@sdatkinson
Owner

Sounds reasonable--I had started to move some things out in the iPlug2 plugin, but yeah, this reeks of me getting back into C++ for this project 😅

> A good example is when you have a looped selection in the DAW: the last block in that loop is always smaller than usual, and then the next one is back to normal size.

Did not know--great call!

Let me know if either of you want to take this on--happy to assign it 👍🏻

sdatkinson added the bug label Jun 11, 2023
@mikeoliphant
Contributor Author

> yeah, this reeks of me getting back into C++

I feel your pain...

> Let me know if either of you want to take this on--happy to assign it

I can probably look into sorting this out.

@olilarkin
Contributor

olilarkin commented Jan 18, 2024

I just ran NAM audiounit with Apple's auval real time safety checker and it pointed to a couple of things...


Realtime-safety violation:
                libsystem_malloc.dylib`malloc
                NeuralAmpModeler`Eigen::internal::aligned_malloc(unsigned long)+0x38
                NeuralAmpModeler`void* Eigen::internal::conditional_aligned_malloc<true>(unsigned long)+0x18
                NeuralAmpModeler`float* Eigen::internal::conditional_aligned_new_auto<float, true>(unsigned long)+0x64
                NeuralAmpModeler`Eigen::DenseStorage<float, -1, -1, -1, 0>::resize(long, long, long)+0x78
                NeuralAmpModeler`Eigen::PlainObjectBase<Eigen::Matrix<float, -1, -1, 0, -1, -1>>::resize(long, long)+0x1f0
                NeuralAmpModeler`Eigen::internal::Assignment<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::assign_op<float, float>, Eigen::internal::Dense2Dense, void>::run(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::assign_op<float, float> const&)+0x78
                NeuralAmpModeler`void Eigen::internal::call_assignment_no_alias<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::assign_op<float, float>>(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::assign_op<float, float> const&)+0x30
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>& Eigen::PlainObjectBase<Eigen::Matrix<float, -1, -1, 0, -1, -1>>::_set_noalias<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::DenseBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x3c
                NeuralAmpModeler`void Eigen::PlainObjectBase<Eigen::Matrix<float, -1, -1, 0, -1, -1>>::_init1<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::DenseBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x20
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>::Matrix<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&)+0x2c
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>::Matrix<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&)+0x24
                NeuralAmpModeler`void Eigen::internal::call_assignment<Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::add_assign_op<float, float>>(Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::add_assign_op<float, float> const&, std::__1::enable_if<evaluator_assume_aliasing<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>::value, void*>::type)+0x2c
                NeuralAmpModeler`Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>& Eigen::MatrixBase<Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>>::operator+=<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::MatrixBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x40
                NeuralAmpModeler`nam::Conv1D::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, long, long, long) const+0x174
                NeuralAmpModeler`nam::wavenet::_Layer::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, long, long)+0x70
                NeuralAmpModeler`nam::wavenet::_LayerArray::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&)+0x1bc
                NeuralAmpModeler`nam::wavenet::WaveNet::process(float*, float*, int)+0x150
                NeuralAmpModeler`NeuralAmpModeler::ProcessBlock(float**, float**, int)::$_10::operator()(float**, float**, int) const+0x4c
                NeuralAmpModeler`decltype(std::declval<NeuralAmpModeler::ProcessBlock(float**, float**, int)::$_10&>()(std::declval<float**>(), std::declval<float**>(), std::declval<int>())) std::__1::__invoke[abi:v160006]<NeuralAmpModeler::ProcessBlock(float**, float**, int)::$_10&, float**, float**, int>(NeuralAmpModeler::ProcessBlock(float**, float**, int)::$_10&, float**&&, float**&&, int&&)+0x3c

  Realtime-safety violation:
                libsystem_malloc.dylib`free
                NeuralAmpModeler`Eigen::internal::aligned_free(void*)+0x34
                NeuralAmpModeler`void Eigen::internal::aligned_delete<float>(float*, unsigned long)+0x28
                NeuralAmpModeler`Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false>::~gemm_blocking_space()+0x24
                NeuralAmpModeler`Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false>::~gemm_blocking_space()+0x1c
                NeuralAmpModeler`void Eigen::internal::generic_product_impl<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, Eigen::DenseShape, Eigen::DenseShape, 8>::scaleAndAddTo<Eigen::Matrix<float, -1, -1, 0, -1, -1>>(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true> const&, float const&)+0x360
                NeuralAmpModeler`void Eigen::internal::generic_product_impl<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, Eigen::DenseShape, Eigen::DenseShape, 8>::evalTo<Eigen::Matrix<float, -1, -1, 0, -1, -1>>(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true> const&)+0xb0
                NeuralAmpModeler`Eigen::internal::Assignment<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::assign_op<float, float>, Eigen::internal::Dense2Dense, void>::run(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::assign_op<float, float> const&)+0xa8
                NeuralAmpModeler`void Eigen::internal::call_assignment_no_alias<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::assign_op<float, float>>(Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::assign_op<float, float> const&)+0x30
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>& Eigen::PlainObjectBase<Eigen::Matrix<float, -1, -1, 0, -1, -1>>::_set_noalias<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::DenseBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x3c
                NeuralAmpModeler`void Eigen::PlainObjectBase<Eigen::Matrix<float, -1, -1, 0, -1, -1>>::_init1<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::DenseBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x20
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>::Matrix<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&)+0x2c
                NeuralAmpModeler`Eigen::Matrix<float, -1, -1, 0, -1, -1>::Matrix<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&)+0x24
                NeuralAmpModeler`void Eigen::internal::call_assignment<Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>, Eigen::internal::add_assign_op<float, float>>(Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>&, Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0> const&, Eigen::internal::add_assign_op<float, float> const&, std::__1::enable_if<evaluator_assume_aliasing<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>::value, void*>::type)+0x2c
                NeuralAmpModeler`Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>& Eigen::MatrixBase<Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1>, -1, -1, true>>::operator+=<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>>(Eigen::MatrixBase<Eigen::Product<Eigen::Matrix<float, -1, -1, 0, -1, -1>, Eigen::Block<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, -1, -1, true>, 0>> const&)+0x40
                NeuralAmpModeler`nam::Conv1D::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, long, long, long) const+0x174
                NeuralAmpModeler`nam::wavenet::_Layer::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, long, long)+0x70
                NeuralAmpModeler`nam::wavenet::_LayerArray::process_(Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1> const&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&, Eigen::Matrix<float, -1, -1, 0, -1, -1>&)+0x1bc
                NeuralAmpModeler`nam::wavenet::WaveNet::process(float*, float*, int)+0x150
                NeuralAmpModeler`NeuralAmpModeler::ProcessBlock(float**, float**, int)::$_10::operator()(float**, float**, int) const+0x4c

full transcript (from my NAM version...)

validata.txt

@olilarkin
Contributor

If you define the EIGEN_RUNTIME_NO_MALLOC preprocessor macro and then wrap the processing calls like this:

Eigen::internal::set_is_malloc_allowed(false);
dsp->process()
dsp->finalize_(nFrames);
Eigen::internal::set_is_malloc_allowed(true);

It shows that every single call to process is calling malloc/free. This is bad, and fixing it might save quite a few CPU cycles, not to mention preventing some potential glitches.
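
For code that does not go through Eigen, a similar check can be sketched by replacing the global `operator new` and counting calls around a processing function. This is a library-independent analogue, not Eigen's mechanism. All names below are hypothetical, and the counter only sees allocations made through `operator new`, not direct `malloc` calls.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Counts every call to the replaced global operator new in this program.
static std::atomic<std::size_t> g_allocationCount{0};

void* operator new(std::size_t size) {
  g_allocationCount.fetch_add(1, std::memory_order_relaxed);
  if (void* p = std::malloc(size == 0 ? 1 : size)) return p;
  throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many operator new calls happened while fn() ran.
template <typename Fn>
std::size_t countAllocations(Fn&& fn) {
  const std::size_t before = g_allocationCount.load();
  fn();
  return g_allocationCount.load() - before;
}

// Hypothetical processing function that builds a fresh vector per call,
// so it allocates every time it runs.
std::vector<float> processAllocating(const float* in, std::size_t n) {
  std::vector<float> out(in, in + n);
  for (float& x : out) x *= 2.0f;
  return out;
}

// Hypothetical processing function that writes into caller-owned storage
// and therefore never allocates.
void processPreallocated(const float* in, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = 2.0f * in[i];
}
```

Wrapping each function in `countAllocations` gives zero for the preallocated version and at least one for the allocating version, which is the same kind of signal the Eigen switch gives per `process` call.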

@olilarkin
Contributor

Maybe a clue here:

https://github.com/stulp/eigenrealtime

@rerdavies

rerdavies commented Sep 1, 2024

I just pushed a pull request that cleans up realtime memory allocations caused by Eigen temporary matrices. With the changes applied, NeuralAmpModelerCore no longer does memory allocations on any process call except the first. Net results: a 20% performance improvement (enormously valuable when running on Pi 4s), a substantial reduction in CPU use jitter, and probably an end to the progressively worse performance that sets in as NAM and other plugins that unwisely allocate memory fragment the realtime thread's heap.

Hosts can ensure that buffers for Eigen MatrixXfs are pre-allocated by processing one sample off the realtime thread before allowing the model to run on the realtime thread. Currently, MatrixXf memory is allocated during the first processing cycle.

@sdatkinson

It might be useful to introduce an Activate() method on DSPs, the implementation of which would just run the model for one cycle. Or even do it as part of get_dsp().
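
A rough sketch of what such a warm-up could look like, under stated assumptions. `LazyDsp` is a stand-in that sizes its scratch buffer on first use; it is not NAM's DSP class, and the `Activate()` signature here is hypothetical. It also warms up with a full maximum-size block rather than a single sample, because this toy buffer scales with block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DSP that sizes its scratch buffer lazily, the first time
// process() sees a block larger than anything seen before.
class LazyDsp {
public:
  void process(const float* in, float* out, std::size_t numFrames) {
    if (scratch_.size() < numFrames) {
      scratch_.resize(numFrames);  // allocates; must not happen on the audio thread
    }
    for (std::size_t i = 0; i < numFrames; ++i) {
      scratch_[i] = 0.5f * in[i];  // placeholder DSP: a fixed gain
      out[i] = scratch_[i];
    }
  }

  // Hypothetical Activate(): call off the audio thread. Runs one silent
  // block at the maximum size so that every later process() call finds its
  // buffers already allocated.
  void Activate(std::size_t maxBlockSize) {
    std::vector<float> silence(maxBlockSize, 0.0f);
    std::vector<float> sink(maxBlockSize, 0.0f);
    process(silence.data(), sink.data(), maxBlockSize);
  }

  const float* scratchData() const { return scratch_.data(); }
  std::size_t scratchCapacity() const { return scratch_.capacity(); }

private:
  std::vector<float> scratch_;
};
```

After `Activate(256)`, calls to `process()` with 256 or fewer frames reuse the same storage. A real model with internal state would presumably also need that state reset after the warm-up block.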

5 participants