soxbindings fails when multithreading #4

Closed
rabitt opened this issue Feb 18, 2021 · 21 comments

Comments

@rabitt

rabitt commented Feb 18, 2021

soxbindings works great in one thread, but it consistently fails when multithreading. Minimal example below. Note that any effect (.vol, .compand, .trim, etc.) triggers the same error. Running this with command-line sox (replacing import soxbindings as sox with import sox) works fine.

from multiprocessing.dummy import Pool as ThreadPool
import numpy as np
import soxbindings as sox

y1 = np.zeros((4000, 1))
y2 = np.zeros((3000, 1))


def do_transform(y):
    tfm = sox.Transformer()
    tfm.vol(0.5)
    y_out = tfm.build_array(input_array=y, sample_rate_in=1000)
    return y_out


# single thread
print("running single thread")
for y in [y1, y2]:
    res = do_transform(y)
    print(res.shape)

# multithread
print("running multi thread")
pool = ThreadPool(2)
results = pool.map(do_transform, [y1, y2])

Output:

running single thread
(4000, 1)
(3000, 1)
running multi thread
Assertion failed: (fft_len == -1), function init_fft_cache, file effects_i_dsp.c, line 170.
Abort trap: 6
@pseeth
Owner

pseeth commented Feb 19, 2021

Thanks! I can look into this. Appreciate the concise report and repro instructions! FWIW, I've been using SoxBindings in a multi-process dataloader from PyTorch with no issues, so maybe try multiple processes as a quick fix for now.

@rabitt
Author

rabitt commented Feb 19, 2021

so maybe try multiple processes as a quick fix for now.

Unfortunately I don't think it's possible in my case - I'm using it inside a tensorflow dataloader which uses multithreading. I don't know of a way around it.

@faroit

faroit commented Feb 25, 2021

@rabitt i guess one has to disable openmp when compiling sox -> ./configure --disable-openmp. No idea if there is anything that can be done from within bindings (after compile)...

the folks at torchaudio seemed to have the same problems: pytorch/audio#1026

@pseeth ?

@pseeth
Owner

pseeth commented Feb 26, 2021

Thank you for that pointer @faroit! SoxBindings has a slightly different issue though - that one appears to be caused by a mismatch between PyTorch OpenMP and libsox OpenMP. But, good news, I might have the beginnings of a fix thanks to the rabbit hole that led me down...

Thank you to this hero on the SoX forums.

Here's the gist, I made a context manager that I call build_flow_effects within: sox_context_manager. This context manager initializes SoX (sox_init) before doing the effects chain. It then shuts down sox (sox_quit) like this:

sox_init
build_flow_effects
...
...
...
...
sox_quit

So this works great when you're doing it in a single thread, but in a multithreaded setup, you end up with this very bad scenario:

thread 1                 thread 2
sox_init
build_flow_effects
...                      sox_init
...                      build_flow_effects
...                      ...
...                      ...
sox_quit                 ...
                         sox_quit

So the quits and inits happen interleaved which SoX really doesn't like. Like the person says in the forum:


You are initializing SoX twice.
Fixed your example by moving the sox_init() outside the loop.

Tested and working with "alsa" instead of "coreaudio".

Cheers,

-Pascal

Cheers indeed. I took the decorator off. You'll have to wrap your program function in the decorator to be threadsafe, or call initialize_sox and quit_sox at the beginning and end of your program, respectively. I'll have to figure out the best way to fix this so that single-threaded SoxBindings programs are not affected, as taking the decorator off will break things the other way. I added a test case to SoxBindings that looks at the multi-threaded case based on @rabitt's example code here.
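
For readers following along, one common way to make a global init/quit pair safe for both single-threaded and multi-threaded callers is to reference-count it: initialize on the first entry and quit only on the last exit. The sketch below illustrates that idea only. It is not how SoxBindings implements sox_context_manager, and `fake_init`, `fake_quit`, and `refcounted_context` are hypothetical names that simply record calls instead of touching SoX.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-ins for a library's global init/quit calls.
# They only record what happened; they are NOT the soxbindings functions.
calls = []


def fake_init():
    calls.append("init")


def fake_quit():
    calls.append("quit")


_lock = threading.Lock()
_active = 0  # number of contexts currently open, across all threads


@contextmanager
def refcounted_context():
    """Initialize on the first entry and quit on the last exit."""
    global _active
    with _lock:
        if _active == 0:
            fake_init()
        _active += 1
    try:
        yield
    finally:
        with _lock:
            _active -= 1
            if _active == 0:
                fake_quit()


def worker(barrier):
    with refcounted_context():
        # Wait until every thread is inside the context, which reproduces
        # the overlapping init/quit scenario from the diagram above.
        barrier.wait()


def run_demo(n_threads=2):
    calls.clear()
    barrier = threading.Barrier(n_threads)
    threads = [
        threading.Thread(target=worker, args=(barrier,)) for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(calls)
```

Calling `run_demo()` records a single init followed by a single quit even though both threads overlap inside the context, which is the property the interleaved sox_init/sox_quit sequence above lacks.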

@faroit

faroit commented Feb 26, 2021

@pseeth thanks for looking into this. Not sure if this fix could be run inside TensorFlow's tf.data (which is in graph mode), but it's certainly a huge step forward!

@pseeth
Owner

pseeth commented Feb 26, 2021

Hmm, not super familiar with tf.data. I'll have to take a look, but say you have a program with a main function that runs your experiment with augmentation. You should be able to just do (after some kinda fix has been deployed which takes the decorator off in SoxBindings):

from soxbindings import sox_context_manager

@sox_context_manager()
def main():
    # my great experiment goes here
    # powered by soxbindings
    # and tf.data.
    pass  # placeholder so the function body is valid Python

if __name__ == "__main__":
    main()

But I'm not sure totally if that'll work, having not used tf.data. If you can point me to some code with tf.data, I can take a look and try to make sure the fix here works there too.

@rabitt
Author

rabitt commented Feb 26, 2021

But I'm not sure totally if that'll work, having not used tf.data. If you can point me to some code with tf.data, I can take a look and try to make sure the fix here works there too.

Let me see if I can cook up a minimal tf.data example for you

@rabitt
Author

rabitt commented Feb 26, 2021

I think this does it -

import numpy as np
import tensorflow as tf
import soxbindings as sox


def do_transform(y):
    tfm = sox.Transformer()
    tfm.vol(0.5)
    y_out = tfm.build_array(input_array=y, sample_rate_in=1000)
    y_out = tf.cast(y_out, tf.float32)
    return y_out


def transform_in_graph(y):
    return tf.numpy_function(do_transform, [y], tf.float32)


def random_noise_generator():
    for _ in range(50):
        yield np.random.uniform(size=(4000, 1))


ds = tf.data.Dataset.from_generator(
    random_noise_generator, output_types=tf.float32, output_shapes=(4000, 1)
)
ds = ds.map(transform_in_graph, num_parallel_calls=4)  # change this to 1, it succeeds
for y in iter(ds):
    print(y.shape)

Obviously in this example there's an easy workaround (num_parallel_calls=1) but when training you often need two instances of a tf.data.Dataset (for train/test) and these appear to run in multiple threads.

@pseeth
Owner

pseeth commented Feb 26, 2021

With the changes in #5, this snippet works:

import numpy as np
import tensorflow as tf
import soxbindings as sox


def do_transform(y):
    tfm = sox.Transformer()
    tfm.vol(0.5)
    y_out = tfm.build_array(input_array=y, sample_rate_in=1000)
    y_out = tf.cast(y_out, tf.float32)
    return y_out


def transform_in_graph(y):
    return tf.numpy_function(do_transform, [y], tf.float32)


def random_noise_generator():
    for _ in range(50):
        yield np.random.uniform(size=(4000, 1))


ds = tf.data.Dataset.from_generator(
    random_noise_generator, output_types=tf.float32, output_shapes=(4000, 1)
)
ds = ds.map(transform_in_graph, num_parallel_calls=4)  # change this to 1, it succeeds

with sox.sox_context_manager(): # <- THE FIX
    for y in iter(ds):
        print(y.shape)

I'll work on getting it released ASAP! Thanks all for the snippets and pointers!

@pseeth
Owner

pseeth commented Feb 27, 2021

@faroit, @rabitt would you mind trying these steps to see if your SoxBindings related code works?

  1. Install the branch with the fix:

    pip install -U git+https://github.com/pseeth/soxbindings.git@multithread-fix

  2. Modify your multi-threaded code using the context manager. See this part of the README for what to do.

Hopefully it works! Let me know, and then I'll merge PR #5 and release it as soxbindings==1.2.3.

@faroit

faroit commented Mar 8, 2021

@pseeth thanks a lot, the fix works fine and can be merged as is! Unfortunately, it still seems that the interface significantly slows down tf.data pipeline and real multiprocessing can't be utilized even if the number of parallel calls is set to a value higher than 1...

@rabitt
Author

rabitt commented Mar 8, 2021

@pseeth (sorry for the delay) I can also confirm it's working in my setup!

it still seems that the interface significantly slows down tf.data pipeline and real multiprocessing can't be utilized even if the number of parallel calls is set to a value higher than 1...

@faroit you're totally right, though it's not soxbindings' fault. I had an offline discussion with @psobot and he looked into it a bit - the sox C library itself can't do real multithreading. Still, 1x soxbindings is ~10x faster than 1x sox!

@psobot

psobot commented Mar 8, 2021

the sox C library itself can't do real multithreading.

Just to clarify here - the limiting factor seems to be that soxbindings doesn't release Python's global interpreter lock (GIL), meaning that even if the underlying Sox code is thread-safe (unsure), independent Python threads are prevented from executing the Sox code in parallel due to the GIL.

@pseeth
Owner

pseeth commented Mar 8, 2021

Hmm, okay. I'll merge this fix in sometime today - do you have any suggestions for how to speed things up @psobot? I very much appreciate the advice! SoX should be threadsafe: https://sox-users.narkive.com/m1PQmcwp/is-sox-threadsafe, so I imagine it's possible. I'm not sure if I did anything too crazy with my bindings though...everything in the execution of the build function (which calls soxbindings.sox) shouldn't touch global state or anything like that. If something is locking, it might be in the bindings. I just came across this: https://stackoverflow.com/questions/60915627/is-pybind11-pyarray-object-thread-safe. And also this: https://stackoverflow.com/questions/47309688/how-to-use-pybind11-in-multithreaded-application. This would be unfortunate, but might be fixable at some point.

Edit: this might be what we need: https://docs.python.org/3/c-api/init.html#releasing-the-gil-from-extension-code. I'll try it out at some point. Merging the related PR for now, though.

@psobot

psobot commented Mar 9, 2021

Happy to help, @pseeth! I see you edited with the correct link - that should do it. If SoX is threadsafe under the hood, then releasing the GIL before calling SoX (and re-acquiring it afterwards before returning to Python code) should be all you need.
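
A rough way to see the effect being described, using only the standard library: threads overlap when the work they call releases the GIL, and take turns when it does not. In this sketch `time.sleep` stands in for a C call that drops the GIL, and a pure-Python loop stands in for an extension call that keeps it. Neither is SoX or SoxBindings code, and the function names are made up for illustration. The behaviour of the second case assumes a standard (GIL-enabled) CPython build.

```python
import threading
import time


def run_in_threads(fn, n_threads):
    """Run fn once per thread and return the total wall-clock time in seconds."""
    threads = [threading.Thread(target=fn) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def gil_released_work():
    # time.sleep releases the GIL while waiting, so several threads can
    # wait at the same time. Stand-in for a C call that drops the GIL.
    time.sleep(0.2)


def gil_held_work():
    # Only one thread executes Python bytecode at a time on a standard
    # CPython build, so these loops take turns rather than run in parallel.
    total = 0
    for i in range(200_000):
        total += i
    return total


def compare(n_threads=4):
    """Time both kinds of work; the numbers depend on the machine."""
    return {
        "released": run_in_threads(gil_released_work, n_threads),
        "held": run_in_threads(gil_held_work, n_threads),
    }
```

With four threads, the "released" case should finish in roughly the time of a single 0.2 s sleep rather than four of them. The "held" timing varies by machine, so no figure is claimed for it here.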

@faroit

faroit commented Mar 19, 2021

@pseeth did you have some time to try this out? I can certainly try to help out...

@pseeth
Owner

pseeth commented Mar 19, 2021

Unfortunately not yet. Feels like it should just be putting those two lines to release the GIL somewhere in the C extension inside SoxBindings, though, right? Definitely would promptly CR if you made a PR, though!

@pseeth
Owner

pseeth commented Mar 30, 2021

Sorry for the delay - 1.2.3 is now out on pip! Closing this issue for now. Please re-open if you run into any issues!

@pseeth pseeth closed this as completed Mar 30, 2021
@faroit

faroit commented Mar 31, 2021

Sorry for the delay - 1.2.3 is now out on pip!

that's great! thanks

Closing this issue for now. Please re-open if you run into any issues!

i think we should keep this open until 👇 is addressed or rename the issue or create a new one

Unfortunately not yet. Feels like it should just be putting those two lines to release the GIL somewhere in the C extension inside SoxBindings, though, right? Definitely would promptly CR if you made a PR, though!

@pseeth
Owner

pseeth commented Mar 31, 2021

Let's make a new issue. This issue was originally about avoiding a show-stopping error which I can say is solved (and has a different solution that should stay documented in this issue). I'll make a new one about avoiding GIL.

@faroit

faroit commented Mar 31, 2021

sounds good. I will try to look into it soon, but my C++ skills are a bit rusty ;-) @psobot ?
