I'm running several notebooks in parallel (using pathos.multiprocessing and papermill from the command line) to get data, and then processing that data in a final notebook, which I call using papermill and which itself uses pathos.multiprocessing to crunch numbers. This works fine on its own, but if I try to repeat the process (say, in a loop), the second attempt to call the notebooks in parallel fails: the notebooks stop responding and eventually raise RuntimeError: Kernel didn't respond in 60 seconds. I am using Python 3.6.3.
I have managed to narrow down the problem to a minimal working example: calling (through papermill) a notebook which uses multiprocessing, and after that trying to process several papermill notebooks in parallel, raises the error. In particular, this is the minimum required to consistently recreate the problem:
2 notebooks as follows:
the first, heck.ipynb, with one cell containing:
print("heck")
the second, heck_parallel.ipynb, with one cell containing:
import pathos.multiprocessing as mp
with mp.Pool() as p:
    print(p.map(lambda x: x**2, list(range(5))))
Next you'll want to open python from the console and define/import the following 3 functions:
import pathos.multiprocessing as mp
import papermill as pm

def run_heck(i):
    pm.execute_notebook("heck.ipynb", "heck_" + str(i) + ".ipynb", parameters={})

def run_heck_parallel(i):
    pm.execute_notebook("heck_parallel.ipynb", "heck_parallel_" + str(i) + ".ipynb", parameters={})

def run_parallel_hecks(x):
    with mp.Pool() as p:
        p.map(run_heck, list(range(x)))
and now run the following from the console:
run_heck_parallel(1)
run_parallel_hecks(1)
The first call will succeed, but the second will not; it eventually raises RuntimeError: Kernel didn't respond in 60 seconds.
Note that all other combinations seem to be fine, i.e. it's specifically run_parallel_hecks(1) (running several parallel notebooks) after run_heck_parallel(1) (running a notebook with multiprocessing in it) that seems to break things. For instance, the series of commands:
run_parallel_hecks(1)
run_parallel_hecks(1)
run_heck_parallel(1)
run_heck_parallel(1)
run_heck(1)
will be processed just fine (and so will using pathos.multiprocessing to run other things in parallel, so long as those things aren't executing notebooks).
The following issue seems to be related: #239, though it is in Python 2.7 and the context seems different, so I am opening a separate issue just in case.
It looks like something with the mix between papermill and multiprocessing is not getting cleaned up properly (closing and reopening python allows me to start again). I hope there is a fix :)
Thank you!
Thanks for documenting the issue well! Yes, it is the same issue you referenced, and it boils down to ipython/ipython#11460. You are correct that it's a little different for Python 2, as IPython is locked to an older major version in that case. Likely it'll never be fully resolved for Python 2, but if the fix ends up being simple enough they may backport it. I'd follow up on the thread I linked and see if there are contributors looking to help solve the issue there. I probably can't get to helping directly for a while, but I'm happy to code review or encourage other core contributors to help.
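A possible stopgap while the upstream issue is open, offered as an untested sketch: since closing and reopening Python clears the problem, each notebook could be executed in its own fresh interpreter through the papermill command-line tool rather than through pm.execute_notebook inside a forked pool worker. The helper names below (papermill_command, run_heck_isolated, run_parallel_hecks_isolated) are hypothetical, and it is only an assumption that process isolation sidesteps the hang; this has not been verified against the bug.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def papermill_command(input_path, output_path):
    """Build the argument list for the papermill CLI: papermill INPUT OUTPUT."""
    return ["papermill", input_path, output_path]


def run_heck_isolated(i):
    """Hypothetical variant of run_heck: run heck.ipynb in a fresh child process."""
    cmd = papermill_command("heck.ipynb", "heck_" + str(i) + ".ipynb")
    # check=True raises CalledProcessError if papermill exits with a non-zero status.
    subprocess.run(cmd, check=True)


def run_parallel_hecks_isolated(x):
    """Hypothetical variant of run_parallel_hecks: one thread per waiting child process."""
    with ThreadPoolExecutor() as executor:
        # list() consumes the results so that any worker exception is re-raised here.
        list(executor.map(run_heck_isolated, range(x)))
```

With this in place, run_parallel_hecks_isolated(1) would stand in for run_parallel_hecks(1) in the reproduction steps. Threads are used only to wait on the child processes, so the parent interpreter never runs a notebook kernel itself.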