asyncio and concurrent.futures.ProcessPoolExecutor prevent kernel restarts (#274)
The problem is not present if any …
There's a good chance that jupyter/jupyter_client#279 alleviates this, since it's not an uncommon situation: a restart doesn't fully clean up subprocesses, and the result is a forked subprocess holding on to its parent's FDs that the new kernel needs on restart, ultimately running into EADDRINUSE. Does it require asyncio and ProcessPool, or just ProcessPool, to run into this? It's possible that we can do more to clean up process trees, either here or in jupyter_client.
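The inherited-FD scenario described in the comment above can be sketched as follows. This is an illustration of the mechanism only, not ipykernel or jupyter_client code; it assumes POSIX `fork` semantics (the `child`/`demo` names are made up for the example):

```python
import errno
import multiprocessing
import socket


def child(started, release):
    # A forked child inherits every open FD from its parent,
    # including the bound socket, and simply keeps running.
    started.set()
    release.wait()


def demo():
    # The "kernel" binds a port, as a real kernel binds its ZMQ ports.
    ctx = multiprocessing.get_context("fork")
    sock = socket.socket()
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    port = sock.getsockname()[1]

    started = ctx.Event()
    release = ctx.Event()
    orphan = ctx.Process(target=child, args=(started, release))
    orphan.start()
    started.wait()

    # The old kernel shuts down and closes its copy of the socket...
    sock.close()

    # ...but rebinding the same port still fails, because the
    # orphaned child is holding the inherited FD.
    new_sock = socket.socket()
    try:
        new_sock.bind(("127.0.0.1", port))
        result = None
    except OSError as e:
        result = e.errno
    finally:
        new_sock.close()
        release.set()
        orphan.join()
    return result


if __name__ == "__main__":
    print(demo() == errno.EADDRINUSE)
```

This is exactly the shape of failure the restart loop runs into: the new kernel asks for the same ports, and the orphan's inherited FDs keep them bound.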
It requires both, and it requires that the …
@minrk Unfortunately jupyter/jupyter_client#279 does not solve the issue (I am also using #281).
I don't have a fix yet, but I've narrowed this down to a couple of things:

- gui registration preventing a clean exit (an ipykernel issue)
- the process group not being killed on restart (a jupyter-client issue)

I think we should fix both, but either one ought to fix this. FWIW, I was able to recover my kernel thanks to jupyter/jupyter_client#279. It does fail to restart initially, but after a few retries it gets new ports and is able to come back.
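A minimal sketch of the process-group idea mentioned above. This is hypothetical, not jupyter_client's actual implementation; it assumes a POSIX system (`terminate_tree` is an illustrative name):

```python
import os
import signal
import subprocess
import sys


def terminate_tree():
    # Start a stand-in "kernel" in its own session (and hence its own
    # process group), the way a kernel manager could isolate a kernel
    # together with everything it forks.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # setsid(): new process group
    )
    # Signal the entire group: this reaches subprocesses forked by the
    # kernel (e.g. ProcessPoolExecutor workers), not just the kernel
    # process itself, so no orphans survive to hold FDs.
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    return proc.wait()


if __name__ == "__main__":
    print(terminate_tree())  # negative value: terminated by a signal
```

The key difference from a plain `proc.terminate()` is that `os.killpg` delivers the signal to every member of the group, which is what makes restarts robust even when the kernel itself fails to exit cleanly.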
Fixing it sounds like a lot of work. But the problem is quite a bit more annoying than you would think: after you have restarted sufficiently many times, the whole machine becomes unusable until you either manually kill all the processes or restart the Docker container. Would all of those things have to be solved to get it to behave?
No, I think if any one of those is fixed, it will behave in this case. But fixing them all would ensure similar issues don't crop up again. The first fix should really be in ipykernel: figuring out why we don't exit cleanly for simple cases like this. With that fixed, having jupyter-client solidly shut down process groups is a bit less urgent. It shouldn't be a lot of work, but it's a relatively complex bit of logic. I'll try to find a window to look into it.
The central issue is that the exit is handled here, where we directly stop the tornado event loop. This isn't enough when another event loop (asyncio in this case) is also running. Registering an additional hook that stops the integrated event loop at the same time should do it.
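The mechanism behind the proposed fix can be illustrated with plain asyncio. This is a sketch of the idea, not ipykernel's actual shutdown code: a callback that stops the integrated loop makes `run_forever()` return, which is what lets the kernel process actually exit (`run_with_stop_hook` is an invented name):

```python
import asyncio


def run_with_stop_hook():
    # Stand-in for the integrated (gui) event loop that keeps running
    # after ipykernel stops the tornado loop.
    loop = asyncio.new_event_loop()
    try:
        # The proposed extra shutdown hook: stop this loop too.
        loop.call_soon(loop.stop)
        loop.run_forever()  # returns once stop() takes effect
        return not loop.is_running()
    finally:
        loop.close()


if __name__ == "__main__":
    print(run_with_stop_hook())
```

Without such a hook, the integrated loop keeps the process alive after the tornado loop is stopped, which is exactly the "doesn't exit cleanly" symptom described above.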
This should be doubly fixed. See the following issues / PRs:

- ipython/ipykernel#274
- ipython/ipykernel#263
- jupyter/jupyter_client#314
The problem is illustrated in this notebook.
If I use the `asyncio` event loop integration introduced in #21 along with `concurrent.futures.ProcessPoolExecutor`, then hitting the "restart" button from within the notebook interface is broken. The notebook server attempts 5 times to restart the kernel, then gives up and creates a "new" one. The previous kernel process is then orphaned, but still exists on the system (with the `init` process as its parent).

This is a problem because there is a sizeable delay between hitting "restart" and being able to use a new kernel. It also produces many orphan processes that consume resources and do not go away even once the notebook server is stopped.
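The notebook itself isn't reproduced here, but a minimal sketch of the kind of code that combines the two ingredients might look like this (the `square`/`main` names are illustrative; the `fork` context is pinned explicitly to mirror the FD-inheritance behavior on POSIX):

```python
import asyncio
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def square(x):
    # Runs in a worker process forked from the kernel; the worker
    # inherits the kernel's open FDs, which is what later blocks
    # the restart.
    return x * x


async def main():
    loop = asyncio.get_running_loop()
    # fork context: workers inherit the parent's state and FDs
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        return await asyncio.gather(
            *(loop.run_in_executor(pool, square, i) for i in range(4))
        )


if __name__ == "__main__":
    print(asyncio.run(main()))  # in a notebook, main() would be awaited
```

With the asyncio integration enabled, code like this leaves worker processes alive across a restart attempt, producing the EADDRINUSE retry loop described in the comments.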