Restarter issues with slow to start kernels #715
Comments
I'm wondering if this assessment is 100% accurate, so this might need to be double-checked.
Thanks for opening this issue @vidartf - I agree with your assessments. Aside from tweaking some of the boolean usages (and the incorrect grammar of Might there be a way to interleave some of this with the pending kernels changes (and the |
We've been having some quite bad issues with the restarter when a slow-to-start kernel fails (due to an "address already in use" error, which is itself made worse by the slow start, since there is more time for another process to take the port in question). I've tracked down the restarter problems to a few points:
- The restarter uses a `PeriodicCallback` to poll the state of the kernel. If the callback takes longer to complete than the period at which it is called, it will prevent any subsequent calls until the running one completes. However, this only holds true for synchronous callbacks, which breaks down for the `AsyncIOLoopKernelRestarter`, which is now the default. Consequently, the `poll` method is basically reentrant for slow-starting kernels (default period for polling is 3s).
- The `poll` method only sets the `_restarting` flag after the `restart_kernel` method has returned, which means that when it is reentered due to the poll not being awaited, it will trigger a new restart while the previous restart is still running.

Separately:

- The `self.kernel_manager.is_alive()` call will succeed before the "address already in use" error causes the kernel to fail startup. This means that the `_initial_startup` flag gets set to `False` before it dies, which means it keeps trying the same ports again and again, even if the `random_ports_until_alive` flag is set. Ideally, `_initial_startup` would only be toggled after the connections have all been established.
- If the `KernelManager.autorestart` flag is set to `False`, clients never receive any information when the kernel dies, since it is the restarter that monitors for and sends the `dead` status. Ideally, this would still be sent even if we say that we don't want the kernel to automatically restart when it dies.
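
To illustrate the reentrancy point, here is a minimal sketch in plain asyncio. It is not the jupyter_client code and does not use tornado: `SketchRestarter`, `_in_flight`, `fire_polls` and `count_restarts` are hypothetical names made up for this example, and the scheduler loop only imitates a periodic callback that fires an async `poll` without awaiting the previous call. The kernel is assumed to be dead for the whole run, and the restart is assumed to take much longer than the poll period. One possible mitigation, shown here, is to set a guard before the restart is awaited rather than after it returns.

```python
import asyncio


class SketchRestarter:
    """Hypothetical stand-in for a restarter; NOT jupyter_client's class."""

    def __init__(self, guarded: bool) -> None:
        self.guarded = guarded
        self._in_flight = False  # hypothetical guard, set before the restart starts
        self.restarts_started = 0

    async def _restart_kernel(self) -> None:
        # Simulate a slow-to-start kernel: the restart outlasts many poll periods.
        self.restarts_started += 1
        await asyncio.sleep(0.2)

    async def poll(self) -> None:
        # Assumption for the sketch: the kernel is always found dead.
        if self.guarded:
            if self._in_flight:
                return  # a restart is already running, do not start another
            self._in_flight = True  # set BEFORE awaiting the slow restart
        try:
            await self._restart_kernel()
        finally:
            self._in_flight = False


async def fire_polls(restarter: SketchRestarter, period: float = 0.01, ticks: int = 3) -> int:
    """Fire poll() every `period` seconds without awaiting the previous call."""
    tasks = []
    for _ in range(ticks):
        tasks.append(asyncio.create_task(restarter.poll()))
        await asyncio.sleep(period)
    await asyncio.gather(*tasks)
    return restarter.restarts_started


def count_restarts(guarded: bool) -> int:
    return asyncio.run(fire_polls(SketchRestarter(guarded)))


if __name__ == "__main__":
    print("unguarded restarts:", count_restarts(guarded=False))
    print("guarded restarts:", count_restarts(guarded=True))
```

Under these assumptions the unguarded variant starts one restart per tick (three overlapping restarts), while the guarded variant starts only one. Whether a separate guard or an earlier assignment of the existing flag is the better fix in the real restarter is an open question for the maintainers.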