
New KernelLauncher API for kernel discovery system #308

Closed
wants to merge 55 commits

Conversation

takluyver (Member)

@minrk I'd like to get your thoughts on the design of this before I get into integrating this machinery with KernelManager. It's meant to address #301.

The design we worked out in #261 remains: kernel providers are classes, discovered by entry points, which can tell Jupyter about kernel types from different systems (e.g. kernelspecs, conda environments, remote machines...).

The make_manager() method defined in #261 is gone, replaced by launch() and launch_async(). These return kernel launcher objects (better names welcome), which offer a subset of Popen methods, plus get_connection_info(), which returns a dictionary of connection info (the same info you get from a connection file).
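A minimal sketch of that launcher interface as described (method names beyond launch(), wait() and get_connection_info() are assumptions about which parts of the Popen API are kept, not the exact code in this PR):

    from abc import ABCMeta, abstractmethod

    class KernelLauncherBase(metaclass=ABCMeta):
        """A subset of the Popen API, plus connection info (illustrative)."""

        @abstractmethod
        def launch(self):
            """Start the kernel process."""

        @abstractmethod
        def wait(self):
            """Wait for the kernel process to exit."""

        @abstractmethod
        def poll(self):
            """Return the exit code if the kernel has exited, else None."""

        @abstractmethod
        def send_signal(self, signum):
            """Deliver a signal to the kernel process."""

        @abstractmethod
        def get_connection_info(self):
            """Return a dict with the same keys as a connection file:
            transport, ip, the port numbers, key, signature_scheme."""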

Why a new interface? I wanted to use KernelManager, and split off the subclasses QtKernelManager and IOLoopKernelManager as separate functionality outside of the manager. But KernelManager has grown all kinds of complexity, like sending messages over the control channel, which you can do even if you didn't start the kernel. So I plan to make KernelManager work with owned kernels (where it has a KernelLauncher) and non-owned kernels (where it does not).

Async: so far, it has been OK for kernel control to be mostly synchronous. With increasing flexibility in how kernels are launched, this may be more painful. But I don't want to make N kernel providers support M event loops. So, asyncio. Tornado is moving in that direction, there's an asyncio interface for Qt event loops (quamash), and we're planning for our applications to require Python 3 in the next couple of years. I have also implemented an async wrapper, which runs the blocking kernel manager in a separate thread, so kernel providers only need to implement the blocking launcher interface - but there may be efficiency/reliability benefits to implementing an async interface rather than wrapping a blocking interface.
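A condensed sketch of that thread-based wrapper (the in_default_executor and wrapped names appear in the diff excerpts quoted below; the rest is illustrative):

    import asyncio

    class AsyncLauncherWrapper:
        """Expose coroutine versions of a blocking launcher's methods by
        running them in the event loop's default executor (a thread pool)."""
        def __init__(self, wrapped, loop=None):
            self.wrapped = wrapped
            self.loop = loop or asyncio.get_event_loop()

        def in_default_executor(self, func):
            # run_in_executor(None, ...) uses the default ThreadPoolExecutor
            return self.loop.run_in_executor(None, func)

        @asyncio.coroutine
        def launch(self):
            return (yield from self.in_default_executor(self.wrapped.launch))

        @asyncio.coroutine
        def wait(self):
            return (yield from self.in_default_executor(self.wrapped.wait))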

Where next?

  1. Complete the refactoring: make KernelManager use the (blocking) PopenKernelLauncher, so that existing code which starts kernels through KernelManager keeps working. Allow passing a KernelLauncher into a KernelManager, for code using the discovery mechanism to start kernels.
  2. Non-owned kernels: discovery mechanisms, and support creating a KernelManager for an already-running kernel.
  3. New launch protocol: doing this reminded me that the way we bind ports, then release them and start a kernel to bind them again, is cumbersome and error-prone. I would like to design a mechanism for the kernel to pick its own random ports and then tell the parent process about them (see the sketch after this list).
  4. KernelLauncher socket(s), AKA return of the revenge of the undead 'kernel nanny' - protocol to ask a remote kernel launcher to deliver signals to a kernel, and for it to notify other clients when the kernel dies.
  5. Capturing stdout/stderr - one day.
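For point 3, the kernel side could look roughly like this (a hypothetical sketch; how the ports get reported back is exactly the part still to be designed):

    import json
    import zmq

    ctx = zmq.Context.instance()
    shell = ctx.socket(zmq.ROUTER)
    # Let pyzmq/the OS pick a free port instead of the parent pre-binding it
    shell_port = shell.bind_to_random_port('tcp://127.0.0.1')
    # ... likewise for iopub, stdin, control and hb ...
    # Then report the chosen ports back to the parent process, e.g. on stdout
    # or over a pre-arranged socket (the channel is the open design question):
    print(json.dumps({'shell_port': shell_port}))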

Code duplication: The new launcher2 module contains quite a bit of duplicated code for launching kernels in a subprocess. The steps for launching a kernel are currently split between the manager, connect and launcher modules, and pulling them all together was the only way to get a clear view of what's actually happening. I hope to eliminate the duplication again, but it's non-trivial because the code is written with a lot more flexibility than it probably needs.

Version 6: These changes will become jupyter_client 6.0.

takluyver and others added 25 commits October 9, 2017 15:23
MetaKernelFinder -> KernelFinder
Prototype new kernel discovery machinery
The old URL points to a "This page has moved" page
Updated URL for Jupyter Kernels in other languages
- use IOLoop.current over IOLoop.instance
- drop removed `loop` arg from PeriodicCallback
- deprecate now-unused IOLoopKernelRestarter.loop
- interrupt_mode="signal" is the default and current behaviour
- With interrupt_mode="message", instead of a signal, an
  `interrupt_request` message on the control port will be sent
In addition to the actual signal, send a message on the control port
@takluyver takluyver added this to the 6.0 milestone Nov 30, 2017
@takluyver (Member Author)

The test failure on Python 3.3 is due to a problem with pytest: pytest-dev/pytest#2966

The latest pytest release dropped support for Python 3.3, but an as-yet-unidentified packaging problem means that this is not showing up in the metadata, so pip tries and fails to install the latest version.

As this branch is intended to be for jupyter_client 6.0, I'm inclined to drop the 3.3 tests, but we may need to work around it for 5.x if pytest doesn't fix it.
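(One plausible workaround for 5.x, my suggestion rather than something decided in this thread, would be pinning pytest for that interpreter in the test requirements with an environment marker:

    pytest < 3.3 ; python_version == "3.3"
)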

@minrk (Member) left a comment

Nice! I like the simple launcher API.

So I plan to make KernelManager work with owned kernels (where it has a KernelLauncher) and non-owned kernels (where it does not).

I'm not sure about this. I think KernelManager should only work with owned Kernels. Everything that works with remote Kernels should be part of KernelClient. So rather than having Manager, Launcher, and Client, we should have just two: Manager + Client in early forms, or Launcher + Client in this new API. The primary motivation for the original KernelNanny proposal was for KernelClient to get all functionality for dealing with a Kernel, regardless of remote or local (interrupt, restart being the main missing pieces), and KernelManager would only be the implementation of managing a local process. This new Launcher API could take us there but it seems to me like it should mean dropping KernelManager entirely, rather than adding a third API. What do you think?

Or do you think there's enough logic that belongs in KernelManager and not in Client that Manager should get these extra abstractions around Launchers and stick around?

Async

👍 I'd love lots of test coverage for the new APIs.

Code duplication

I think code duplication is a good route to go for an upgrade to a new API. It allows us to clearly isolate and improve what the new implementation does without fear of breaking what the older implementations did. And gives us a clearer path to deprecation and eventual removal of the older APIs.

def wait(self):
    """Wait for the kernel process to exit."""
    raise NotImplementedError()
minrk (Member):
Why is only this method NotImplemented, while the others pass?

minrk (Member):
A spec for the return value of wait would be useful.

"""
buf = os.urandom(16)
return u'-'.join(b2a_hex(x).decode('ascii') for x in (
    buf[:4], buf[4:]
minrk (Member):
4 + 4 = 8, not 16. Typo?

        return (yield from self.in_default_executor(self.wrapped.launch))

    @asyncio.coroutine
    def wait(self):
minrk (Member):
make sure we inherit docstrings

@minrk (Member) commented Dec 4, 2017

Also +1 to dropping Python 3.3 in 6.0

@takluyver (Member Author)

It does make sense for KernelManager to be for only an owned kernel. I'll have a look at moving some pieces from KernelManager to KernelClient. Maybe that will be enough to allow unifying kernel launchers with kernel managers.

@takluyver (Member Author)

Thanks Yuvi. To clarify, would that be one kernel per container(/pod), so docker run ... or an equivalent command would start the kernel directly? Or would the container be a longer-lasting thing that can start and stop multiple kernels inside itself?

@yuvipanda commented Feb 12, 2018 via email

@rgbkrk (Member) commented Feb 13, 2018

I should probably be watching this PR more regularly and contributing / reviewing where I can. 😄

@takluyver (Member Author)

@yuvipanda I've made a rough prototype kernel provider to start a docker container locally and connect to it: https://github.com/takluyver/jupyter_docker_kernels

This is very much a prototype, and it uses docker directly rather than any of the higher level management tools, but hopefully it gives you some idea of what it would take to use this API for docker.

@takluyver (Member Author)

I'm starting to wonder whether, rather than having KernelManager2, KernelClient2, JupyterConsoleApp2 inside jupyter_client version 6, we should develop these new APIs as a separate package with a new name (like jupyter_client2). That might give us more freedom to make some releases while we're still experimenting with the APIs.

@jankatins commented Feb 20, 2018

I converted https://github.com/Cadair/jupyter_environment_kernels to use the new infrastructure: Cadair/jupyter_environment_kernels#35

I've implemented it with two providers: one for conda, one for virtualenv. conda actually searches for python and IRkernel kernels.

Here are some observations:

  • Currently we use the normal LoggingConfigurable and traitlets to configure the environment discovery (blacklisting envs, setting some paths, ...), but this is no longer possible, as the two metaclasses seem to clash (a possible workaround is sketched after this list).

    For now I have locally removed the ABCMeta from KernelProviderBase to make it work. Not sure if there is a better way. I would really love to keep using traitlets as a config mechanism in the environment kernel providers.

    Error:

Error loading kernel provider
Traceback (most recent call last):
  File "/home/js/external/jupyter_client/jupyter_client/discovery.py", line 133, in from_entrypoints
    provider = ep.load()()  # Load and instantiate
  File "/home/js/.binaries/miniconda3/envs/environment-kernel-test/lib/python3.6/site-packages/entrypoints.py", line 77, in load
    mod = import_module(self.module_name)
  File "/home/js/.binaries/miniconda3/envs/environment-kernel-test/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/js/external/jupyter_environment_kernels/environment_kernels/env_kernel_provider.py", line 19, in <module>
    class BaseEnvironmentKernelProvider(KernelProviderBase, LoggingConfigurable):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
  • We periodically update the list of environment kernels in a tornado background task, as this takes quite a while (calling conda, iterating over several dirs and trying to start a py/R kernel in each). This is fine when we are running in a notebook server, but does not make sense when started via e.g. jupyter kernel whatever. I think it would be nice if the KernelFinder could take over the triggering (via an explicit start_updater() or similar, which the notebook would call) and KernelProviderBase got an update_cache() method to do the updating in the background. Another idea would be for KernelProviderBase to take a keyword arg like longrunning, so that the provider can implement the updater itself when the calling app needs it.

  • Activating an environment before running a kernel is now much cleaner :-)

  • Why are the resource dirs no longer part of the interface? I think a logo is needed, and IRkernel has some nice javascript files which enhance the keyboard layout in an R notebook (it will be an interesting challenge to get these served from an SSH kernel or a docker container :-) ).

  • I've locally implemented jupyter kernel --list to get a list of all available kernels. Would you take such a change?
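On the first point, one possible workaround (an assumption, not code from this PR) is to define a combined metaclass rather than removing ABCMeta:

    from abc import ABCMeta
    from traitlets.config import LoggingConfigurable
    from jupyter_client.discovery import KernelProviderBase

    # Derive a metaclass from both parents' metaclasses, so Python can pick
    # one consistent metaclass for the subclass below.
    class ProviderMeta(ABCMeta, type(LoggingConfigurable)):
        pass

    class BaseEnvironmentKernelProvider(KernelProviderBase, LoggingConfigurable,
                                        metaclass=ProviderMeta):
        """Configurable kernel provider: traitlets config plus the ABC API."""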

env.pop('PYTHONEXECUTABLE', None)

if extra_env:
    print(extra_env)
@jankatins (Feb 20, 2018):
debug print...

@kevin-bates (Member)

Sorry for the timing (and verbosity)...

tl;dr We're doing similar things in order to support remote kernels and are very interested in this effort.

Regarding the use case for remote kernels, this is almost entirely what Jupyter Enterprise Gateway provides - remote kernel management.

Kernels used in data sciences tend to consume large amounts of resources. By supporting remote kernels (accessed via remote notebooks using NB2KG) we're able to better leverage cluster resources by spreading the kernels across the cluster. Since we cater to Spark-based analytics, we currently support the YARN resource manager in both client and cluster mode. In cluster mode, we let YARN determine where the kernel is going to land within the cluster. We also have an ssh-based "distributed" implementation.

Because we want to avoid having to modify kernels, we wrap kernels in language-specific kernel launchers (yes, the term is overloaded). I believe these provide functionality similar to the Nanny proposal. We also create a sixth "communication" port that is used for invoking interrupts and as another means of conveying shutdown actions, etc. The message-based interrupts give us most of this functionality, but not all kernels support them.

Each of the types of Resource Managers can be plugged into the framework via the notion of process proxies - which, I believe, are akin to the kernel providers and abstracts the process. The process proxy is responsible for startup confirmation (discovery), monitoring (poll), interrupt conveyance and termination (should the normal mechanisms fail). Which kind of process proxy should be used for a given request is conveyed via extensions to the kernelspec format. These extensions also provide a means of conveying per-kernel configuration values - which I believe is similar to the newly added metadata entry.

If you look into our repo, you'll notice we derive from KernelManager, MultiKernelManager and KernelSpec in order to fit this pluggability and discovery into the existing framework. Our process-proxy instance essentially replaces the 'proc' assigned to self.kernel during launch.

As a result, we are very interested in this work.

@jankatins

@takluyver What is the roadmap for this and https://github.com/takluyver/jupyter_kernel_mgmt + https://github.com/takluyver/jupyter_protocol ? Will jupyter_client go away?

Also: how do I get jupyter_kernel_mgmt into a normal notebook server?

@takluyver (Member Author)

@takluyver What is the roadmap for this and https://github.com/takluyver/jupyter_kernel_mgmt + https://github.com/takluyver/jupyter_protocol ? Will jupyter_client go away?

My thinking is that jupyter_client won't go beyond version 5.x releases, and the two new packages (jupyter_protocol and jupyter_kernel_mgmt) will gradually replace it. I couldn't keep enough stuff in my head at once to build what I think we need while respecting backwards compatibility, so there's an API break, and downstream code will have to adapt to use the new system.

Also: how do I get jupyter_kernel_mgmt into a normal notebook server?

It will need changes to the notebook server code. And there's an extra bit that I haven't really worked out for the notebook server: how to pick a kernel to start when opening an existing notebook. I think this is at the heart of why so many people get confused by our current system.

I'm planning to start trying to integrate this system into nbconvert first, because that's a relatively simple, self-contained use case for running a kernel.

@jankatins commented May 7, 2018

@takluyver Could you specify what "won't go beyond version 5.x releases" means?

I'm trying to decide whether it makes sense to integrate this into jupyter_environment_kernels now, or whether I should wait (e.g. until it is testable in a Jupyter notebook; we have a use case where the kernel list must update at runtime, so that new environments are picked up during the lifetime of a notebook server).

@takluyver (Member Author)

I mean that, if we follow this plan, there will probably never be a jupyter_client version 6, but we'll likely still do some more bugfix releases of jupyter_client.

I'd be keen for you to try updating jupyter_environment_kernels to the new API, to see if it makes sense for someone other than me. See jupyter_ssh_kernels and jupyter_docker_kernels for examples of how it can work. But it's not something you can give to users yet, and the APIs might still change before it's ready.

@jankatins

OK, I'll try to get that done and only run tests with jupyter kernel --list or so.

@takluyver (Member Author)

Thanks! I'm working now on integrating it with a branch of nbconvert for a more interesting test case. Once that's working, I'll also get those packages on PyPI so that it's a bit easier to test with them.

Feel free to ask about the new APIs; it's all still rather messy, and I haven't written much about it. The readmes of j_protocol and j_kernel_mgmt have a few details.

@takluyver (Member Author)

@jankatins I've now got a branch of nbconvert working with the jupyter_kernel_mgmt API instead of jupyter_client: jupyter/nbconvert@master...takluyver:jupyter_kernel_mgmt

It needs an up-to-date version of jupyter_kernel_mgmt from my GitHub repo, because I've been discovering and fixing problems in that code as I worked on nbconvert. There were some hard-to-debug async issues to work out.

@takluyver (Member Author)

And I've just put jupyter_protocol and jupyter_kernel_mgmt on PyPI, both at version 0.1 to emphasise that they're not yet stable.

@mpacer (Member) commented May 8, 2018

@takluyver if there are hard-to-debug async issues… would it be easier if we were to use stdlib asyncio and async/await and make the new kernel mechanisms (jupyter_protocol and jupyter_kernel_mgmt) python3 only?

From what I understand, notebook 6.0 is going to be Python 3 only, and IPython >6 is already Python 3 only.

On top of that… the biggest reason I could imagine for wanting to keep it Python 2 compatible is if we wanted to move all our Python 2 dependencies to the new kernel management system and deprecate the jupyter_client system. Maybe I'm being pessimistic, but it seems unlikely that we'll be able to completely drop jupyter_client support before 2020. That means we could leave jupyter_client as the mechanism for people who want Python 2 support for kernels, and new things could instead use the great new Python-3-only libraries :).

@takluyver (Member Author)

Yup, I agree, and I'm actually already using asyncio. The two new packages currently require Python 3.4 or above. In fact the code is a bit of an ugly mixture of asyncio parts and tornado parts, since they now run on the same event loop. pyzmq has more convenient integration (ZMQStream) with tornado's API than it does with asyncio's.

The main problem that I eventually figured out was the classic ZMQ slow-subscriber issue, where you miss some messages on a PUB-SUB socket because they're sent before the subscription updates. We were declaring the client 'ready' when it got a kernel info reply, but that doesn't necessarily mean the iopub socket is getting output. So I've now made it repeatedly send kernel_info_requests until it gets something (probably a status message) on iopub, as sketched below. That does rely on the kernel sending status messages.
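The retry loop is roughly this shape (a paraphrased sketch; method names like send_kernel_info_request and iopub_received are placeholders, not the real jupyter_kernel_mgmt API):

    import asyncio

    @asyncio.coroutine
    def wait_for_ready(client):
        while True:
            # A kernel_info reply proves the shell channel works, but only
            # traffic on iopub proves our SUB subscription is actually live.
            client.send_kernel_info_request()
            try:
                yield from asyncio.wait_for(client.iopub_received(), timeout=0.5)
                return  # got something (probably a status message) on iopub
            except asyncio.TimeoutError:
                continue  # slow subscriber: retry until the subscription takes effect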

@SylvainCorlay (Member)

@takluyver I presume these kernel_discovery PRs to jupyter_client ought to be closed now that this work is being done in jupyter_kernel_mgmt and jupyter_protocol?

@blink1073 (Contributor)

Thanks again for pushing on this @takluyver. Closing in favor of the Kernel Provisioning and Parameterized Kernel Launch work.

@blink1073 blink1073 closed this Aug 23, 2021