
WIP: Use new kernel management APIs in notebook server #4170

Closed
wants to merge 50 commits

Conversation

takluyver
Member

This adapts the notebook server to use jupyter_kernel_mgmt and jupyter_protocol for interacting with kernels. This includes the 'kernel providers' discovery mechanism.

The discovery was intended as one half of a reworking of how kernels are chosen for notebooks. I don't think it makes sense to record the kernel name in notebook metadata: that name might not mean the same thing on your computer as it does on mine. But I don't yet have a good answer for how it should choose which kernel to start for a notebook.

A lot of the incidental changes here are because the kernel startup now includes getting the client connected; this exposed a number of race conditions where things happened before the kernel finished starting.

The inheritance stack was making this harder to follow.
Member

@kevin-bates kevin-bates left a comment


@takluyver - I really like where this is heading. Just had a few comments, mostly in the areas of consumability and backwards compatibility.

if self.kernel_providers:
    self.kernel_finder = KernelFinder(self.kernel_providers)
else:
    self.kernel_finder = KernelFinder.from_entrypoints()
Member


This seems a little awkward to me. How would an application be able to use providers from both the property and the entry points? If there were a way to get the list of entrypoint-based providers, you could require the property to be the composite.

Alternatively, if there were a separate way to get the providers from entry points, you could just call the constructor and, if the argument is None, have it call from_entrypoints(); otherwise the argument is the complete set.
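The pattern being suggested could look something like this. This is a hypothetical sketch: `Finder` and `_providers_from_entrypoints` are illustrative stand-ins, not the real `KernelFinder` API.

```python
# Sketch of the suggested pattern: the constructor composes providers itself,
# falling back to entry-point discovery when none are supplied.
# All names here are illustrative, not the real jupyter_kernel_mgmt API.

def _providers_from_entrypoints():
    # Stand-in for real entry-point discovery.
    return ['spec', 'pyimport']

class Finder:
    def __init__(self, providers=None):
        # None means "discover everything"; an explicit list means
        # "use exactly these" (e.g. for testing).
        if providers is None:
            providers = _providers_from_entrypoints()
        self.providers = list(providers)

# Default: discover from entry points.
assert Finder().providers == ['spec', 'pyimport']
# Override: the caller fully controls kernel discovery.
assert Finder(providers=['fake']).providers == ['fake']
```

With this shape, the calling code never needs the if/else shown above; it always calls the constructor.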

Member Author


This was essentially just for testing, though I was planning to add some sort of inclusion/exclusion lists for real configurability. The idea is that it will always normally discover kernel providers from entry points, but in certain odd situations (like testing) you might want to override it so that you can completely control kernel discovery.

Member


Sounds good. I would recommend these approaches get pushed into the constructor and let it determine how to compose/filter the providers.

"""
if path is not None:
kwargs['cwd'] = self.cwd_for_path(path)
kernel_id = str(uuid.uuid4())
Member


Please restore the call to new_kernel_id(). It's important that applications be able to specify their own kernel_id. Another option may be to let the kernel provider determine its kernel id - assuming the provider also has access to kwargs (see next comment).
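A minimal sketch of the hook being requested, so a subclass (e.g. in a gateway application) can choose its own kernel IDs. Class and method names below mirror the discussion but are illustrative, not the actual notebook server code.

```python
import uuid

class BaseManager:
    """Illustrative base: new_kernel_id() is the overridable hook."""

    def new_kernel_id(self, **kwargs):
        # Default behavior: a fresh random UUID.
        return str(uuid.uuid4())

    def start_kernel(self, **kwargs):
        # The hook runs before launch, so subclasses control the id.
        kernel_id = self.new_kernel_id(**kwargs)
        return kernel_id

class GatewayManager(BaseManager):
    """Sketch of a subclass letting clients seed the kernel id."""

    def new_kernel_id(self, **kwargs):
        # Clients may pass a desired id through the request's env dict.
        env = kwargs.get('env', {})
        return env.get('KERNEL_ID') or super().new_kernel_id(**kwargs)

# Client-specified id wins; otherwise a UUID is generated.
assert GatewayManager().start_kernel(env={'KERNEL_ID': 'abc'}) == 'abc'
assert len(BaseManager().start_kernel()) == 36
```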

Member Author


That should be easy enough. Can you point me to the code where you're overriding it, out of interest?

Member


Member Author


That code confuses me a bit. Normally the point of a MappingKernelManager is to manage multiple kernels with different IDs, but if you're pulling the kernel ID from an environment variable, it will always be the same (unless it's being changed in-process, which hopefully isn't the case). So you can only ever have one kernel in the manager, in which case, why do you need the MappingKernelManager?

Member


(Sorry for the delayed response)
The MappingKernelManager (well, actually, the previous MultiKernelManager) is responsible for associating a kernel manager instance with a kernel_id.

The environment from which this value is derived is part of the kernel creation request that comes from clients of Enterprise Gateway. In these applications, clients can influence the request; it is not the environment of the process itself.

As was noted in the next comment, by specifying things like kernel_id and kernel namespace, for example, clients can "seed" environments in which their kernels will be invoked. Since kernel_id is a perfect "key", allowing clients to specify their kernel_id is critical to enabling them to build applications as they desire based on that value.

Please restore that functionality as it is current behavior.

elif '/' not in kernel_name:
    kernel_name = 'spec/' + kernel_name

kernel = KernelInterface(kernel_name, self.kernel_finder)
Member


We should add kwargs to the KernelInterface() constructor. The gateway projects (Kernel and Enterprise) allow for users to specify arguments that influence kernel creation. For example, users can specify a pre-existing namespace (via KERNEL_NAMESPACE) that instructs Enterprise Gateway to launch kubernetes-based kernels into the specified namespace rather than having EG create one for them. Since KernelInterface() is the "path" to the kernel provider, we should flow kwargs.
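The requested flow could be sketched as follows. Everything here is a hypothetical stand-in (not the real jupyter_kernel_mgmt API) to show launch kwargs passing from the interface down to the provider, so that parameters like a client-specified namespace reach kernel creation.

```python
# Illustrative sketch: kwargs given to the interface flow through the
# finder to the provider's launch. All class names are stand-ins.

class FakeProvider:
    def launch(self, kernel_type, **kwargs):
        # A real provider would start a kernel; here we just record
        # what it was asked to do, to show the kwargs arrived intact.
        return {'type': kernel_type, 'kwargs': kwargs}

class FinderSketch:
    def launch(self, kernel_type, **kwargs):
        # Forward everything to the provider unchanged.
        return FakeProvider().launch(kernel_type, **kwargs)

class KernelInterfaceSketch:
    def __init__(self, kernel_type, finder, **kwargs):
        # The interface keeps no kwargs of its own; it passes them on.
        self.connection = finder.launch(kernel_type, **kwargs)

k = KernelInterfaceSketch('spec/python3', FinderSketch(),
                          env={'KERNEL_NAMESPACE': 'team-a'})
assert k.connection['kwargs']['env']['KERNEL_NAMESPACE'] == 'team-a'
```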

Member Author


Thanks. I may think more about the MappingKernelManager and KernelInterface APIs, and I'll try to keep this in mind if I do. I'm not entirely happy with them at present.

@takluyver
Member Author

Just thought about this a bit more in the context of a discussion on nbconvert.

I think the crucial question is whether and how we associate notebooks with an environment.

Normal code files are associated with a language by their file extension (.py -> Python) and if they're designed to be a command-line entry point, can be associated with an interpreter (~= environment) by a shebang line like #!/usr/bin/python. They can also be associated with whatever environment you're currently in by using #!/usr/bin/env python, and you can always override the shebang by running with an explicit interpreter: path/to/python foo.py. Shebang lines are often rewritten during installation (e.g. by pip or conda), because it's understood that they are only meaningful on a specific system.

Notebooks are associated with a language by language info in the notebook metadata. This doesn't need to change. Code will only make sense in the right language, and the claim that 'this is Python code' means the same thing everywhere. This is our equivalent of a .py extension.

We originally designed kernelspecs as if there would be one per language (or per major language version, like Python 2 or 3). Their evolution has led to them often being used to distinguish environments for the same language. So the kernelspec name in the notebook metadata effectively associates a notebook with an environment, like a shebang line.

This leads to some problems:

  • Lots of people don't know about the kernelspec name in the notebook metadata, and consequently don't expect a notebook to be tied to an environment. Unlike a shebang, it's added automatically whenever you save the file, and it's not visible in the editor.
  • It's not very obvious how to override the association if you want to run it in a different environment. Different tools have different mechanisms (e.g. nbval's --current-env flag).
  • Kernelspec names are even more brittle than shebangs: a shebang only depends on the filesystem path, whereas kernelspecs depend on the environment and go through an extra layer of indirection.
  • Notebooks are distributed without any install step where a tool could modify the metadata (like rewriting a shebang).

I think we should at least get rid of automatically saving an association with a particular environment. That leaves a couple of questions:

  1. What happens when you open a notebook and there are multiple kernels available for its language? For command line tools like nbconvert, I'm inclined to say the user should be forced to pick. But that seems awkward for interactive use. Should we remember locally an association between paths and kernels (another kind of hidden state)? Should we have some notion of a default kernel for each language - and if so how is that chosen?
  2. Should there be some way of explicitly associating a notebook or a directory or a 'project' (however we define that) with a particular kernel? E.g. if you do jupyter kernel-assoc . conda/foo, all notebooks opened or created under the CWD would have conda/foo as the default kernel? Where would this be stored, and how would we make it visible?

@dhirschfeld
Contributor

I'm inclined to say the user should be forced to pick

I think that's best in all cases. Yes, that does introduce some friction in interactive use but I think a well designed UX can alleviate that problem.

  • If each notebook stores only the language it uses then when opening it for the first time a dialog can pop up listing all the kernels available on the system for that language
  • As mentioned elsewhere, the dialog can have a "remember my choice" option which would store the kernel association locally
  • There would need to be a UI to view/change/delete kernel associations
  • Even if the association isn't stored permanently, the last kernel used could be remembered so that it is pre-selected as the default the next time the same notebook is opened

`KernelInterface` uses `kernel_type` instead of `kernel_name`, which deviates from the previous `KernelManager`. As a result, enabling culling in the "kernel provider era" triggered exceptions once an active kernel was available.
When a kernel is culled, its shutdown needs to go through the MappingKernelManager so that it can clean up its list of kernels. Otherwise, the manager will attempt to cull the same kernel every period thereafter.
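The fix described above can be sketched like this. The class and method names are illustrative stand-ins, not the actual notebook server code; the point is only that the culler must go through the manager's own shutdown path so the kernel's entry is removed from the map.

```python
# Sketch: culling via the mapping manager's shutdown_kernel(), which
# also removes the entry from its kernel map. Shutting the kernel down
# directly would leave a stale entry that gets re-culled every period.
# All names are illustrative.

class MappingManagerSketch:
    def __init__(self):
        # kernel_id -> kernel object (a placeholder here)
        self._kernels = {'kid-1': object()}

    def shutdown_kernel(self, kernel_id):
        kernel = self._kernels.pop(kernel_id)  # clean up the mapping too
        # ... actual shutdown of `kernel` would happen here ...

    def cull(self, kernel_id):
        # Go through the manager, not the kernel directly, so the id
        # isn't seen (and re-culled) on the next culling pass.
        self.shutdown_kernel(kernel_id)

m = MappingManagerSketch()
m.cull('kid-1')
assert 'kid-1' not in m._kernels
```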
takluyver and others added 4 commits June 13, 2019 14:28
Use appropriate attribute in cull debug
Culling should go through MappingKernelManager
@takluyver
Member Author

Superseded by PR #4837.

3 participants