
Proposal for kernel provisioning and gateway investigations #3

Merged 1 commit into jupyter-incubator:master on Sep 21, 2015

Conversation

parente
Member

parente commented Sep 15, 2015

Kernel gateway incubation proposal.

@parente
Member Author

parente commented Sep 15, 2015

/cc @blink1073, @sccolbert since you both indicated some interest on the original hackpad

@sccolbert

Cheers for the ping!

@blink1073

Thanks for the ping @parente, do you envision any specific changes that need to be made to jupyter-js-services?

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

@blink1073 Probably not. It seems like it has good separation for kernels to work independently of the rest of the notebook services.

@parente
Member Author

parente commented Sep 15, 2015

Kernel specs have a different meaning when you're potentially launching kernels within containers with other stuff installed in them (e.g., it's not just "python3", but "jupyter/python3-kernel" with matplotlib, scipy, ...). If a client is going to use the kernelspecs API to discover what it can launch, the response format for that endpoint might need to change. (Or we shouldn't repurpose that endpoint for kernel containers.)

I can also imagine wanting to pass additional information to the provisioner like "allocate these CPU, disk, RAM, ... resources to the kernel you launch".

These are the two that came to mind. Part of the investigation here is to find if there's more or if these are even valid concerns.
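As a purely hypothetical strawman (every name and field below is invented for discussion, not a proposed format), the extra provisioning details might ride along in a kernelspec-like entry rather than in the bare kernel name:

```python
# Hypothetical container-aware kernelspec entry; all fields invented.
container_kernelspec = {
    "name": "python3-scipy",
    "display_name": "Python 3 (full scipy stack)",
    "language": "python",
    "metadata": {
        # A container image, not just a language runtime
        "image": "jupyter/python3-with-full-scipy-stack",
        # Resource hints for the provisioner
        "resources": {"cpus": 2, "mem_mb": 4096, "disk_mb": 10240},
    },
}
```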

@blink1073

It seems like all of that could be handled through another endpoint, which creates a set of kernelspecs conforming to the current API.

@jasongrout

Another application of this sort of thing for even single-computer users is to create kernels executing in various conda or virtual environments. For example, such a provisioning service might enable the user to pick from kernels inside available conda environments, and automatically update the list as new environments are created, etc. This is sort of like your "kernel command+environment/dependencies" example above.

@parente
Member Author

parente commented Sep 15, 2015

If we're talking notebooks as clients, portability of kernel spec references becomes a bit of a concern too. If my notebook runs against the "jupyter/python3-with-full-scipy-stack" container on some provider, but it's only captured as "python3" in the metadata, it's not enough for reproducibility. Flipping it around, if the kernel name is captured as "jupyter/python3-with-full-scipy-stack" then it may be difficult for anyone else to re-run my notebook (i.e., how do I get that env?)

But these are problems with notebooks today too. The kernelspec captures the language, but nothing official captures all the other dependencies that make the notebook work.

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

The kernelspec captures the language, but nothing official captures all the other dependencies that make the notebook work.

That's a bingo!

@jasongrout

We're quickly evolving to a full hashdist-like dependency list!

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

hashdist + computational resource in this case

/cc @ahmadia

@jasongrout

I don't think we want to write an entire packaging tool here. But perhaps a kernel could be: "hashdist/49ab4bdeff3c" + computational resources. Let something like hashdist or conda do its job to reproduce an environment (possibly with arbitrary metadata in the kernel spec, where they could store distribution-specific metadata).

@jasongrout

In fact, I think I saw somewhere a command that will inject a list of conda dependencies into notebook metadata.

@ahmadia

ahmadia commented Sep 15, 2015

Yes, there are unofficial tools for both hashdist and conda to inject dependencies into the metadata of the Notebook. Right now they are loosely connected with the kernels. One thing that isn't clear to me is how we can expose the available conda/hashdist/etc. environments as kernels to IPython as part of our installation process.

@minrk

minrk commented Sep 16, 2015

And I've written a script that builds a hashdist profile with a kernel and registers it as a kernelspec. I also have a tool for registering kernels from conda/virtualenvs. Both of these are IPython-specific, and not generalized to other kernels.

I do think that this sort of thing belongs at a level below the kernelspec. That is, as far as Jupyter is concerned it's just a kernelspec like any other, and it's another tool that's responsible for taking some spec and building a kernel for it. Ideally, to me, this is all using existing specs - Dockerfiles, hashdist, conda envs, requirements.txt, etc. and we don't make ourselves responsible for defining yet another environment spec.
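For the conda/virtualenv case, registration mostly amounts to writing a kernel.json that points at the environment's interpreter. A minimal sketch, with the env path and spec name as assumptions:

```python
# Sketch: register a conda env's python as a kernelspec (paths assumed).
import json
import os

from jupyter_core.paths import jupyter_data_dir

env_prefix = "/opt/conda/envs/scipy-env"  # assumed conda environment
spec = {
    "argv": [os.path.join(env_prefix, "bin", "python"),
             "-m", "ipykernel", "-f", "{connection_file}"],
    "display_name": "Python (scipy-env)",
    "language": "python",
}
spec_dir = os.path.join(jupyter_data_dir(), "kernels", "scipy-env")
os.makedirs(spec_dir, exist_ok=True)
with open(os.path.join(spec_dir, "kernel.json"), "w") as f:
    json.dump(spec, f, indent=2)  # Jupyter now sees this env as a kernel
```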

@rgbkrk @freeman-lab how much of this is overlapping with binder? Are there things we can re-use?


1. Using jupyter_client, jupyter_core, and pieces of jupyter/notebook (e.g., MappingKernelManager, etc.) to construct a headless kernel gateway that can talk to a cluster manager (e.g., Mesos).
2. Implementing a websocket to 0mq bridge that can be placed in any Docker container that already runs a kernel, to allow web-friendly access to that kernel.
3. Adding a new jupyter_client.WebsocketKernelManager that can be plugged into Jupyter Notebook or consumed by other tools to talk to kernels fronted by a websocket to 0mq bridge. (See use case #3 below).

I think I disagree that the server talking upstream via websockets is the right approach. I think it should be either:

  1. the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or
  2. the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.

Member Author


the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or

If by kernel service you mean a kernel in a container with a websocket-to-zmq bridge (w2z?), then, yes, that's one of the planned experiments. This approach separates out kernel provisioning from kernel communication after provision, which is attractive. However, it punts the problem of managing comm with a number of running kernels out of scope unless there's a third component like the configurable-http-proxy for tmpnb, one that the provisioner informs about running kernels. That or the admin of the kernel service must bring his/her own proxying scheme.

All of the above is fine, but I think having an all-in-one gateway service that does the provisioning and the w2z bridging in one component might provide an easier walk-up-and-try-it prototype in the short term. Granted, it is more monolithic and certainly has its own scaling problems, but I see it as valuable for proving the concept and enabling folks to start thinking about "how could I use this?"

the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.

We've certainly talked to within-cluster remote kernels using zmq before. But, from experience, when we've started toying with clients being very remote from kernels (e.g., client on my laptop, kernel in an IaaS), kernels being offered as services by cloud providers, and applications that use kernels written by new audiences (e.g., web developers), we see Websockets having a number of advantages:

  • Proxying websockets is simpler than zmq with robust tools like nginx, haproxy, etc.
  • Multiplexing/demultiplexing websockets over a single proxy port is easier than zmq (e.g., to reduce port footprint for security)
  • End-to-end encryption for Websockets across potentially multiple proxy hops from client to kernel has solutions familiar to DevOps folks
  • Websockets are more familiar than zmq to the web developer audience we're looking to enable to build new kinds of applications that use kernels
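To make the w2z idea concrete, here's a minimal sketch of a bridge for a single kernel channel, using tornado and pyzmq. The shell port and JSON framing are assumptions; a real bridge would speak the full Jupyter wire protocol, including message signing:

```python
# Minimal websocket-to-zmq bridge sketch for one kernel shell channel.
import json

import zmq
from tornado import ioloop, web, websocket
from zmq.eventloop.zmqstream import ZMQStream

SHELL_ADDR = "tcp://127.0.0.1:50001"  # assumption: the kernel's shell socket


class ShellBridgeHandler(websocket.WebSocketHandler):
    """Relays frames between one websocket client and the kernel."""

    def open(self):
        # One DEALER socket per websocket client, connected to the kernel.
        sock = zmq.Context.instance().socket(zmq.DEALER)
        sock.connect(SHELL_ADDR)
        self.stream = ZMQStream(sock)
        self.stream.on_recv(self._on_reply)

    def _on_reply(self, parts):
        # Kernel -> client: forward the multipart zmq message as JSON.
        self.write_message(
            json.dumps([p.decode("utf-8", "replace") for p in parts]))

    def on_message(self, message):
        # Client -> kernel: a real bridge would sign messages here.
        self.stream.send_multipart(
            [f.encode("utf-8") for f in json.loads(message)])

    def on_close(self):
        self.stream.close()


if __name__ == "__main__":
    web.Application([(r"/shell", ShellBridgeHandler)]).listen(8765)
    ioloop.IOLoop.current().start()
```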


I would just add that creating clients that communicate directly with kernels (in languages other than Python) through 0MQ is not a trivial exercise. The Jupyter codebase already does a good job of abstracting that through WebSockets. Why not leverage that?

And I agree... a WebSockets interface is going to be more friendly to app developers.


I mainly think the first option should be better - client talks directly to the external kernel service via websockets. What I don't think we should do is make the existing server a websocket client of other web services.

Member Author


I think we might be talking past each other with our definitions of client and server, and for which scenario.

I mainly think the first option should be better - client talks directly to the external kernel service via websockets.

If by client you mean, for example, a JS app using jupyter-js-services, then yes. Or did you have another specific client in mind?

What I don't think we should do is make the existing server a websocket client of other web services.

Do you mean the Jupyter Notebook Python server here, specifically? If so, how would it take advantage of the remote service without talking to it via websockets?


But this requires the kernel provider to have knowledge of notebook flags, correct? (assuming the provider becomes the websocket endpoint for the outside world)

Member


But this requires the kernel provider to have knowledge of notebook flags, correct?

It's what I currently deal with in tmpnb. I'm assuming we'll have a simple flag on the kernel provisioner or other options, not the notebook flag.


Got it. Thanks!

Member Author


What's the game plan for securing the cross-origin websocket connection?

I think answering this question and others is part of the exploration TBD in the incubator. I certainly don't have a solid game plan yet. I think initial reference implementations can deal with the open access case and from there we can start to work on things like security.

That said, I can imagine having the provider support options for authenticating and authorizing requests for kernel provisioning (Does Pete get to request another kernel on my system and has he used up his allotment?) as well as kernel connectivity (Is this Pete connecting to his kernel via a websocket?) through common mechanisms (auth headers, API key, ...). But I can also imagine punting this responsibility to other components, like a front proxy that controls access to the APIs and specific kernel websocket routes based on login. Both seem viable at face value.
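As one sketch of the first option (the header scheme, env var, and endpoint are all assumptions, not a design), the provisioner could gate its API on a shared key:

```python
# Sketch: API-key gate on a hypothetical provisioning endpoint.
import json
import os

from tornado import ioloop, web

API_KEY = os.environ.get("KG_AUTH_TOKEN", "")  # hypothetical env var


class ProvisionHandler(web.RequestHandler):
    def prepare(self):
        # Reject any request that doesn't present the shared key.
        expected = "token " + API_KEY
        if not API_KEY or self.request.headers.get("Authorization") != expected:
            raise web.HTTPError(401)

    def post(self):
        # ... check the caller's allotment, then provision a kernel ...
        self.finish(json.dumps({"status": "provisioning"}))


if __name__ == "__main__":
    web.Application([(r"/api/kernels", ProvisionHandler)]).listen(8888)
    ioloop.IOLoop.current().start()
```

The front-proxy alternative would keep this logic out of the provisioner entirely and enforce access per route instead.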

Member


Oh right, for the authed version we need to do auth like we do in the notebook, though likely API key based.

@freeman-lab

@parente this is an awesome effort! And very compatible with what we’re up to.

Broadly, the work so far on Binder provides at least one way to go from an environment specification to a container that can be deployed. Completely agree with @minrk that we don't want to invent a new spec, and we're trying to support as many existing ones as possible.

The API we’re trying to iron out with @rgbkrk is meant to standardize both:

  • how to turn a set of configuration specs (e.g. requirements.txt, conda environment, external services, etc) into a container as per above
  • how to deploy and inspect deployments of containers, including pre-allocated pools of containers for “hot” launches

Currently the only “application” is the notebook, but we’d love to get to a point where Binders can target others, like light-weight web apps. And that should integrate really nicely with what’s described here, in particular, the websocket to 0mq bridge.

Instead of building all our images off of a monolithic base containing many kernels, we’d love to break them out into one image per kernel + dependencies, and then communicate with many of them through the gateway. In this model, Binder could be used to specify and deploy a pool of kernel containers, but the notebook (in our current setup) would be replaced with the gateway, which could then be the endpoint for a wider variety of client applications. In other words, if this gateway existed, Binder could start using it. Developing this wasn’t on our immediate roadmap, but we could totally help prototype / code!

Hope that helped clarify!

w/ @andrewosh


1. A _client_ (e.g., Thebe) that includes JavaScript code from Jupyter Notebook to request and communicate with kernels

2. A _spawner_ (e.g., tmpnb) that provisions gateway servers to handle client kernel requests
Contributor


Would not jupyterhub also fall into this category of a spawner?

Member Author


I'm not as familiar with it, but since you said it, yes, it probably does. :)

@minrk

minrk commented Sep 18, 2015

While there are still lots of fun technical things to discuss as we move along, I'll formally express my +1 on the proposal.


## Audience

* Jupyter Notebook users who want to run their notebook server remote from their kernel compute cluster
Contributor


This is a really important use case, awesome!

Member


👍

@ellisonbg
Contributor

Overall, I think this is a super important proposal that I heartily support.

Given that the existing notebook server already has:

  1. A 0mq->websocket adapter layer for the kernel's message spec
  2. A REST API for starting kernels and sessions

it would be helpful to describe 1) what the proposal will add to that (multitenancy, auth, etc.) and 2) how the existing stuff will be reused. In particular, if the core of the REST API and websocket stuff is the same, it would be great to have a single code base to maintain for that stuff.
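For instance (a rough sketch only; the import paths and settings keys are assumptions about the notebook package's internals and may differ across versions), the existing kernel handlers might be mounted headlessly like this:

```python
# Sketch: reuse the notebook server's kernel REST/websocket handlers
# in a headless tornado app instead of reimplementing them.
from notebook.services.kernels.handlers import default_handlers
from notebook.services.kernels.kernelmanager import MappingKernelManager
from tornado import ioloop, web

kernel_manager = MappingKernelManager()
app = web.Application(
    default_handlers,
    kernel_manager=kernel_manager,  # handlers look this up in app settings
    base_url="/",
)
app.listen(9999)
ioloop.IOLoop.current().start()
```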

Maybe the right solution is to even transition the single user notebook server over to using this API directly.

It sounds like you are thinking in these terms given the idea to reuse the JS client side of this stuff (jupyter-js-services) - just good to clarify how this stuff will interplay with the existing stuff at the level of code and APIs.

Another question is how this stuff will interact with jupyterhub. Jupyterhub is getting a ton of usage and it would be very helpful in the long run to separate the serving of notebook from the kernels for jupyterhub as well.

@damianavila
Member

A lot of nice discussions here... but let's go to the proposal acceptance: more than 👍

@Carreau
Member

Carreau commented Sep 20, 2015

Can we +1 on principle and refine the exact technical details once it is accepted?

@fperez

fperez commented Sep 20, 2015

While there may be technical details to be worked out, that's the point of a project evolving in incubation...

On the principle of this project, I am actually very excited, so count me enthusiastically in. Thanks!!!

@parente
Member Author

parente commented Sep 20, 2015

It sounds like you are thinking in these terms given the idea to reuse the JS client side of this stuff (jupyter-js-services) - just good to clarify how this stuff will interplay with the existing stuff at the level of code and APIs.

I don't have an answer on how the interplay will work quite yet, but I agree with you that these unknowns are worth calling out in the proposal. They're the reason for doing these investigations in an incubator. Likewise, I agree that listing what additional features we're looking to add above and beyond whatever currently exists is worth noting, even if the exact implementation is not yet known (e.g., API auth, multihost kernel scaling).

I'll update the proposal with these edits soon.

@parente
Member Author

parente commented Sep 20, 2015

@freeman-lab thanks for the clarification up above. Good to hear that some pieces of this proposal sound useful for Binder. I figure we can iron out the details of what holds value for Binder to launch and/or if there's a way to agree on a set of common APIs for launching things once this incubator exists.

@ellisonbg
Contributor

I know it is a technical detail, but I think that it might hurt the incubator's chance of being incorporated into the project if it uses Go:

  1. Much of the logic already exists in our current python code base and could easily be reused to reduce the burden of having to maintain two versions. Our entire deployment architecture isn't going to move away from python anytime soon.
  2. Choosing Go limits who can work on it, as no one in our existing core team knows Go very well (that I know of).

These factors should not in any way prevent the incubation proposal from moving forward or there being experimentation with Go. I do also understand that it would be advantageous to not have to install Python in all of the kernel containers, so it also might help incorporation if it dramatically eased the adoption and deployment of this stuff. This benefit would have to be balanced with the costs.

Just wanted to mention how these implementation details might affect eventual incorporation.

But again - these are details. +1 overall.


@rgbkrk
Member

rgbkrk commented Sep 20, 2015

Choosing Go limits who can work on it as no one in our existing core team knows Go very well (that I know of).

Actually, that would be me now. To be fair to @parente and team, I suggested Go for one big reason: adding a static binary to a pre-configured Docker image and launching it is trivial compared to trying to install a Python dependency into an unknown system. Another approach is to package up a special virtualenv or conda environment into an image with our own pathing.

Going forward, if wrapping the existing Handlers is really all we need, it's not hard to package it up. It's the configuration, security and scaling issues on the outside that are going to be more difficult.

As if it weren't obvious, I'm 👍 to iterating on this within a repository 😄.

@ellisonbg
Contributor

We would like to declare consensus and accept this proposal. Congrats! We will create a repo here shortly and add everyone to it.

ellisonbg added a commit that referenced this pull request Sep 21, 2015
Proposal for kernel provisioning and gateway investigations
ellisonbg merged commit 628f0ac into jupyter-incubator:master Sep 21, 2015