Improve Communication Protocol #14

Open
saulshanabrook opened this issue Oct 17, 2019 · 5 comments
Labels
ibis-vega The project this belongs to type:enhancement Implement an improvement over a functionality

Comments

@saulshanabrook
Contributor

Currently, this extension uses Jupyter comms, which go over websocket, to send queries and data as the visualization is executing.

This "works," but it would be useful to make it easier to profile, debug, and switch out of Jupyter.

Profiling

For example, many queries are very slow. We need to be able to diagnose why that is. Is it data parsing and serialization? SQL response time? Ibis query creation?

To answer these properly, we should implement some form of "distributed tracing." Here is a New Relic UI that visualizes a trace of a request:

There is a W3C group, the "Distributed Tracing Working Group", which is working on a "Trace Context" spec.

Jaeger is a Cloud Native Computing Foundation project that exists today that says it will support this spec in the future:

My proposal is to try deploying Jaeger next to JupyterLab as a server extension and creating a frontend extension to display its UI in JupyterLab. That way, we can visualize the queries we are executing as we interact with the graphs and inspect their performance. We will also have to instrument our Python library and pass certain tokens along with each request and response so that the two can be tied together.

Protocol Agnostic

The second issue here is that this mime renderer is currently tightly coupled to running in JupyterLab. We have to add a special case for running it in Phoila so that it can access comms: vidartf/phoila#7

Over the past week, the idea of creating a mime renderer that needs to speak to a kernel but which you deploy outside of Jupyter has come up multiple times (cc @dharhas). To do this, we need to layer some standards on top of our current approach. A mime renderer on the client side should be a JS library that takes in a handle to a bidirectional async channel to the kernel. On the server, it should output some mimetype and also have a handle on this bidirectional communication channel.

The idea here is that you prototype in a Jupyter notebook, but then you can extract that cell from the notebook and run it on a non-Jupyter server, possibly embedded in some larger web app, and we should be able to run a non-Jupyter Python process on the backend and hook them up to each other to have bidirectional communication.

Honestly, I am not sure what we should use here. The requirements would ideally be:

  • Able to run over an arbitrary transport protocol if you build your own backend. For example, we probably want to be able to run it through Jupyter's comms, over a REST API, or over a native websocket connection.
  • Able to transmit raw bytes for efficiency when we need it, but otherwise agnostic to the payload.
  • (optional) Able to integrate with our tracing framework, so we can get some tracing out of the box.
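To make the first two requirements concrete, here is a minimal sketch of a transport-agnostic channel on the Python side. Everything in it (`Message`, `Channel`, `QueueChannel`, `channel_pair`) is a hypothetical name invented for illustration; the in-memory queue transport only stands in for comms, REST, or a websocket:

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Message:
    """JSON-serializable header plus optional raw binary buffers."""
    header: dict
    buffers: list = field(default_factory=list)


class Channel(ABC):
    """Bidirectional async channel; each transport subclasses this."""

    @abstractmethod
    async def send(self, message: Message) -> None: ...

    @abstractmethod
    async def recv(self) -> Message: ...


class QueueChannel(Channel):
    """In-memory transport, standing in for comms, REST, or a websocket."""

    def __init__(self, outbox, inbox):
        self._outbox = outbox
        self._inbox = inbox

    async def send(self, message):
        await self._outbox.put(message)

    async def recv(self):
        return await self._inbox.get()


def channel_pair():
    """Two connected endpoints: what one sends, the other receives."""
    a_to_b, b_to_a = asyncio.Queue(), asyncio.Queue()
    return QueueChannel(a_to_b, b_to_a), QueueChannel(b_to_a, a_to_b)


async def demo():
    client, server = channel_pair()
    await client.send(Message({"query": "SELECT 1"}))
    request = await server.recv()
    # The reply carries raw bytes alongside the header, untouched by the channel.
    await server.send(Message({"reply_to": request.header["query"]}, [b"\x00\x01"]))
    return await client.recv()
```

Renderer code would only ever see a `Channel`, so swapping Jupyter comms for another backend would mean writing one more subclass rather than touching the renderer.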

gRPC seems like a good contender here, since it is also a Cloud Native Computing Foundation project and has good adoption. Its web story is just emerging, but there is at least one client with websocket support.

I don't know how Panel fits in here. I imagine they have implemented something here.

@saulshanabrook
Contributor Author

It looks like jaeger doesn't yet have a client side API: jaegertracing/jaeger-client-node#109 jaegertracing/jaeger#723 It's being developed here: https://github.com/jaegertracing/jaeger-client-javascript

As a workaround, I suppose we can set up a proxy server that runs alongside it, which we can hit from the frontend to forward data to the open tracing server.

@saulshanabrook
Contributor Author

I am experimenting with Jaeger, and I notice that it currently isn't set up to show spans that haven't finished (jaegertracing/jaeger#729). That is too bad, because it would be nice to see a debugging view for a chart as you are interacting with it.

But maybe I should just make each interaction into separate spans. So there will be one initial span for setting it up, then another for each UI update.
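A small sketch of that idea, without any tracing library: each interaction becomes its own short span that is recorded the moment it finishes. `ChartTrace` and the `chart_id` field used to group spans are assumptions for illustration and not Jaeger's API:

```python
import secrets
import time
from contextlib import contextmanager


class ChartTrace:
    """Collects finished spans for one chart (hypothetical helper)."""

    def __init__(self):
        self.chart_id = secrets.token_hex(8)
        self.spans = []

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Recorded as soon as the interaction finishes, so a viewer
            # never has to wait on one long-lived span for the whole chart.
            self.spans.append({
                "chart_id": self.chart_id,
                "name": name,
                "duration_s": time.perf_counter() - start,
            })


trace = ChartTrace()
with trace.span("initial-render"):
    pass  # set up the chart and run the first query here
for i in range(2):
    with trace.span(f"ui-update-{i}"):
        pass  # re-run the query for one UI interaction here
```

In a real setup the `finally` block would hand the finished span to whatever tracing client is in use instead of appending to a list.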

@vidartf

vidartf commented Oct 21, 2019

we should be able to run a non-Jupyter Python process on the backend and hook them up to each other to have bidirectional communication.

This sounds like you are going to reinvent the jupyter kernels + messaging protocol 😅 Why would it need to be non-jupyter?

@saulshanabrook
Contributor Author

saulshanabrook commented Oct 21, 2019

@vidartf b/c jupyter is heavyweight! You might wanna back this by a simple flask server over REST or a websockets server. Not connected to a kernel, just backed by a regular python process.

It's possible we wanna re-use the jupyter comms spec and just create other backends for it, to allow it to be run without running the jupyter server. Or it's possible we wanna back it with another spec like gRPC and have jupyter comms be a backend for that.

@vidartf

vidartf commented Oct 22, 2019

Not connected to a kernel, just backed by a regular python process.

I'm pretty sure this is how ipython started though 😉 More seriously, it would be interesting to hear which features would explicitly be included/excluded compared to the full jupyter_client + ipykernel + kernel manager/handlers case.

@goanpeca goanpeca added ibis-vega The project this belongs to type:enhancement Implement an improvement over a functionality status:backlog Work to be done labels Jan 22, 2020
@rpekrul rpekrul added omniscidb and removed omniscidb status:backlog Work to be done labels Sep 18, 2020
4 participants