Go wrapper for incomplete runtimes #747

andresmgot · 2018-05-11T13:47:30Z

Issue Ref: None

Description:

This is a proposal for adding a Go wrapper to proxy request to native runtimes. This way the on boarding of new runtimes will be easier since they wouldn't need to handle:

Port customization
Timeout handling
Request logging
Prometheus statistics

This way the Golang wrapper would redirect requests to localhost:8090 in which the native runtime will be listening. The additional delay is insignificant. With this PR the existing Ruby and PHP runtimes support Prometheus statistics and custom ports.

Points of improvement:

The Go and the native daemon are parallel processes. We can fork the native process from the Go routine so we can kill/restart the native daemon in case of errors/timeouts.
The port 8090 is now not eligible. We would need to dynamically choose a free port.

cc/ @deissnerk @murali-reddy

TODOs:

Ready to review
Automated Tests
Docs

murali-reddy · 2018-05-14T07:34:01Z

Its good idea to consolidate all cross-cutting concerns in runtimes into one sinle golang wrapper. We have more clean design and has a way to provide the missing functionality to runtimes.

Just to clarify, you are proposing that there will be a single running native language process just the way it is today, except they dont implement all common functionality?

andresmgot · 2018-05-14T07:52:49Z

Just to clarify, you are proposing that there will be a single running native language process just the way it is today, except they dont implement all common functionality?

There will be at least two processes: one for the Golang wrapper and another one for the native language. Ideally the native process should be forked from the Golang one to be able to handle termination and restarts but that's not implemented yet. To clarify it: We are not using a sidecar container here for the wrapper since we would lose the possibility of managing the native process with Golang code (but we could).

andresmgot · 2018-05-15T15:30:45Z

@murali-reddy any further comments here?

deissnerk · 2018-05-15T16:17:53Z

@andresmgot I haven't had the time to look into the details, but I like the general direction. I could also imagine to have a sidecar instead. Could the sidecar be able to terminate the function by responding (or not responding) to the health check?
Would it be more efficient to use some form of IPC between the two processes?

andresmgot · 2018-05-16T07:51:03Z

I could also imagine to have a sidecar instead. Could the sidecar be able to terminate the function by responding (or not responding) to the health check?

Yes, that's a valid option which architecture would be more native but I don't see a way for the sidecar to kill the container with the native process (unless, as you say, the container become irresponsive and doesn't answer the health check).

Would it be more efficient to use some form of IPC between the two processes?

We could use a unix socket to be a bit more efficient but we can tackle that in a different PR. Also the idea here was to keep having the native web servers as simple as possible, I don't know how difficult can be to change them to use a socket (although I don't think it would be much more complicated).

deissnerk · 2018-05-16T08:25:11Z

@andresmgot

Yes, that's a valid option which architecture would be more native but I don't see a way for the sidecar to kill the container with the native process (unless, as you say, the container become irresponsive and doesn't answer the health check).

I'm not sure about this, but would it be possible to configure the health check for the function container to be answered by the sidecar? As the health check goes to the pod network interface, it does not mean that the container for which the health check is configured, also answers the to the health check. What do you think?

We could use a unix socket to be a bit more efficient but we can tackle that in a different PR. Also the idea here was to keep having the native web servers as simple as possible, I don't know how difficult can be to change them to use a socket (although I don't think it would be much more complicated).

I agree that this is an optimization that can come later. It would be interesting to find out, how big the performance gain would be or if there would be any.

murali-reddy · 2018-05-16T08:38:24Z

@murali-reddy any further comments here?

No. I like the consolidation. Let start simple as you propose, single container with http interface as IPC.

andresmgot · 2018-05-16T08:55:11Z

As the health check goes to the pod network interface, it does not mean that the container for which the health check is configured, also answers the to the health check. What do you think?

AFAIK the container with the health check should answer it. In any case, it would be weird to configure a container a health check pointing to a different container, isn't it?

No. I like the consolidation. Let start simple as you propose, single container with http interface as IPC.

Okay, I will modify it anyway to fork the native process from the Go wrapper so at least termination signals can be properly forwarded.

deissnerk · 2018-05-16T11:31:54Z

AFAIK the container with the health check should answer it. In any case, it would be weird to configure a container a health check pointing to a different container, isn't it?

Well, under normal circumstances it may be weird, but I like the idea of isolating the wrapper from the function and as it is forwarding requests to the function, it also knows about its health. It could internally do something like a health check with the function and would in addition be able to have it killed, if it starts misbehaving.

andresmgot · 2018-05-16T15:38:33Z

It could internally do something like a health check with the function and would in addition be able to have it killed, if it starts misbehaving.

If I am not wrong it is not possible to generally kill a container from a sidecar (unless something like shared process is enabled in the cluster (but it is not the default). Note that the goal is not to terminate a container that doesn't answer the health check since the kubelet will already do so. The goal is to terminate a process if a request is taking more than N seconds. In any case we will be be returning a timeout response and the container will be eventually killed if it becomes unresponsive.

I will merge this PR with the current approach and open a different issue to split the wrapper into a different container.

alexellis · 2018-05-21T13:04:57Z

@andresmgot we've had something similar to this in OpenFaaS from the beginning called the watchdog. I wonder if there's a way we could collaborate between our projects on this?

https://docs.openfaas.com/architecture/watchdog/

The updated of-watchdog marshals over HTTP to provide the same benefits - timeouts, metrics, healthchecks etc. cc @sebgoa

alexellis · 2018-05-21T13:07:01Z

FYI Riff also use a sidecar with HTTP to talk to a function with the option of pipes, HTTP and gRPC. https://github.com/projectriff/riff/tree/master/function-sidecar

I'm not sure whether there is a measurable difference in using gRPC vs HTTP 1.1 with Keep-Alive over loopback. (cc @markfisher )

andresmgot · 2018-05-21T14:00:50Z

Oh, I didn't know that now the watchdog supports an http mode. Indeed it is the same idea. Note that we want to have the possibility to remove the small overhead of adding this go wrapper for runtimes that already supports timeouts, metrics.. etc so we are just adding it for languages like PHP, Ruby or any new ones.

The main problem of using gRPC instead of HTTP is:

It is more complex from a development point of view so it would be more difficult for users to provide new runtimes.
Some languages may not have support for gRPC.
In any case we can still provide an hybrid solution that listens on both HTTP and gRPC endpoints.

Go wrapper for incomplete runtimes

f57b2d1

Andres added 3 commits May 16, 2018 17:54

Merge master

08b96f3

Fork native process from goroutine

cf8cbd2

Add docs

8f8d5c7

andresmgot changed the title ~~[Proposal] Go wrapper for incomplete runtimes~~ Go wrapper for incomplete runtimes May 16, 2018

andresmgot merged commit 472f9f2 into vmware-archive:master May 17, 2018

andresmgot mentioned this pull request May 17, 2018

Move Go proxy to a sidecar container #763

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Go wrapper for incomplete runtimes #747

Go wrapper for incomplete runtimes #747

andresmgot commented May 11, 2018 •

edited

Loading

murali-reddy commented May 14, 2018

andresmgot commented May 14, 2018 •

edited

Loading

andresmgot commented May 15, 2018

deissnerk commented May 15, 2018

andresmgot commented May 16, 2018

deissnerk commented May 16, 2018

murali-reddy commented May 16, 2018

andresmgot commented May 16, 2018

deissnerk commented May 16, 2018

andresmgot commented May 16, 2018

alexellis commented May 21, 2018

alexellis commented May 21, 2018

andresmgot commented May 21, 2018

Go wrapper for incomplete runtimes #747

Go wrapper for incomplete runtimes #747

Conversation

andresmgot commented May 11, 2018 • edited Loading

murali-reddy commented May 14, 2018

andresmgot commented May 14, 2018 • edited Loading

andresmgot commented May 15, 2018

deissnerk commented May 15, 2018

andresmgot commented May 16, 2018

deissnerk commented May 16, 2018

murali-reddy commented May 16, 2018

andresmgot commented May 16, 2018

deissnerk commented May 16, 2018

andresmgot commented May 16, 2018

alexellis commented May 21, 2018

alexellis commented May 21, 2018

andresmgot commented May 21, 2018

andresmgot commented May 11, 2018 •

edited

Loading

andresmgot commented May 14, 2018 •

edited

Loading