Skip to content
This repository has been archived by the owner on Dec 15, 2021. It is now read-only.

Go wrapper for incomplete runtimes #747

Merged
merged 4 commits into from
May 17, 2018

Conversation

andresmgot
Copy link
Contributor

@andresmgot andresmgot commented May 11, 2018

Issue Ref: None

Description:

This is a proposal for adding a Go wrapper to proxy request to native runtimes. This way the on boarding of new runtimes will be easier since they wouldn't need to handle:

  • Port customization
  • Timeout handling
  • Request logging
  • Prometheus statistics

This way the Golang wrapper would redirect requests to localhost:8090 in which the native runtime will be listening. The additional delay is insignificant. With this PR the existing Ruby and PHP runtimes support Prometheus statistics and custom ports.

Points of improvement:

  • The Go and the native daemon are parallel processes. We can fork the native process from the Go routine so we can kill/restart the native daemon in case of errors/timeouts.
  • The port 8090 is now not eligible. We would need to dynamically choose a free port.

cc/ @deissnerk @murali-reddy

TODOs:

  • Ready to review
  • Automated Tests
  • Docs

@murali-reddy
Copy link
Contributor

Its good idea to consolidate all cross-cutting concerns in runtimes into one sinle golang wrapper. We have more clean design and has a way to provide the missing functionality to runtimes.

Just to clarify, you are proposing that there will be a single running native language process just the way it is today, except they dont implement all common functionality?

@andresmgot
Copy link
Contributor Author

andresmgot commented May 14, 2018

Just to clarify, you are proposing that there will be a single running native language process just the way it is today, except they dont implement all common functionality?

There will be at least two processes: one for the Golang wrapper and another one for the native language. Ideally the native process should be forked from the Golang one to be able to handle termination and restarts but that's not implemented yet. To clarify it: We are not using a sidecar container here for the wrapper since we would lose the possibility of managing the native process with Golang code (but we could).

@andresmgot
Copy link
Contributor Author

@murali-reddy any further comments here?

@deissnerk
Copy link
Contributor

@andresmgot I haven't had the time to look into the details, but I like the general direction. I could also imagine to have a sidecar instead. Could the sidecar be able to terminate the function by responding (or not responding) to the health check?
Would it be more efficient to use some form of IPC between the two processes?

@andresmgot
Copy link
Contributor Author

I could also imagine to have a sidecar instead. Could the sidecar be able to terminate the function by responding (or not responding) to the health check?

Yes, that's a valid option which architecture would be more native but I don't see a way for the sidecar to kill the container with the native process (unless, as you say, the container become irresponsive and doesn't answer the health check).

Would it be more efficient to use some form of IPC between the two processes?

We could use a unix socket to be a bit more efficient but we can tackle that in a different PR. Also the idea here was to keep having the native web servers as simple as possible, I don't know how difficult can be to change them to use a socket (although I don't think it would be much more complicated).

@deissnerk
Copy link
Contributor

@andresmgot

Yes, that's a valid option which architecture would be more native but I don't see a way for the sidecar to kill the container with the native process (unless, as you say, the container become irresponsive and doesn't answer the health check).

I'm not sure about this, but would it be possible to configure the health check for the function container to be answered by the sidecar? As the health check goes to the pod network interface, it does not mean that the container for which the health check is configured, also answers the to the health check. What do you think?

We could use a unix socket to be a bit more efficient but we can tackle that in a different PR. Also the idea here was to keep having the native web servers as simple as possible, I don't know how difficult can be to change them to use a socket (although I don't think it would be much more complicated).

I agree that this is an optimization that can come later. It would be interesting to find out, how big the performance gain would be or if there would be any.

@murali-reddy
Copy link
Contributor

@murali-reddy any further comments here?

No. I like the consolidation. Let start simple as you propose, single container with http interface as IPC.

@andresmgot
Copy link
Contributor Author

As the health check goes to the pod network interface, it does not mean that the container for which the health check is configured, also answers the to the health check. What do you think?

AFAIK the container with the health check should answer it. In any case, it would be weird to configure a container a health check pointing to a different container, isn't it?

No. I like the consolidation. Let start simple as you propose, single container with http interface as IPC.

Okay, I will modify it anyway to fork the native process from the Go wrapper so at least termination signals can be properly forwarded.

@deissnerk
Copy link
Contributor

AFAIK the container with the health check should answer it. In any case, it would be weird to configure a container a health check pointing to a different container, isn't it?

Well, under normal circumstances it may be weird, but I like the idea of isolating the wrapper from the function and as it is forwarding requests to the function, it also knows about its health. It could internally do something like a health check with the function and would in addition be able to have it killed, if it starts misbehaving.

@andresmgot
Copy link
Contributor Author

It could internally do something like a health check with the function and would in addition be able to have it killed, if it starts misbehaving.

If I am not wrong it is not possible to generally kill a container from a sidecar (unless something like shared process is enabled in the cluster (but it is not the default). Note that the goal is not to terminate a container that doesn't answer the health check since the kubelet will already do so. The goal is to terminate a process if a request is taking more than N seconds. In any case we will be be returning a timeout response and the container will be eventually killed if it becomes unresponsive.

I will merge this PR with the current approach and open a different issue to split the wrapper into a different container.

@andresmgot andresmgot changed the title [Proposal] Go wrapper for incomplete runtimes Go wrapper for incomplete runtimes May 16, 2018
@andresmgot andresmgot merged commit 472f9f2 into vmware-archive:master May 17, 2018
@alexellis
Copy link

@andresmgot we've had something similar to this in OpenFaaS from the beginning called the watchdog. I wonder if there's a way we could collaborate between our projects on this?

https://docs.openfaas.com/architecture/watchdog/

The updated of-watchdog marshals over HTTP to provide the same benefits - timeouts, metrics, healthchecks etc. cc @sebgoa

@alexellis
Copy link

FYI Riff also use a sidecar with HTTP to talk to a function with the option of pipes, HTTP and gRPC. https://github.com/projectriff/riff/tree/master/function-sidecar

I'm not sure whether there is a measurable difference in using gRPC vs HTTP 1.1 with Keep-Alive over loopback. (cc @markfisher )

@andresmgot
Copy link
Contributor Author

Oh, I didn't know that now the watchdog supports an http mode. Indeed it is the same idea. Note that we want to have the possibility to remove the small overhead of adding this go wrapper for runtimes that already supports timeouts, metrics.. etc so we are just adding it for languages like PHP, Ruby or any new ones.

The main problem of using gRPC instead of HTTP is:

  • It is more complex from a development point of view so it would be more difficult for users to provide new runtimes.
  • Some languages may not have support for gRPC.
    In any case we can still provide an hybrid solution that listens on both HTTP and gRPC endpoints.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants