gRPC Load Balancing between http-adapter and things #387

nwest1 · 2018-09-05T14:33:36Z

BUG REPORT

What were you trying to achieve?
Targeted TPS load testing against http-adapter and a failure scenario
What are the expected results?
balanced, self-healing connections between mainflux components
What are the received results?
gRPC not being load balanced, and when a pod is killed, these connections are not rebalanced
What are the steps to reproduce the issue?

Run a load test in Kubernetes with multiple replicas (http-adapter is using the things service name as its URL)
Note that the majority of transactions run through a single things pod and are not load balanced
Kill the things pod that's receiving the majority of transactions
Note that the error rate spikes, and connections from the adapter to things remain in error and are not rebalanced.

In what environment did you encounter the issue?
kubernetes
Additional information you deem important:
This might be slightly out of scope, as it's possible to solve this with Istio/Linkerd or some k8s ingress that is aware of gRPC. But I want to bring it up, as it looks like you're exploring similar replacements for nginx.


drasko · 2018-09-05T18:44:12Z

@nwest1 thanks for filing this one.

drasko · 2018-09-11T23:41:06Z

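@nwest1 we've analyzed this one, and there is indeed a missing LB part for gRPC. The problem is that although k8s adds an LB in front of the services, that LB works at the connection level (L4). gRPC runs over HTTP/2 and multiplexes all calls over a single long-lived connection, so it behaves more like a stream and its individual requests are never rebalanced. For gRPC we need request-level (L7) LB, and the most commonly used approach is Istio with Envoy.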
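
To make the effect concrete, here is a minimal, stdlib-only simulation (not Mainflux code; `simulate` and its parameters are hypothetical names chosen for illustration). It compares balancing done once per connection, which is what a single long-lived HTTP/2 connection experiences, with balancing done per request.

```go
package main

import "fmt"

// simulate returns how many requests each pod served.
// perRequest=false: each client is pinned to one pod when it connects
// (connection-level balancing), and all its calls reuse that connection.
// perRequest=true: every call is routed independently, round-robin
// (request-level balancing).
func simulate(pods, clients, requests int, perRequest bool) []int {
	served := make([]int, pods)

	// Connection-level: the pod is chosen once, at connect time.
	pinned := make([]int, clients)
	for c := range pinned {
		pinned[c] = c % pods
	}

	for r := 0; r < requests; r++ {
		if perRequest {
			served[r%pods]++
		} else {
			served[pinned[r%clients]]++
		}
	}
	return served
}

func main() {
	// One adapter replica talking to three things pods.
	fmt.Println("per connection:", simulate(3, 1, 9000, false))
	fmt.Println("per request:   ", simulate(3, 1, 9000, true))
}
```

With a single client, the per-connection run sends every request to one pod, which matches the skew described in the report.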

This is explained well in this video: https://www.youtube.com/watch?v=F2znfxn_5Hg

Additional info:

We already wanted to go this route; please see issue https://github.com/mainflux/mainflux/issues/352.

Although it is possible to avoid Envoy and load-balance gRPC on the client side, we feel that the Envoy approach will be better.
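
For reference, the client-side alternative boils down to the client holding a list of backend addresses, picking one per call, and dropping backends that fail. The sketch below is a stdlib-only illustration of that idea, not the grpc-go API (grpc-go ships its own built-in `round_robin` policy). The `Balancer` type and the `things-N:8183` addresses are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Balancer is a hypothetical client-side round-robin picker.
type Balancer struct {
	mu       sync.Mutex
	backends []string
	next     int
}

// NewBalancer copies the given addresses into a new Balancer.
func NewBalancer(addrs ...string) *Balancer {
	return &Balancer{backends: append([]string(nil), addrs...)}
}

// Pick returns the next backend in round-robin order.
func (b *Balancer) Pick() (string, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.backends) == 0 {
		return "", errors.New("no healthy backends")
	}
	addr := b.backends[b.next%len(b.backends)]
	b.next++
	return addr, nil
}

// Remove drops a backend, e.g. after a call to it has failed,
// so later calls are spread over the remaining ones.
func (b *Balancer) Remove(addr string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for i, a := range b.backends {
		if a == addr {
			b.backends = append(b.backends[:i], b.backends[i+1:]...)
			return
		}
	}
}

func main() {
	// Hypothetical pod addresses, for illustration only.
	b := NewBalancer("things-0:8183", "things-1:8183", "things-2:8183")
	for i := 0; i < 3; i++ {
		addr, _ := b.Pick()
		fmt.Println("call ->", addr)
	}

	// Simulate the "kill the busiest pod" step from the report.
	b.Remove("things-1:8183")
	for i := 0; i < 4; i++ {
		addr, _ := b.Pick()
		fmt.Println("call ->", addr)
	}
}
```

The trade-off is that every client has to carry this logic and keep its backend list fresh, which is part of why a sidecar proxy can be the simpler operational choice.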

@janko-isidorovic has already started integrating Istio/Envoy into our k8s scripts, and we should have something working before the end of the week.

One more approach that can be taken is using NATS and its Request-Reply mode, with the internal NATS LB in Queue mode plus the new Drain feature for downscaling: https://medium.com/@derekcollison/nats-resilient-systems-and-drain-mode-a764d4968711. This could eventually make the system simpler, but we are not sure whether it would make debugging easier (i.e. figuring out where the messages come from in case of an error), and we are not sure about the performance of NATS with all these modes enabled. Let's try the Istio/Envoy solution first.
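
As a rough analogy for the queue-mode idea (this uses plain Go channels, not the NATS client API, and `runQueueGroup` is a hypothetical name), several workers read from one shared queue so that each request is handled by exactly one of them. Closing the queue stands in for draining: workers finish what is already queued and then exit.

```go
package main

import (
	"fmt"
	"sync"
)

// runQueueGroup starts `workers` consumers on one shared channel and
// returns how many requests each of them handled. It returns once the
// channel is closed and every queued request has been processed.
func runQueueGroup(workers int, requests <-chan int) map[int]int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		handled = make(map[int]int) // worker id -> requests handled
	)
	for id := 0; id < workers; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Each value on the channel is received by exactly one worker.
			for range requests {
				mu.Lock()
				handled[id]++
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	return handled
}

func main() {
	requests := make(chan int)
	go func() {
		for i := 0; i < 1000; i++ {
			requests <- i
		}
		close(requests) // stop accepting work; in-flight requests still finish
	}()

	total := 0
	for id, n := range runQueueGroup(3, requests) {
		fmt.Printf("worker %d handled %d requests\n", id, n)
		total += n
	}
	fmt.Println("total:", total)
}
```

The per-worker split varies from run to run, but no request is lost or handled twice.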

nmarcetic · 2018-10-01T10:41:09Z

#378 Closes this issue.

drasko · 2018-10-01T14:24:11Z

@nmarcetic it is Envoy/Istio that will close the issue (https://github.com/mainflux/mainflux/issues/352).

nmarcetic · 2018-10-01T15:39:07Z

@drasko Yes, and it's related to #378

janko-isidorovic · 2018-10-01T18:24:52Z

@nwest1 we have been able to reproduce the issue in Mainflux Lab. Adding Istio to Kubernetes and an Istio sidecar to the http-adapter pod resolves the issue and enables load balancing of the gRPC connections.
We need to decide whether the solution is in the scope of Mainflux, as it is more related to deployment strategy.

drasko · 2018-10-01T18:33:39Z

I agree that this depends only on the deployment strategy (k8s in this case) and is not a Mainflux issue at all. Let's see what the best way to approach this is. We should stay generic and not impose a specific deployment strategy, and we need to understand how this falls within the Mainflux project scope.

anovakovic01 · 2018-11-27T12:04:51Z

Resolved with #378.

drasko assigned drasko, chombium, nmarcetic, manuio, blokovi, ghost, anovakovic01 and dborovcanin Sep 5, 2018

drasko added the bug label Sep 5, 2018

drasko added this to the 0.6.0 milestone Sep 5, 2018

anovakovic01 changed the title ~~gRPC Load Balancing between http-adapter and users~~ gRPC Load Balancing between http-adapter and things Sep 6, 2018

drasko mentioned this issue Sep 11, 2018

Switch to ingress nginx on k8s setup #378

Closed

nmarcetic modified the milestones: 0.6.0, 0.7.0 Oct 1, 2018

ghost removed their assignment Nov 26, 2018

anovakovic01 mentioned this issue Nov 26, 2018

MF-378 - Add nginx ingress config to k8s services #472

Merged

anovakovic01 closed this as completed Nov 27, 2018
