
Instructions for multi master #311

Closed
rmenn opened this issue Feb 16, 2017 · 13 comments
Assignees
Labels
kind/documentation Categorizes issue or PR as related to documentation. kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P1

Comments

@rmenn
Contributor

rmenn commented Feb 16, 2017

Greetings,

I have been trying to ask this on IRC as well as the k8s Slack, but am resorting to a ticket here. I apologize.

I wanted to know if a multi-master setup is possible with bootkube, and if so, how to do it, especially with the experimental etcd flag set.

Just need someone to point me in the right direction.

Thanks

@bzub

bzub commented Feb 16, 2017

For Master k8s components you should only need to give another node the master=true label.

kubectl label node node2.zbrbdl "master=true"

Then you can either create a LoadBalancer with external IP to the kubernetes service (default namespace) or point external DNS to one or all of the API server nodes for kubectl clients to use.
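As a rough sketch, a client's kubeconfig would then point at that single address rather than at any one master; the hostname apiserver.example.com and the file paths below are placeholders, not something bootkube generates for you:

    apiVersion: v1
    kind: Config
    clusters:
    - name: my-cluster
      cluster:
        # point clients at the load balancer / round-robin DNS name,
        # not at an individual master's IP
        server: https://apiserver.example.com:443
        certificate-authority: /etc/kubernetes/ca.crt
    users:
    - name: admin
      user:
        client-certificate: /etc/kubernetes/admin.crt
        client-key: /etc/kubernetes/admin.key
    contexts:
    - name: my-cluster
      context:
        cluster: my-cluster
        user: admin
    current-context: my-cluster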

For self-hosted-etcd you can try the steps in the etcd-operator README. I haven't tried that yet myself.

@aaronlevy
Contributor

As @bzub points out, simply labeling the node as a master will start master components on these nodes. The main change is that you need some way of addressing your multiple api-servers from a single address. kubeconfig only supports a single api-server address, and even though you can specify multiple on the kubelet command line, only the first is really used.

So a loadbalancer which fronts all api-servers (master nodes), or DNS entry which maps to those nodes are usually good options. You would then set your api server address in the kubeconfig to point to the dns or loadbalancer.

There is also somewhat of a limitation in the internal kubernetes service, where multiple api-servers will all overwrite each other as the only endpoint. (To see this, run kubectl get endpoints kubernetes repeatedly - if you have multiple apiservers running, the endpoint will constantly change.)

This isn't the worst thing in the world, but it's not ideal (there is work to resolve this upstream). In the interim, an option is to use your loadbalancer/DNS entry for this endpoint as well, which can be done by setting the apiserver flag --advertise-address=<your loadbalancer> - so the endpoint will always point to the same location.
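For illustration only, the change amounts to one flag in the self-hosted kube-apiserver pod spec; the image tag, the other flags, and the 10.3.0.50 address are placeholders standing in for your load balancer:

    # excerpt from the kube-apiserver container spec (sketch)
    containers:
    - name: kube-apiserver
      image: quay.io/coreos/hyperkube:v1.5.3_coreos.0   # placeholder image/tag
      command:
      - /hyperkube
      - apiserver
      - --secure-port=443
      - --advertise-address=10.3.0.50   # address fronted by your loadbalancer/DNS entry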

@aaronlevy aaronlevy added the kind/support Categorizes issue or PR as a support question. label Feb 17, 2017
@aaronlevy aaronlevy self-assigned this Feb 17, 2017
@bzub

bzub commented Feb 18, 2017

I'm just starting to implement an automated HA failover system for kube-apiserver with keepalived-vip, and @aaronlevy, your comment about the default kubernetes service was very enlightening. I really would have overlooked that issue, limited as it is.

Looking into it further, I found that the correct behavior for the kubernetes API service is enabled by editing the kube-apiserver DaemonSet and passing --apiserver-count=<int> to the apiserver, with the correct number of master nodes. Once that's applied, delete each apiserver pod one at a time so the change takes effect, and add the master label to a non-master node if desired. This way you can keep --advertise-address=$(POD_IP) the way it is.
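A sketch of that procedure, assuming a typical bootkube layout (the DaemonSet name, namespace, label, and the count of 3 are assumptions to adapt to your cluster):

    # open the self-hosted apiserver DaemonSet for editing
    kubectl -n kube-system edit daemonset kube-apiserver

    # in the container args, add the count while keeping the per-pod advertise address:
    #   - --apiserver-count=3
    #   - --advertise-address=$(POD_IP)

    # then roll the change out by deleting the apiserver pods one at a time
    kubectl -n kube-system get pods -l k8s-app=kube-apiserver
    kubectl -n kube-system delete pod <one-apiserver-pod-at-a-time>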

It's unfortunate that this isn't mentioned in the primary Kubernetes High-Availability document.

Also, be warned if you try keepalived-vip's README example: the examples/echoheaders.yaml manifest has an improper ---- separator in the YAML; I had to remove one hyphen so there are only three. I'll file an issue there.

@aaronlevy
Contributor

@bzub be careful about using the --apiserver-count flag -- its behavior is a little less than desirable (see: kubernetes/kubernetes#22609)

Essentially you're putting a fixed number of endpoints into the kubernetes service, and if those endpoints happen to be down, a certain percentage of requests just fail (because the endpoints are not cleaned up).

@aaronlevy aaronlevy added kind/documentation Categorizes issue or PR as related to documentation. priority/P1 labels Mar 6, 2017
This was referenced Mar 10, 2017
@klausenbusk
Contributor

klausenbusk commented Jun 19, 2017

So a loadbalancer which fronts all api-servers (master nodes), or DNS entry which maps to those nodes are usually good options. You would then set your api server address in the kubeconfig to point to the dns or loadbalancer.

I'm currently using nginx-proxy (pod, template, nginx.conf) from kubespray (with hard-coded upstreams) in my coreos-kubernetes cluster (I'm probably going to set up a new 1.6 cluster with bootkube; the current cluster is 1.5.4, iirc).

Maybe we could add that pod to bootkube? We're just missing a dynamic config writer that includes all the master nodes in the nginx.conf.
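For reference, the proxy config in that design is a plain TCP pass-through; a minimal sketch with hard-coded upstreams (the master IPs and ports are placeholders, and kubespray's actual template differs in detail):

    # /etc/nginx/nginx.conf (sketch)
    error_log stderr notice;
    worker_processes auto;
    events {
      worker_connections 1024;
    }
    stream {
      upstream kube_apiserver {
        least_conn;
        server 10.3.0.11:443;   # master 1
        server 10.3.0.12:443;   # master 2
        server 10.3.0.13:443;   # master 3
      }
      server {
        # node-local components reach the apiservers via localhost
        listen 127.0.0.1:6443;
        proxy_pass kube_apiserver;
        proxy_connect_timeout 1s;
        proxy_timeout 10m;
      }
    }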

@klausenbusk
Contributor

Proof of concept: #684

@omkensey

Correct me if I'm wrong -- this results in a single nginx pod, yes? Won't that potentially move around the cluster, so the HA endpoint IP will change? (Actually, if it's just a Pod with no Deployment in front of it, won't it just go down if the current node fails? Pods explicitly do not survive node failures.) I like something like keepalived-vip better, or even a full cluster service like Pacemaker managing things, so that the IP never changes but follows the LB provider around. Alternatively, maybe the Pod could be a single-replica Deployment with an init container to handle registering the current IP with DNS.

@klausenbusk
Contributor

Correct me if I'm wrong -- this results in a single nginx pod, yes?

Are you referring to #684? #684 runs an nginx-proxy pod on every node, listening on localhost; you then connect to the API server through localhost.
I'm using that design (but not #684) right now in a 12-node cluster (3 masters + 7 workers + 2 "vpn" servers), and it works pretty well.
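For context, the per-node proxy in that design is roughly a host-network pod on every node, along the lines of the sketch below (the names, image tag, and config path are illustrative assumptions, not what #684 actually ships):

    apiVersion: extensions/v1beta1   # DaemonSet API group of that era
    kind: DaemonSet
    metadata:
      name: nginx-proxy
      namespace: kube-system
    spec:
      template:
        metadata:
          labels:
            k8s-app: nginx-proxy
        spec:
          hostNetwork: true          # so host processes can reach localhost:6443
          containers:
          - name: nginx-proxy
            image: nginx:1.11-alpine
            volumeMounts:
            - name: conf
              mountPath: /etc/nginx
              readOnly: true
          volumes:
          - name: conf
            hostPath:
              path: /etc/nginx-proxy   # holds an nginx.conf like the one sketched above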

@anguslees

anguslees commented Dec 7, 2018

I realise this is an ancient bug, but just in case anyone is still reading it:

  • Add your 3 masters to a round-robin (external) DNS record, and use that DNS name from outside the cluster (or from net=host pods, importantly including kube-proxy).
  • Configure kube-controller-manager and kube-scheduler to talk to the apiserver via localhost, since these jobs always run on master nodes. They could also use the external DNS name, but localhost is simpler.
  • Add readiness probes to the apiserver manifest and change the default.kubernetes Service to refer to the self-hosted apiserver pods by selector, like a regular internal k8s Service (see the sketch after this comment).
  • Use default.kubernetes (via kube-proxy) from inside the cluster as usual.

No need for nginx or keepalived; failover is automatic (with at worst a TCP connect retry for external clients) - except for updating the external round-robin DNS record. Just make updating that DNS entry part of your master node replacement process (which is already somewhat special-cased).
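A rough sketch of the readiness-probe and selector-based Service parts of this approach, assuming the self-hosted apiserver pods carry a k8s-app=kube-apiserver label (the probe path, ports, and label are assumptions to adapt to your manifests):

    # added to the kube-apiserver pod spec
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080          # local insecure port, if your manifest exposes one
      initialDelaySeconds: 15
      timeoutSeconds: 15
    ---
    # default.kubernetes Service pointed at the apiserver pods by selector
    apiVersion: v1
    kind: Service
    metadata:
      name: kubernetes
      namespace: default
    spec:
      selector:
        k8s-app: kube-apiserver
      ports:
      - name: https
        port: 443
        targetPort: 443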

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 27, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 27, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
