Voting Only Master Node #14340

pickypg · 2015-10-28T16:50:01Z

The current ideal setup for master nodes is to have 3 master nodes, thus giving high availability (HA) with a quorum of 2 master nodes at any given time.

However, in some cloud regions (and more frequently in internal data centers), this creates a problem because there may be only two zones available in which to place your nodes. In these scenarios, you do not want to give up HA just to maintain 3 master nodes (to be clear: it's better than not doing it, but it's less than ideal) because it means that you have a lopsided HA environment where one "half" cannot actually survive without the other.

The only real solution to this problem is to have a third zone, even if that third zone has slightly higher latency (across the globe is not going to work), then plunk a master node into that zone to give a single master node per zone -- thus enabling the actual survivability of any single zone, but not of two zones.
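The zone arithmetic above can be checked with a small sketch. The zone names and the `survives_zone_loss` helper are hypothetical; the only assumption carried over from the text is that a quorum is a strict majority of the master nodes.

```python
from __future__ import annotations


def quorum(master_count: int) -> int:
    """Strict majority of the master nodes (2 out of 3, for example)."""
    return master_count // 2 + 1


def survives_zone_loss(masters_per_zone: dict[str, int]) -> dict[str, bool]:
    """For each zone, report whether the masters left after losing that
    zone still form a quorum of the full set."""
    total = sum(masters_per_zone.values())
    needed = quorum(total)
    return {
        zone: total - count >= needed
        for zone, count in masters_per_zone.items()
    }


# Hypothetical layouts: 3 masters squeezed into two zones vs. one per zone.
two_zones = survives_zone_loss({"zone-a": 2, "zone-b": 1})
three_zones = survives_zone_loss({"zone-a": 1, "zone-b": 1, "zone-c": 1})

print(two_zones)
print(three_zones)
```

With two zones, losing the zone that holds two masters leaves a single master and no quorum, which is the lopsided case described above. With one master per zone, any single zone can be lost.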

Now, you could do this today with a regular master node, and things would be generally okay, except that the master node with the added latency may become the elected master node, which is obviously not ideal.

This is where a voting-only master node improves this setup significantly. If you have a master node that cannot itself be elected, but that only participates in elections, then you can have a tiny node (careful with VM size due to automatic network throttling in cloud ecosystems!) that helps to survive the loss of any single zone without seriously risking performance.
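A toy model of the idea, not Elasticsearch's actual election algorithm: the `Node` class, the `voting_only` flag, and the `elect` function are all hypothetical names. The sketch assumes a majority of all master nodes must be reachable, and that voting-only nodes count toward that majority but are never candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str
    voting_only: bool = False  # hypothetical flag: may vote, may not be elected


def elect(all_nodes, live_nodes):
    """Return the name of the elected master, or None if no election can succeed."""
    needed = len(all_nodes) // 2 + 1
    if len(live_nodes) < needed:
        return None  # not enough voters for a quorum
    candidates = [n for n in live_nodes if not n.voting_only]
    if not candidates:
        return None  # quorum exists, but nobody is allowed to win
    # Toy tie-break: lowest name wins. A real election is far more involved.
    return min(candidates, key=lambda n: n.name).name


nodes = [
    Node("a", "zone-a"),
    Node("b", "zone-b"),
    Node("tiebreaker", "zone-c", voting_only=True),
]


def without_zone(zone):
    """Nodes that remain when a whole zone is lost."""
    return [n for n in nodes if n.zone != zone]


print(elect(nodes, without_zone("zone-a")))  # b
print(elect(nodes, without_zone("zone-c")))  # a
```

In this model the higher-latency tiebreaker supplies the second vote whenever zone A or zone B is lost, yet it can never end up as the elected master.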

bleskes · 2015-10-28T17:05:01Z

This is definitely on our horizon. We need to do some infra work before we get there though... on it :)

bertrandfalguiere · 2016-06-09T12:48:24Z

I'm new to ES, but wouldn't it be better, and maybe technically easier, to give "points" to the master-eligible nodes? Those points would give priority to the best-ranked node to become master.
If zones A and B have high availability, but not zone C, you could give:
"node.master_eligibility" : 3 for your master eligible node A,
"node.master_eligibility" : 2 for your master eligible node B,
"node.master_eligibility" : 1 for your laggy master eligible node C.
If you lose A, B is elected if needed, and vice versa.
It would have the same effect of keeping C from becoming master.

And (new case) if you have a fast A and slow B and C, it means that B and C can take over if the cluster loses A, but as soon as the fast A node is back online, it takes its mastership back.

It probably means changing the election process, but no new node status is needed.

Does it make sense? Again, I'm new to all this.
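The proposal can be sketched as a toy election that picks the live node with the most points. `node.master_eligibility` is the commenter's suggested setting, not an existing Elasticsearch option, and `elect_by_points` is a hypothetical helper that assumes a majority of the nodes must be live before anyone can be elected.

```python
def elect_by_points(eligibility, live):
    """Pick the live node with the most points, provided a majority is live.

    eligibility maps node name -> points (the proposed node.master_eligibility).
    live is the set of node names that are currently reachable.
    """
    needed = len(eligibility) // 2 + 1
    live_nodes = [name for name in eligibility if name in live]
    if len(live_nodes) < needed:
        return None  # no quorum, no master
    return max(live_nodes, key=lambda name: eligibility[name])


points = {"A": 3, "B": 2, "C": 1}

print(elect_by_points(points, {"A", "B", "C"}))  # A
print(elect_by_points(points, {"B", "C"}))       # B
print(elect_by_points(points, {"A", "C"}))       # A
print(elect_by_points(points, {"C"}))            # None
```

With three nodes, every majority contains A or B, so C never wins in this model. The sketch says nothing about how hard it would be to make such a rule safe in a real distributed election.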

bleskes · 2016-06-09T13:16:44Z

@bertrandfalguiere no worries and welcome to the repo.

If I read you correctly, you want to have some mechanics to influence master election and indicate that you prefer one master over another. Putting all the trickiness of a distributed master election aside (it's not easy to add these things correctly), the main thing to realize here is that you need to have a good working master at any given moment. Otherwise your cluster is at great risk. So having a slow master and a fast master doesn't make sense - you need two fast masters etc.

PS. We try to keep GitHub for concrete issues. For more discussion-y things we ask people to go to https://discuss.elastic.co

elasticmachine · 2018-03-27T14:51:07Z

Pinging @elastic/es-distributed

seang-es · 2019-03-29T21:07:20Z

In some cloud use cases we might have a two zone deployment with a third zone containing only a tiebreaker master node. By default, this will be configured with 1 GB RAM, and may not be suitable for larger clusters. If this could be configured as a voting-only node, then it could provide quorum without becoming an active master.

A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst themselves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes.

Closes #14340

Co-authored-by: David Turner <david.turner@elastic.co>
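The HA reasoning in the description above can be checked with a short sketch. The `tolerates_any_single_failure` helper and the node names are hypothetical; the model assumes that an election needs a majority of all master-eligible nodes plus at least one remaining node that is not voting-only.

```python
def tolerates_any_single_failure(nodes):
    """nodes maps node name -> True if voting-only, False if it can act as master.

    Returns True if, after losing any one node, the rest still have a
    quorum and at least one node capable of acting as master.
    """
    needed = len(nodes) // 2 + 1
    for failed in nodes:
        remaining = {name: vo for name, vo in nodes.items() if name != failed}
        has_quorum = len(remaining) >= needed
        has_candidate = any(not vo for vo in remaining.values())
        if not (has_quorum and has_candidate):
            return False
    return True


one_voting_only = {"m1": False, "m2": False, "m3": True}
two_voting_only = {"m1": False, "m2": True, "m3": True}
only_two_masters = {"m1": False, "m2": False}

print(tolerates_any_single_failure(one_voting_only))   # True
print(tolerates_any_single_failure(two_voting_only))   # False
print(tolerates_any_single_failure(only_two_masters))  # False
```

Making one of three nodes voting-only keeps the cluster electable after any single failure. Making two of them voting-only does not, because losing the one full master leaves two voters with nobody to elect.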

pickypg added the >enhancement and :Distributed Coordination/Discovery-Plugins labels Oct 28, 2015

clintongormley added the high hanging fruit label Oct 29, 2015

jasontedor mentioned this issue Sep 25, 2016

Make possible to use of priority in election #20655

Closed

DaveCTurner added the :Distributed Coordination/Cluster Coordination label Mar 27, 2018

DaveCTurner removed the :Distributed Coordination/Discovery-Plugins label Mar 27, 2018

DaveCTurner mentioned this issue Jul 30, 2018

Support for arbiter/temporary master node #32462

Closed

DaveCTurner mentioned this issue Oct 23, 2018

[Zen2] Update documentation for Zen2 #34714

Merged

ywelsch mentioned this issue Nov 9, 2018

Make node.master a dynamic setting #10793

Closed

vbohata mentioned this issue Dec 31, 2018

Master node election + force master node change api #37036

Closed

DaveCTurner added the team-discuss label Apr 10, 2019

ywelsch mentioned this issue Jun 20, 2019

Add voting-only master node #43410

Merged

ywelsch closed this as completed in #43410 Jun 25, 2019

ywelsch removed the team-discuss label Jul 2, 2019

Mpdreamz mentioned this issue Aug 7, 2019

[meta] 7.3 Release elastic/elasticsearch-net#4001

Closed
