
drain node doesn't work with elasticsearch poddisruptionbudget policy #1824

Closed
sarjeet2013 opened this issue Oct 1, 2019 · 7 comments

@sarjeet2013

Proposal

Use case. Why is this important?

For node maintenance and upgrades, the kubectl drain command is helpful for moving all workloads to other nodes.

Bug Report

What did you do?

kubectl drain

What did you expect to see?

The node drain completes successfully and is able to reschedule all pods from that node onto other nodes.

What did you see instead? Under which circumstances?

The node drain never exits successfully because the evictions can't satisfy the PDB policy for Elasticsearch.

  • Logs:

Currently, there are two ES pods running on node A, where the drain is executed:

es-1node-es-testgroup1-2
es-1node-es-testgroup1-3

error when evicting pod "es-1node-es-testgroup1-3" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod "es-1node-es-testgroup1-2"
error when evicting pod "es-1node-es-testgroup1-2" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod "es-1node-es-testgroup1-3"
error when evicting pod "es-1node-es-testgroup1-3" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

Here is the pdb policy:

➜  ~ kubectl get poddisruptionbudgets.policy
NAME                  MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
es-1node-es-default   4               N/A               0                     4h54m
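
For context, the generated PDB behind that output presumably looks something like this (a reconstruction from the kubectl get columns above, not a captured manifest; the selector label is assumed from ECK's labelling conventions):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: es-1node-es-default
spec:
  # MIN AVAILABLE 4 with 4 Pods total explains ALLOWED DISRUPTIONS 0
  minAvailable: 4
  selector:
    matchLabels:
      elasticsearch.k8s.elastic.co/cluster-name: es-1node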
@pebrc
Collaborator

pebrc commented Oct 1, 2019

Looking at your PDB, it seems you are running a recent master build. Is that correct?

Can you give us some more information about the topology and state of your Elasticsearch cluster?
How many master nodes?
How many data nodes?
How many ingest nodes?
What does _cat/health return?

@sarjeet2013
Author

@pebrc

  1. Yes, I built an image from the master branch recently.
  2. I am using this as a testing template, so all 4 replicas are part of the same nodeGroup and are all master-eligible:

    - name: testgroup1
      nodeCount: 4
      config:
        node.data: true
        node.ingest: true
        node.master: true

  3. Data nodes: same as above (all 4 nodes).
  4. Ingest nodes: same as above (all 4 nodes).
  5. You mean _cluster/health? Here is the curl output for cluster health:
➜  ~ curl -u elastic:xxxxx -s -N https://x.x.x.x:9200/_cluster/health -k | jq

{
  "error": {
    "root_cause": [
      {
        "type": "master_not_discovered_exception",
        "reason": null
      }
    ],
    "type": "master_not_discovered_exception",
    "reason": null
  },
  "status": 503
}

Please let me know if you need any additional info.

@sebgl
Contributor

sebgl commented Oct 2, 2019

The way the default PDB works is the following:

  • If your cluster is green (from /_cluster/health), allow one Pod (and only one) to be removed, which will probably turn your cluster yellow.
  • Else, don't allow any Pod to be disrupted. Otherwise, there's a chance we lose some shard availability, and the cluster will become either red or unresponsive.

If you don't care much about cluster unavailability (as in: "it's OK for my cluster to become temporarily unavailable while disruptions happen"), you can disable the PDB (set it to {} in the spec; see the sketch below).
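
To illustrate, disabling the default PDB looks roughly like this (a minimal sketch; the cluster name is taken from this issue, the apiVersion depends on your ECK build, and the rest of the spec is elided):

apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: es-1node
spec:
  # An empty podDisruptionBudget tells ECK not to create the default PDB
  podDisruptionBudget: {}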

I don't really see a way we could improve the default PDB to match your setup, unless we accept some downtime/unavailability (but then what's the point of using a PDB?).

In your case where 2 Pods are on the same k8s Node, what the PDB should allow is:

  • if your cluster is green, remove one Pod on that node (the other one remains on the same node)
  • the cluster is temporarily yellow, and missing a Pod
  • that Pod should be rescheduled on another k8s node, and your ES cluster should become green again
  • at that point the second Pod can also be removed, so your node is completely drained
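
In other words, assuming a green cluster and a hypothetical node name, the whole drain would proceed like this (flag names as in kubectl at the time of this issue):

# kubectl retries evictions blocked by the PDB automatically (every 5s, as in the logs above)
kubectl drain node-a --ignore-daemonsets --delete-local-data
# 1st eviction succeeds (green cluster => 1 allowed disruption), cluster turns yellow;
# the 2nd is retried until the evicted Pod is rescheduled on another node and the
# cluster reports green again, then it succeeds and the node is fully drained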

@sarjeet2013
Author

@sebgl Thanks for the info. Sure, I'll try disabling the PDB (for testing) to see if it works.

Regarding _cluster/health: if the cluster state is not green, will it never allow any pods to be deleted, and, even worse, does that mean the cluster can't be upgraded either?

@sebgl
Contributor

sebgl commented Oct 4, 2019

Regarding _cluster/health: if the cluster state is not green, will it never allow any pods to be deleted, and, even worse, does that mean the cluster can't be upgraded either?

Yes. Because that would be dangerous for data integrity and availability, it's hard for ECK to make such decisions automatically.

I guess at some point we could introduce a setting in the Elasticsearch resource that allows a forced upgrade to go through, at your own risk.

However, we are working on making sure rolling upgrades move on if all nodes, or an entire nodeSet, are down due to misconfiguration.

@sarjeet2013
Author

@sebgl Sure, I'll keep an eye on the other issue. For now, I have modified the test topology to 2 nodeSets: master with count=3 and ingest-data with count=2 (sketched below).

I'll try a couple of failover cases soon and will update my findings.
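
For reference, that revised topology would look roughly like this (a sketch only; the nodeSet names and role settings are assumed from the description above, not taken from the actual manifest):

spec:
  nodeSets:
  # 3 dedicated master nodes
  - name: master
    count: 3
    config:
      node.master: true
      node.data: false
      node.ingest: false
  # 2 combined data/ingest nodes
  - name: ingest-data
    count: 2
    config:
      node.master: false
      node.data: true
      node.ingest: true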

@charith-elastic
Contributor

Closing this for now. Please re-open if you have any updates on the issue.
