
drain node doesn't work with elasticsearch poddisruptionbudget policy #1824

Closed
sarjeet2013 opened this issue Oct 1, 2019 · 7 comments

@sarjeet2013

Proposal

Use case. Why is this important?

For node maintenance and upgrades, the kubectl drain command is helpful for moving all workloads to other nodes.

Bug Report

What did you do?

kubectl drain

What did you expect to see?

The node drain completes successfully and is able to reschedule all pods from that node onto other nodes.

What did you see instead? Under which circumstances?

The node drain never exits successfully because the evictions can't satisfy the PDB policy for Elasticsearch.

  • Logs:

Currently, there are two ES pods running on node A, where the drain is executed:

es-1node-es-testgroup1-2
es-1node-es-testgroup1-3

error when evicting pod "es-1node-es-testgroup1-3" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod "es-1node-es-testgroup1-2"
error when evicting pod "es-1node-es-testgroup1-2" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod "es-1node-es-testgroup1-3"
error when evicting pod "es-1node-es-testgroup1-3" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

Here is the pdb policy:

➜  ~ kubectl get poddisruptionbudgets.policy
NAME                  MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
es-1node-es-default   4               N/A               0                     4h54m
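
For context, the generated PDB behind that output presumably looks something like this (a reconstruction from the kubectl get columns above, not a captured manifest; the selector label is assumed from ECK's labelling conventions):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: es-1node-es-default
spec:
  # MIN AVAILABLE 4 with 4 Pods total explains ALLOWED DISRUPTIONS 0
  minAvailable: 4
  selector:
    matchLabels:
      elasticsearch.k8s.elastic.co/cluster-name: es-1node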
@pebrc
Collaborator

pebrc commented Oct 1, 2019

Looking at your PDB, it seems you are running a recent master build. Is that correct?

Can you give us some more information about the topology and state of your Elasticsearch cluster?
How many master nodes?
How many data nodes?
How many ingest nodes?
What does _cat/health return?

@sarjeet2013
Author

@pebrc

  1. Yes, I built an image from the master branch recently.
  2. I am using this as a testing template, so all 4 replicas are part of the same nodeGroup and are all master-eligible:

    - name: testgroup1
      nodeCount: 4
      config:
        node.data: true
        node.ingest: true
        node.master: true

  3. Data nodes: same as above (all 4 nodes).
  4. Ingest nodes: same as above (all 4 nodes).
  5. You mean _cluster/health? Here is the curl output for cluster health:
➜  ~ curl -u elastic:xxxxx -s -N https://x.x.x.x:9200/_cluster/health -k | jq

{
  "error": {
    "root_cause": [
      {
        "type": "master_not_discovered_exception",
        "reason": null
      }
    ],
    "type": "master_not_discovered_exception",
    "reason": null
  },
  "status": 503
}

Please let me know if you need any additional info.

@sebgl
Contributor

sebgl commented Oct 2, 2019

The way the default PDB works is the following:

  • If your cluster is green (from /_cluster/health), allow one Pod (and only one) to be removed, which will probably turn your cluster yellow.
  • Else, don't allow any Pod to be disrupted. Otherwise, there's a chance we lose some shard availability, and the cluster will become either red or unresponsive.

If you don't care much about cluster unavailability (as in: "it's OK for my cluster to become temporarily unavailable while disruptions happen"), you can disable the PDB (set it to {} in the spec; see the sketch below).
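
To illustrate, disabling the default PDB looks roughly like this (a minimal sketch; the cluster name is taken from this issue, the apiVersion depends on your ECK build, and the rest of the spec is elided):

apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: es-1node
spec:
  # An empty podDisruptionBudget tells ECK not to create the default PDB
  podDisruptionBudget: {}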

I don't really see a way we could improve the default PDB to match your setup, unless we accept some downtime/unavailability (but then what's the point of using a PDB?).

In your case where 2 Pods are on the same k8s Node, what the PDB should allow is:

  • if your cluster is green, remove one Pod on that node (the other one remains on the same node)
  • the cluster is temporarily yellow, and missing a Pod
  • that Pod should be rescheduled on another k8s node, and your ES cluster should become green again
  • at that point the second Pod can also be removed, so your node is completely drained
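
In other words, assuming a green cluster and a hypothetical node name, the whole drain would proceed like this (flag names as in kubectl at the time of this issue):

# kubectl retries evictions blocked by the PDB automatically (every 5s, as in the logs above)
kubectl drain node-a --ignore-daemonsets --delete-local-data
# 1st eviction succeeds (green cluster => 1 allowed disruption), cluster turns yellow;
# the 2nd is retried until the evicted Pod is rescheduled on another node and the
# cluster reports green again, then it succeeds and the node is fully drained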

@sarjeet2013
Author

@sebgl Thanks for the info. Sure, I'll try disabling the PDB (for testing) to see if it works.

Regarding _cluster/health: if the cluster state is not green, will it never allow any pods to be deleted, and, even worse, does that mean the cluster can't be upgraded either?

@sebgl
Contributor

sebgl commented Oct 4, 2019

Regarding _cluster/health: if the cluster state is not green, will it never allow any pods to be deleted, and, even worse, does that mean the cluster can't be upgraded either?

Yes. Because that would be dangerous for data integrity and availability, it's hard for ECK to make such decisions automatically.

I guess at some point we could introduce a setting in the Elasticsearch resource that allows a forced upgrade to go through, at your own risk.

However, we are working on making sure rolling upgrades move on if all nodes, or an entire nodeSet, are down due to misconfiguration.

@sarjeet2013
Author

@sebgl Sure, I'll keep an eye on the other issue. For now, I have modified the test topology to 2 nodeSets: master with count=3 and ingest-data with count=2 (sketched below).

I'll try a couple of failover cases soon and will update my findings.
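
For reference, that revised topology would look roughly like this (a sketch only; the nodeSet names and role settings are assumed from the description above, not taken from the actual manifest):

spec:
  nodeSets:
  # 3 dedicated master nodes
  - name: master
    count: 3
    config:
      node.master: true
      node.data: false
      node.ingest: false
  # 2 combined data/ingest nodes
  - name: ingest-data
    count: 2
    config:
      node.master: false
      node.data: true
      node.ingest: true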

@charith-elastic
Contributor

Closing this for now. Please re-open if you have any updates on the issue.
