drain node doesn't work with elasticsearch poddisruptionbudget policy #1824
Looking at your PDB, it seems you are running a recent master build? Is that correct? Can you give us some more information about the topology and state of your Elasticsearch cluster?
Please let me know if you need any additional info.
The way the default PDB works is the following: it allows one Elasticsearch Pod to be taken down at a time, as long as the cluster health is green.
If you don't care much about cluster unavailability (as in: "it's OK for my cluster to become temporarily unavailable while disruptions happen"), you can disable the PDB by setting podDisruptionBudget to an empty value in the Elasticsearch spec. I don't really see a way we could improve the default PDB to match your setup, unless we accept some downtime/unavailability (but then what's the point of using a PDB?). In your case where 2 Pods are on the same k8s Node, the PDB would need to allow both of those Pods to be disrupted at the same time, ideally only while the cluster stays healthy.
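For reference, a minimal sketch of what disabling the default PDB looks like in the Elasticsearch resource; the apiVersion, version, and names here are illustrative (the cluster and nodeSet names are guessed from the Pod names quoted later in this issue):

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es-1node            # illustrative: derived from the Pod names es-1node-es-testgroup1-*
spec:
  version: 7.8.0            # illustrative Elasticsearch version
  # An empty podDisruptionBudget disables the default PDB created by ECK.
  podDisruptionBudget: {}
  nodeSets:
  - name: testgroup1
    count: 4
```

With the default PDB removed, kubectl drain is free to evict both Elasticsearch Pods on the drained node, at the cost of possible temporary unavailability.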
@sebgl Thanks for the info. Sure, I'll try disabling the PDB (for testing) to see if it works. Regarding _cluster/health: if the cluster state is not green, will it never allow any Pods to be deleted, or even worse, will the cluster not be upgradable at all?
Yes. Because that would be dangerous for data integrity and availability, it's hard for ECK to make such decisions. I guess at some point we could introduce a setting in the Elasticsearch resource that allows a forced upgrade to go through, at your own risk. However, we are working on making sure we move on with rolling upgrades if all nodes are down due to misconfiguration, or if an entire nodeSet is down due to misconfiguration.
@sebgl Sure, I'll keep an eye on the other issue. For now, I have modified the test topology to 2 nodeSets: master with count=3 and ingest-data with count=2. I'll try a couple of failover cases soon and will update my findings.
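A minimal sketch of that two-nodeSet topology in the Elasticsearch spec, assuming the legacy node.master/node.data/node.ingest settings that predate node.roles; names and version are illustrative:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es-test              # illustrative cluster name
spec:
  version: 7.8.0             # illustrative Elasticsearch version
  nodeSets:
  - name: master
    count: 3
    config:
      node.master: true      # dedicated master-eligible nodes
      node.data: false
      node.ingest: false
  - name: ingest-data
    count: 2
    config:
      node.master: false     # data and ingest only
      node.data: true
      node.ingest: true
```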
Closing this for now. Please re-open if you have any updates on the issue. |
Proposal
Use case. Why is this important?
For node maintenance and upgrades, the kubectl drain command is helpful to move all workloads to other nodes.
Bug Report
What did you do?
kubectl drain
What did you expect to see?
Draining the node works fine and all Pods from that node are rescheduled onto other nodes.
What did you see instead? Under which circumstances?
The node drain doesn't exit successfully because it can't satisfy the PDB policy for Elasticsearch.
Currently, there are 2 ES pods running on node A where the drain is executed:
es-1node-es-testgroup1-2
es-1node-es-testgroup1-3
Here is the PDB policy:
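As a rough sketch, the default PDB that ECK creates for a cluster named es-1node typically looks like the following (assuming maxUnavailable: 1 and the standard cluster-name selector label; the exact manifest depends on the ECK version and the cluster's health):

```yaml
apiVersion: policy/v1        # policy/v1beta1 on older Kubernetes clusters
kind: PodDisruptionBudget
metadata:
  name: es-1node-es-default  # illustrative: ECK names the default PDB after the cluster
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      elasticsearch.k8s.elastic.co/cluster-name: es-1node
```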