
Don't clear shard allocation excludes at every reconciliation #1522

Closed
sebgl opened this issue Aug 8, 2019 · 3 comments · Fixed by #2610
sebgl (Contributor) commented Aug 8, 2019

We clear shard allocation excludes at every reconciliation attempt, to make sure we correctly reset it after nodes are downscaled.
We could optimize by only clearing it if not already cleared, similar to what we do with zen1 minimum_master_nodes:

```go
// Check if we really need to update minimum_master_nodes with an API call
if minimumMasterNodes == reconcileState.GetZen1MinimumMasterNodes() {
	return false, nil
}
```
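
A rough sketch of what the same pattern could look like for allocation excludes. The `EsClient` interface, `State` struct, and method names below are illustrative stand-ins, not the actual ECK client or reconcile-state APIs:

```go
// Illustrative types only: EsClient and State stand in for the operator's
// Elasticsearch client and per-cluster reconciliation state.
type EsClient interface {
	// ExcludeFromShardAllocation sets cluster.routing.allocation.exclude._name.
	ExcludeFromShardAllocation(nodes string) error
}

type State struct {
	allocationExcludesCleared bool
}

// clearAllocationExcludes resets the excludes only if that has not already
// been done, and reports whether an API call was made.
func clearAllocationExcludes(c EsClient, state *State) (bool, error) {
	// Check if we really need to clear allocation excludes with an API call
	if state.allocationExcludesCleared {
		return false, nil
	}
	// Reset the exclusion list to an empty value.
	if err := c.ExcludeFromShardAllocation(""); err != nil {
		return false, err
	}
	state.allocationExcludesCleared = true
	return true, nil
}
```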

Related: #1161.

sebgl added the >enhancement Enhancement of existing functionality label on Aug 8, 2019
@ximenzaoshi

It's very strange to clear shard allocation excludes at every reconciliation attempt. I tried to decommission a specific node by setting the shard allocation exclude setting, but it was always reset automatically, which really confused me. I think the operator should not change settings that already exist.
By the way, is there any way to remove a specific node? It seems we cannot do this through the StatefulSet.

sebgl (Contributor, Author) commented Oct 23, 2019

@ximenzaoshi a workaround to stop the operator from concurrently resetting your cluster settings is to temporarily disable reconciliations. This can be done by setting an annotation on your Elasticsearch resource:
`"common.k8s.elastic.co/pause": "true"`
When set, ECK will ignore the Elasticsearch cluster. Don't forget to remove the annotation once you're done with any temporary manual operation.
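
For example, assuming an Elasticsearch resource named `quickstart` (the name is just a placeholder), the annotation can be added and removed with kubectl:

```sh
# Pause reconciliations for the "quickstart" Elasticsearch resource (placeholder name).
kubectl annotate elasticsearch quickstart common.k8s.elastic.co/pause=true

# Resume reconciliations once the manual operation is done
# (the trailing "-" removes the annotation).
kubectl annotate elasticsearch quickstart common.k8s.elastic.co/pause-
```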

> By the way, is there any way to remove a specific node? It seems we cannot do this through the StatefulSet.

Indeed, it's not straightforward. Can you explain your use case a bit more?
Depending on what you are trying to achieve, you could (see the sketch after this list for examples):

  • downscale the NodeSet (e.g. 2 nodes instead of 3)
  • drain the corresponding Kubernetes node, which will trigger a rescheduling of the Pods running on it. Provided your Elasticsearch cluster is in green health, the Elasticsearch Pods on that Kubernetes node should be safely removed one by one.
  • cordon the corresponding node to make it unschedulable, then apply a modification to the corresponding NodeSet (e.g. an annotation or label) so that all Pods of this NodeSet get rotated. Since the k8s node is unschedulable, the Pod you wanted to move should be scheduled on another k8s node, unless affinity or PV constraints prevent it.
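
Roughly, and with placeholder names, the second and third options could look like this (the first option is simply lowering `count` for the NodeSet in the Elasticsearch spec):

```sh
# Option 2: drain the Kubernetes node hosting the Pod you want to move
# (its Pods are evicted and rescheduled elsewhere).
kubectl drain <k8s-node-name> --ignore-daemonsets

# Option 3: cordon the node instead, so nothing new gets scheduled on it,
# then modify the NodeSet's pod template (e.g. add an annotation) to rotate its Pods.
kubectl cordon <k8s-node-name>
```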

ximenzaoshi commented Oct 23, 2019

Thanks for your reply! We have two ES nodes on one host and we want to move one of them away, as the host load is high. I'll give your suggestions a try, thanks.

> unless affinity or PV constraints prevent it

Yes, we use local storage, which makes the operation more difficult.
