Revisit pod node anti-affinity rules #197
Also, should this include zone anti-affinity? Something like:
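(The snippet that was posted is not preserved in this capture of the thread. As a purely hypothetical sketch of what a zone-level term could look like - assuming the failure-domain.beta.kubernetes.io/zone node label that was standard at the time and the k8s-app: kube-dns pod label used by the CoreDNS Deployment:)

```yaml
# Hypothetical sketch only -- not the snippet originally posted in this thread.
# A zone-level preferred anti-affinity term, placed under spec.template.spec of
# the coredns Deployment, spreading replicas across zones on a best-effort basis.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            k8s-app: kube-dns
        topologyKey: failure-domain.beta.kubernetes.io/zone
```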
The docs still warn against using them. (I do realize that k8s docs are not always updated proactively)
Do you know which PR or commit(s) fixed it? I searched closed k/k PRs, and found one from April 2019, but that doesn't line up with the 1.13 release date (mentioned in the slack channel).
Scouring change logs and git history... the Kubernetes change logs suggest 1.11 and 1.12 had work done on anti-affinity scheduler performance - it looks like 1.11 would have been the bigger one. https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#sig-scheduling-1
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#sig-scheduling-1
That note was added back in Oct 2017. I think that's before the original kube-dns anti-affinity change in Dec 2017. And the 1.11 change (#62211) dates to Apr 2018 - after anti-affinity had been pulled out. I'll ask sig-scheduling if that warning is obsolete.
Thanks @jmcmeek! Not sure about the 50 weights ... or how to mix zone and node anti-affinity correctly.
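(One possible way to mix the two, sketched here as an assumption rather than anything agreed in this thread: hard, required anti-affinity per node plus a soft, preferred spread per zone, with the weight of 50 - echoing the number mentioned above - only meaningful relative to other preferred terms:)

```yaml
# Sketch only -- combines node-level and zone-level pod anti-affinity
# under spec.template.spec of the coredns Deployment.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          k8s-app: kube-dns
      topologyKey: kubernetes.io/hostname    # never schedule two replicas on one node
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      podAffinityTerm:
        labelSelector:
          matchLabels:
            k8s-app: kube-dns
        topologyKey: failure-domain.beta.kubernetes.io/zone   # spread across zones when possible
```

(Note that a required node-level term assumes there are at least as many schedulable nodes as replicas; with fewer nodes, extra replicas would stay Pending, which is why a preferred term is the softer alternative.)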
Checked with sig-scheduling about that warning: https://kubernetes.slack.com/archives/C09TP78DV/p1568225168053800
OK, thanks. We can add it and see if it passes the 5k-node k8s scale tests. We can add it here now, but as for merging into k8s (e.g. kubeadm/kubeup), 1.16 is already in code freeze, so it won't actually get scale tested at 5k nodes until later on in the k8s 1.17 release.
@chrisohaver I did limited testing with the snippet I posted. It "worked for me".
Note: Pod anti-affinity requires nodes to be consistently labelled, i.e. every node in the cluster must have an appropriate label matching topologyKey. If some or all nodes are missing the specified topologyKey label, it can lead to unintended behavior. I'll have to check out the empty topologyKey idea. It sounds promising.
On the surface, empty topologyKey does not seem to be allowed: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.15/#podaffinityterm-v1-core
topologyKey (string): "This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed."
I found this in the Kubernetes 1.8 changelog, suggesting the note about using an empty topologyKey is old.
@chrisohaver Thank you!
Pod anti-affinity rules were revoked in #60 due to Kubernetes scheduler performance concerns.
Scheduler performance enhancements have been made since then, and I think it is safe to use those rules now. See my question and the reply on the #sig-scheduling Slack channel:
Q: https://kubernetes.slack.com/archives/C09TP78DV/p1568127000040100
R: https://kubernetes.slack.com/archives/C09TP78DV/p1568128953041900