Node labels are lost after Deregister/Register Node events #19899
Comments
@openshift/sig-pod
@skynardo how did you set the node labels in the first place?
We ran the playbook below after upgrading to version 3.9 to deploy OpenShift logging; these playbooks must set logging-infra-fluentd=true on all nodes. We ran the upgrade playbook below, which sets node-role.kubernetes.io/compute=true:

/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade.yml

We also have group_vars files to set OpenShift roles for nodes and masters (shown below):

cat tag_OpenShift_Role_node
openshift_schedulable: true

cat tag_OpenShift_Role_master
openshift_schedulable: true
This is a known issue, tracked in upstream discussions. The cause of the label loss is that the cloud node controller deletes the Node object when the instance that backs it is shut down.
Automatic merge from submit-queue (batch tested with PRs 67571, 67284, 66835, 68096, 68152). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

cloudprovider: aws: return true on existence check for stopped instances

xref https://bugzilla.redhat.com/show_bug.cgi?id=1559271
xref openshift/origin#19899
background kubernetes#45986 (comment)

Basically our customers are hitting this issue where the Node resource is deleted when the AWS instance stops (not terminates). If the instance restarts, the Node loses any labeling/taints. The OpenStack cloud provider already made this change: kubernetes#59931.

Fixes kubernetes#45118 for AWS.

**Reviewer note**: valid AWS instance states are `pending | running | shutting-down | terminated | stopping | stopped`. There might be a case for returning `false` for instances in `pending` and/or `terminated` state. Discuss!

`InstanceID()` changes from kubernetes#45986, credit @rrati

@derekwaynecarr @smarterclayton @liggitt @justinsb @jsafrane @countspongebob
After stopping/starting a node (AWS EC2 instance), the node labels
logging-infra-fluentd=true,node-role.kubernetes.io/compute=true
are missing once the node re-registers and becomes Ready.
Version
openshift v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
Steps To Reproduce
Stop and then start the node's EC2 instance. Labels before the stop/start (oc get node ip-177-77-77-100.ec2.internal --show-labels):
NAME STATUS ROLES AGE VERSION LABELS
ip-177-77-77-100.ec2.internal Ready compute 6d v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1a,kubernetes.io/hostname=ip-177-77-77-100.ec2.internal,logging-infra-fluentd=true,node-role.kubernetes.io/compute=true,region=primary,zone=default
Current Result
oc get node ip-177-77-77-100.ec2.internal --show-labels
NAME STATUS ROLES AGE VERSION LABELS
ip-177-77-77-100.ec2.internal Ready 1m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=m4.xlarge,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1a,kubernetes.io/hostname=ip-177-77-77-100.ec2.internal,region=primary,zone=default
Expected Result
We expect the labels logging-infra-fluentd=true,node-role.kubernetes.io/compute=true to still be present after the instance is stopped and started.
Additional Information