Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):
No
What keywords did you search in NGINX Ingress controller issues before filing this one?
(If you have found any duplicates, you should instead reply there.): "Ingress.Status wrong" "Ingress.Status got often updated"
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
NGINX Ingress controller version:
nginx-0.16.2, but the bug is present in master as well
Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Environment:
Cloud provider or hardware configuration: OpenStack
OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Kernel (e.g. uname -a): 3.10.0-862.9.1.el7.x86_64
Install tools: RKE (https://github.com/rancher/rke)
What happened:
When we run the ingress controller on 4 nodes and create around 4-6 ingress resources,
one of the ingress resources (always the same one) has its Ingress.Status.LoadBalancer.Ingress field updated very often (3 times in 2 minutes), even though all ingress-controller pods are active and nothing has changed.
Sometimes the ingress controller even updates the status with a wrong value, such as a duplicated hostname.
What you expected to happen:
As long as all ingress-controller pods are active and nothing changes on the nodes, the ingress resource's
.Status.LoadBalancer.Ingress field should not be updated.
How to reproduce it (as minimally and precisely as possible):
Run the ingress controller on a bunch of nodes (like 15 nodes);
a physical machine with many cores makes it easier to reproduce.
Create many ingress resources (like 20 ingresses).
Periodically check kubectl get event; you will see many Update events showing that one (or two) of the ingress resources is updated over and over.
Anything else we need to know:
I investigated this problem, found a very suspicious place, and fixed it in our environment. Although I have created a PR, let me explain what is happening.
Basically, this happens because the ingress controller writes wrong information to Ingress.Status.LoadBalancer.Ingress over and over.
Here is the code that updates the status: https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/status/status.go#L149-L153
In this logic, the first step is to collect the IP/hostname of every node running the ingress controller (this part seems fine).
Then the status of each ingress resource is updated (https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/status/status.go#L326).
Updating the ingress resources is done in parallel, and that is where the problem lies: we pass a reference to the original status slice into each update function, and multiple goroutines then try to sort that same slice. This corrupts the slice, so the status is sometimes updated with a wrong value:
https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/status/status.go#L335
https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/status/status.go#L349
So basically we should not manipulate a reference to the original status slice inside a function that runs in parallel, or, if we really need to pass the original reference, we should deep-copy it first. A minimal sketch of the race follows.
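To illustrate, here is a minimal, self-contained sketch of the race, using a plain struct instead of the real apiv1.LoadBalancerIngress type and made-up addresses; it is not the controller's actual code, only the same pattern:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// lbIngress is a stand-in for the real apiv1.LoadBalancerIngress type.
type lbIngress struct {
	IP       string
	Hostname string
}

func main() {
	// One status slice shared by every goroutine, as in the buggy code path.
	status := []lbIngress{
		{IP: "10.0.0.3"}, {IP: "10.0.0.1"}, {IP: "10.0.0.4"}, {IP: "10.0.0.2"},
	}

	var wg sync.WaitGroup
	for i := 0; i < 20; i++ { // one goroutine per "ingress resource"
		wg.Add(1)
		go func() {
			defer wg.Done()
			// sort.Slice swaps elements of the SAME backing array in place;
			// interleaved swaps from two goroutines can duplicate or drop
			// entries instead of merely reordering them.
			sort.Slice(status, func(a, b int) bool {
				return status[a].IP < status[b].IP
			})
		}()
	}
	wg.Wait()
	fmt.Println(status) // occasionally shows duplicated entries, e.g. 10.0.0.2 twice
}
```

Running this with go run -race makes the race detector flag the concurrent writes, and the interleaved in-place swaps are exactly how duplicated entries (like the duplicated hostname above) can appear.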
Currently the ingress controller tries to update the status of each ingress
resource in parallel using goroutines, and inside each goroutine we
sort the same IngressStatus slice, which is shared between
all goroutines; this corrupts the original slice if several goroutines
sort it at the same time.
So we should do the sorting before passing the slice to each
goroutine, to prevent corrupting the original (sketched below).
fixes: kubernetes#3269
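For illustration, a rough sketch of that approach, reusing the toy lbIngress struct from the previous snippet; updateStatus and the ingress names are hypothetical stand-ins for the controller's real per-resource update:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type lbIngress struct {
	IP       string
	Hostname string
}

// updateStatus is a hypothetical stand-in for the controller's per-ingress
// API call; it only reads the slice it receives.
func updateStatus(name string, addrs []lbIngress) {
	fmt.Println(name, addrs)
}

func main() {
	status := []lbIngress{{IP: "10.0.0.3"}, {IP: "10.0.0.1"}, {IP: "10.0.0.2"}}
	ingresses := []string{"ing-a", "ing-b", "ing-c"} // hypothetical resource names

	// Sort exactly once, before any goroutine is spawned, so the shared
	// slice is never mutated concurrently.
	sort.Slice(status, func(a, b int) bool { return status[a].IP < status[b].IP })

	var wg sync.WaitGroup
	for _, name := range ingresses {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// Defensive copy: even if the update path ever mutated its
			// input, the original slice would stay intact.
			addrs := make([]lbIngress, len(status))
			copy(addrs, status)
			updateStatus(name, addrs)
		}(name)
	}
	wg.Wait()
}
```

Sorting once up front removes the concurrent mutation entirely; the per-goroutine copy is an extra safety net in case an update function ever modifies its input.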