
Missing default backend caused unnecessary reload of nginx (further causing termination of websocket) #4378

Closed
sasavilic opened this issue Jul 29, 2019 · 3 comments


@sasavilic

sasavilic commented Jul 29, 2019

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): No

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.): reload


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

NGINX Ingress controller version: 0.25.0 (but also 0.24.0 and 0.24.1)

Kubernetes version (use kubectl version): v1.13.4

Environment:

  • Cloud provider or hardware configuration: In-house hardware. Running inside a VM (I don't know the exact underlying hardware configuration)
  • OS (e.g. from /etc/os-release): CentOS 7
  • Kernel (e.g. uname -a): Linux TEST-A-APP-001.removed.com 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Manual
  • Others:

What happened:
We expose some of our services publicly, while others require authentication. We also use a sub-domain per service (see report-example.txt). For those sub-domains, we do not define a backend for /.
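For illustration, our setup looks roughly like the following sketch (the names, hosts, and annotations are placeholders, not the actual resources from report-example.txt): two ingresses share the same sub-domain, neither defines a rule for /, so the controller has to pick one of them as the template for the generated default location.

```yaml
# Hypothetical sketch: two ingresses for the same host, no backend for "/".
# One uses external authentication, the other uses a rewrite and no auth.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: foo-example
  namespace: example
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-url: "https://auth.example.com/validate"
spec:
  rules:
  - host: foo.example.com
    http:
      paths:
      - path: /app
        backend:
          serviceName: foo-service
          servicePort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: foo-example-public-api
  namespace: example
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: foo.example.com
    http:
      paths:
      - path: /public-api
        backend:
          serviceName: foo-public-api
          servicePort: 8080
```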

The problem is that such a configuration leads to very frequent nginx reloads, although the ingress resources are fairly constant. These reloads cause websocket connections to be dropped, which is how we noticed the issue.

We inspected the generated nginx configuration after each reload, and the only thing that ever changes is the default route for a sub-domain (e.g. foo.example.com) generated by the nginx-ingress controller. In one case, it takes the rewrite directive from one of the ingresses that use the same sub-domain and sets up nginx authentication for route / (i.e. like for ingress foo-example).

Then, after a reload, we notice that the nginx configuration for route / looks different: this time it uses the redirect from another ingress and has no authentication defined (i.e. like for ingress foo-example-public-api).

This back-and-forth change in the default backend configuration would normally not be an issue for us (since the route leads nowhere anyway), except that it causes nginx to reload, and that leads to websocket termination.

What you expected to happen:
Since the ingress resources are not being changed, the nginx configuration should remain stable and should not trigger a reload.

How to reproduce it (as minimally and precisely as possible):

This is hard to reproduce since it is fairly random. We can get 6, 7, 8 reloads in one minute, and then it might stay stable for a few minutes or only a few seconds. It seems to require a dozen such ingresses to trigger this behaviour. We have 3 namespaces with around 100 services deployed per namespace, and we see this behaviour on around 1-3 ingresses. But this is enough to create a lot of reloads, so using UI apps becomes impossible (since they reload whenever the websocket connection is re-established).

We also noticed that removing the sub-domains leads to a stable nginx configuration (since there is no need to generate the default route). But since we can't stop using sub-domains, we hope that defining a default backend will temporarily work around the issue until this bug is fixed.
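Concretely, the workaround we have in mind is to pin the / location explicitly, so the controller no longer has to pick one of the other ingresses as a template. A minimal sketch (the service name is a placeholder, not something from our cluster):

```yaml
# Hypothetical workaround sketch: add an explicit "/" rule for the sub-domain
# so the generated default location is deterministic and stops flapping.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: foo-example-default
  namespace: example
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: foo.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: some-404-service   # placeholder; any service that can answer "/"
          servicePort: 80
```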

Anything else we need to know:

I would suggest looking at how the default route is generated. It looks like it uses one of the ingresses for a sub-domain as a template to get information about authentication, redirects, etc., and depending on which one comes first, it generates a configuration that is either the same as the current one or different.

I noticed that you introduced sorting of backends in order to avoid unnecessary reloads, but this looks like a corner case that was overlooked. If I knew Go, I would probably be submitting a patch right now :(.

Kind Regards,
Sasa Vilic

@sasavilic sasavilic changed the title Missing default backend caused unnecessary random reloads (causing termination of websocket) Missing default backend caused unnecessary reload of nginx (further causing termination of websocket) Jul 29, 2019
@aledbf
Member

aledbf commented Jul 29, 2019

I noticed that you introduced sorting of backends in order to avoid unnecessary reloads, but this looks like a corner case that was overlooked.

This is not an issue since the introduction of lua (0.18) to handle the balancing (there are no upstream sections in nginx.conf).

I would suggest looking at how the default route is generated.

How are you installing the ingress controller? Since 0.21 the default backend is not required anymore unless you are using a custom default backend.

Then after reload we would notice that nginx configuration for route / looks differently, and this time it uses redirect from another ingress and has no authentication defined (i.e. like for ingress foo-example-public-api).

Please add the flag --v=2 to the ingress controller to get the exact diff of the configuration change that triggers a reload. If you can, please post an example of such a diff (check whether the difference is related to the content of the auth-response-headers annotation).
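For reference, the flag goes into the controller container's command-line arguments. A minimal sketch of the relevant Deployment fragment (assuming the standard nginx-ingress-controller Deployment; your manifest may differ):

```yaml
# Sketch: enable verbose logging on the controller so the nginx.conf diff
# is printed before each reload.
spec:
  template:
    spec:
      containers:
      - name: nginx-ingress-controller
        args:
        - /nginx-ingress-controller
        - --configmap=$(POD_NAMESPACE)/nginx-configuration
        - --v=2   # log the configuration diff that triggers each reload
```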

@sasavilic
Author

sasavilic commented Jul 29, 2019

I noticed that you introduced sorting of backends in order to avoid unnecessary reloads, but this looks like a corner case that was overlooked.

This is not an issue since the introduction of lua (0.18) to handle the balancing (there are no upstream sections in nginx.conf).

But you still need to sort locations, right?

I would suggest looking at how the default route is generated.

How are you installing the ingress controller? Since 0.21 the default backend is not required anymore unless you are using a custom default backend.

I used your Kubernetes template (kubectl apply -f ). We don't have a default backend yet.

Then, after a reload, we notice that the nginx configuration for route / looks different: this time it uses the redirect from another ingress and has no authentication defined (i.e. like for ingress foo-example-public-api).

Please add the flag --v=2 to the ingress controller to get the exact diff of the configuration change that triggers a reload. If you can, please post an example of such a diff (check whether the difference is related to the content of the auth-response-headers annotation).

Unfortunately, I can't give you the complete log because it contains sensitive information. But I am attaching the part of the configuration that changed and the corresponding ingress resource for that nginx configuration. (I have replaced the domain name with company and the namespace with example.)

Just take a diff between before.txt and after.txt and you will see what I am talking about.

before.txt
after.txt
ingress.txt

@aledbf
Member

aledbf commented Sep 28, 2019

Closing. Please update to 0.26.0. The release contains fixes that avoid reloads in multiple scenarios.
Please reopen if the issue persists

@aledbf aledbf closed this as completed Sep 28, 2019