Document how to avoid 502s #34
From @nicksardo on September 26, 2017 16:12: https://serverfault.com/questions/849230/is-there-a-way-to-use-customized-502-page-for-load-balancer
From @esseti on September 29, 2017 9:53: Regarding the repeated 502s, I found out they're due to a mismatch between how long the LB keeps the connection alive and the keepalive timeout the container provides. It's explained here (point 3): https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340 In my case I also added a timeout of 5s to the probe; I'm not sure, but that solved the 502s (I'm using uwsgi).
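For reference, the fix described in that post (point 3) is to make NGINX's keepalive outlast the GCP load balancer's idle timeout. A minimal sketch of the relevant `nginx.conf` directives (the exact values below are the ones quoted later in this thread):

```nginx
http {
    # GCP's HTTP(S) LB reuses backend connections and assumes a long idle
    # timeout. Keep NGINX's keepalive longer than the LB's (~600s), so the
    # LB never sends a request on a connection NGINX has already closed --
    # a classic cause of intermittent 502s.
    keepalive_timeout 650;
    keepalive_requests 10000;
}
```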
I also get Google's 502 HTML error page and would like to understand why, and how to avoid it or customize the response. The backend pods have been running without restarting, but still maybe 1 in 1000 requests returns a 502. Using GKE with an Ingress that sends traffic to an API pod running nginx.
@montanaflynn have you tried this (point 3)? https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340 (I also had to increase the probe timeout; my comments are on that page.)
Using GKE I'm getting these too, and backend services show 2/2 for cluster health and green. Using Ingress with updated liveness and readiness probes, and confirmed 200 responses at the probe URI. Redacted Ingress config:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: staging-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "kubernetes-ingress-stg"
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
  - hosts:
    - eval.redacted-site1.com
    - eval.redacted-site2.com
    secretName: legacy-tls
  rules:
  - host: eval.redacted-site1.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: site1-app
          servicePort: 80
  - host: eval.redacted-site2.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: site2-app
          servicePort: 80
```
@esseti I tried increasing the timeout as suggested but still get 502s. Also, like @mike-saparov, we're using TLS with the ingress (not acme).
I increased keepalive too, per the recommendation, but that didn't fix it.
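Since several comments point at probe timing, here is a hedged sketch of what tuning the readiness probe might look like (the path, port, and values are illustrative assumptions, apart from the 5s probe timeout mentioned earlier in the thread):

```yaml
# Hypothetical Deployment container fragment: give the app ample time to
# answer the health check, so brief slowness doesn't mark the backend
# unhealthy and cause the LB to serve 502s.
readinessProbe:
  httpGet:
    path: /           # the GCE ingress health-checks GET / by default
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5   # the 5s probe timeout @esseti mentioned above
  failureThreshold: 3
```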
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
We're also having these issues. Using NodePort on our services, TLS ingress with kube-lego. We noticed 502s right after a particular message showed up in the logs. Any ideas how to figure out what is causing these 502s?
/remove-lifecycle stale |
I'm seeing these too. Using 1.8.6 with Kubefed trying to set up a federated Ingress (which I think I got set up), but I now keep getting 502s, and there's nothing to log/debug except Stackdriver, which shows the 502s. My backends occasionally say "unhealthy" for some reason, even though I've fulfilled this requirement from the docs:

"Services exposed through an Ingress must serve a response with HTTP 200 status to the GET requests on / path. This is used for health checking. If your application does not serve HTTP 200 on /, the backend will be marked unhealthy and will not get traffic."

https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer
I increased the VM size and the errors subsided. It appears the containers would crash due to memory limits; then, as new ones spun up, the health check failed and the Ingress served up 502s.
No smoking gun, but that solved it for me. You may be under-provisioned.
The simple way to avoid 502s: set up a cluster that hosts only your app, does not use preemptible nodes or cluster node-pool autoscaling, and schedule app downtime for node upgrades.

If you want to avoid 502s but also want cluster autoscaling, preemptible nodes, or zero downtime, you probably need to switch to the nginx ingress controller. Its L7 load balancer lives in the cluster and is able to respond faster and more proactively to events occurring in the cluster. The built-in retry logic also helps.

The GCE ingress controller creates an L7 load balancer that communicates with a Kubernetes NodePort service. If you use the default settings for your service, setting …
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale |
This thread was helpful for eliminating my 502s (crossing my fingers for now). Just in case they come back, though: is it possible to customize the response body? I've dug quite a bit without luck, so I'm guessing no, but asking here to be extra sure. (I noticed the title of this issue originally referenced personalizing 502s as well.)
/lifecycle frozen |
I believe some of these issues will be solved by the new Network Endpoint Group (NEG) load balancing (https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing), since it removes the extra network hop through kube-proxy.
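For reference, container-native load balancing is enabled per Service with an annotation. A sketch, with hypothetical service and app names (the annotation itself is the documented GKE one):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: site1-app
  annotations:
    # Ask GKE to create Network Endpoint Groups so the load balancer
    # targets pod IPs directly, skipping the kube-proxy/NodePort hop
    # mentioned above.
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    app: site1
  ports:
  - port: 80
    targetPort: 8080
```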
would you be able to deploy container-native-load-balancing alongside nginx-ingress-controller? |
Hi, where exactly can I set the two NGINX settings described?

keepalive_timeout 650;
keepalive_requests 10000;

I have an Ingress based on nginx-ingress-controller. How exactly can I pass these to the NGINX used in the image?
I believe you must create a ConfigMap, and that is what overrides the NGINX config. I'm currently using the GCE ingress, but if memory serves, that is what you need to add.
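A hedged sketch of what that ConfigMap might look like for the community `ingress-nginx` controller (the key names are from that project's ConfigMap options and may differ in other NGINX ingress controllers; the name/namespace must match what your controller deployment actually reads):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # must match your controller's --configmap flag
  namespace: ingress-nginx
data:
  # Rendered into nginx.conf as keepalive_timeout / keepalive_requests,
  # outlasting the GCP load balancer's ~600s idle timeout.
  keep-alive: "650"
  keep-alive-requests: "10000"
```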
Here's an example that might get you started, Stefan:
https://github.com/nginxinc/kubernetes-ingress/blob/master/docs/configmap-and-annotations.md
Perfect. Thank you so much @mikesparr
Is the solution here to move to the nginx ingress controller? It seems like a good workaround. Are there any downsides to doing this? Will it still work with a NodePort service? Seems a little crazy to me that this isn't fixed in the GCE controller.
Normally the 502s are from a failed health check, and I've found that playing around with `initialDelaySeconds` on the readiness probe, etc., to provide ample time for the Docker build/deploy reduced them a lot. Furthermore, I Dockerized some legacy PHP stuff (from a merger) that uses sessions, so the health check, in correlation with out-of-memory in the pod forcing destroy/rebuild, are the main causes.

The health check intervals are probably the first place to tweak, but it is trial and error depending on your app. I am directing all our projects to leverage Docker's multi-stage builds (Docker 17.05 and later); in Node apps we saw image size drop from 225MB to 71MB, further speeding up deployments and minimizing health check timeout risk. Golang images are under 10MB in some cases, so they are awesome. ;-)

Hope that helps.
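A hedged sketch of the multi-stage build pattern described above for a Node app (the base images, file paths, and npm scripts are illustrative assumptions):

```dockerfile
# Stage 1: build with the full toolchain
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --production

# Stage 2: ship only the runtime artifacts on a slim base image,
# shrinking the image and speeding up deploys (and readiness)
FROM node:18-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```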
Hi, I'm still experiencing this issue, although I added the …
There are some things you could try.
Ref: #769
I sent feedback that it should be documented that containers must return 200 at / here: https://cloud.google.com/kubernetes-engine/docs/how-to/load-balance-ingress That would improve understanding on that point, I think. Perhaps if more people do that, it will be added there.
Changing `externalTrafficPolicy` from `Local` to `Cluster` fixed seemingly random 502 errors with low- or single-replica deployments for one project.
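A hedged sketch of that change on a NodePort service behind the GCE ingress (service and app names are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: site1-app
spec:
  type: NodePort
  # "Cluster" lets any node proxy the request to a ready pod anywhere in the
  # cluster (extra hop, client source IP is lost); "Local" avoids the hop and
  # preserves the client IP, but a node with no ready local pod fails the
  # LB health check -- a source of random 502s on low-replica deployments.
  externalTrafficPolicy: Cluster
  selector:
    app: site1
  ports:
  - port: 80
    targetPort: 8080
```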
@mikesparr's comment above helped me solve my issue; it was actually surfacing as a CORS error. My app was a 3-replica deployment. This thread was particularly useful in solving the issue. Thank you.
But if we change it to `Cluster`, we cannot preserve the client IP address (the real client IP).
From @esseti on September 20, 2017 9:22
Hello,
I have a problem with the ingress: the 502 page pops up when there are "several" requests. I have JMeter spinning 10 threads 20 times, and I get the 502 more than 50 times over 2000 calls in total (about 2.5%).
Reading the readme, it says this error is probably due to … but the load balancer is already there, so does that mean all the pods serving that URL are busy? Is there a way to avoid the 502 while waiting for a pod to be free?
If not, is there a way to personalize the 502 page? Because I expose APIs in JSON format, I would like to show a JSON error rather than an HTML page.
Copied from original issue: kubernetes/ingress-nginx#1396