[Help wanted] Ingress controller exiting/shutting down unexpectedly #612
Comments
Hi, the liveness probe will send a SIGTERM to the controller if haproxy fails to answer the health check:
Try removing the liveness probe, and also check that haproxy is properly configured.
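Before removing the probe outright, relaxing it is another option. A minimal sketch, assuming the controller's default health endpoint (`/healthz` on port 10253 — an assumption; verify path and port against your own daemonset):

```yaml
# Sketch of a relaxed liveness probe for the haproxy-ingress container.
# Path and port assume the controller defaults; verify them against
# your own manifest before applying.
livenessProbe:
  httpGet:
    path: /healthz
    port: 10253
  initialDelaySeconds: 10
  timeoutSeconds: 5      # give a saturated haproxy more time to answer
  periodSeconds: 10
  failureThreshold: 5    # tolerate a few failed checks before SIGTERM
```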
@jcmoraisjr I am using an empty config map for the IC. Where should I check this?
@jcmoraisjr Where should I check whether HAProxy is properly configured? If you can point to that in the manifest shared above, it would be a great help.
You also didn't configure any resource requests/limits, so it may be that the pod is getting killed because the node is out of memory or something (not necessarily because of this pod; it usually doesn't consume much of anything). This can be checked by inspecting the pod status and the node events.
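For reference, requests/limits on the controller container could look like the sketch below; the numbers are placeholders, not recommendations, and should be sized from observed usage:

```yaml
# Illustrative requests/limits for the haproxy-ingress container;
# the values are placeholders to be tuned against real usage.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    memory: 256Mi   # a memory limit makes OOM kills predictable and visible
```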
Thanks @Unichron, I will fix that too. The only reason I haven't done it yet is that I'm a bit unsure about the resource usage of the HAProxy IC.
@transhapHigsn Well, the controller itself doesn't consume much, except maybe if you have a huge number of kubernetes objects to track (e.g. ingresses, services, pods). The underlying haproxy is also extremely efficient, but its requirements can vary depending on your load. I would suggest looking at the relevant haproxy docs for guidance: http://cbonte.github.io/haproxy-dconv/2.0/management.html#6
The daemonset object has a liveness probe which might be failing for any reason. A failing liveness probe will stop haproxy-ingress pretty much like this. Check also the events, e.g. as sketched below.
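For example, assuming the controller runs in a namespace named `ingress-controller` (an assumption — adjust the namespace and pod name to your setup), the events can be inspected with standard kubectl commands:

```sh
# List recent events in the controller's namespace, newest last
kubectl -n ingress-controller get events --sort-by=.lastTimestamp

# Show status and events for a specific controller pod; look for
# probe failures or an OOMKilled last state
kubectl -n ingress-controller describe pod <haproxy-ingress-pod-name>
```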
Thanks @jcmoraisjr @Unichron for this. I will check if this works for me.
@jcmoraisjr @Unichron It just happened again. The following error is showing up in the events:
I have removed the liveness probe for now, but I have found that this usually happens when the request rate is higher than usual. Will scheduling the IC on multiple nodes help here? Someone suggested that the timeout issue could be due to this: k3s-io/k3s#1266. What do you think?
Hi, your haproxy proxies might be saturated and taking too much time to answer requests, including the health check. If you're not using v0.10 (beta, but good enough for production), give it a chance (see any backward compatibility issues in the changelog) and configure prometheus, doc here. Scheduling a few more controllers should help. On the other hand, if your current ingress nodes have the same number of cores as the number of threads or fewer (defaults to 2 since v0.8, doc here), you can upgrade your host spec. Note also that increasing the number of threads doesn't increase the maximum conn/s and req/s at the same rate; e.g. in our environment we cannot see any gain using 5 threads or more, so we increase the number of controllers instead. A sketch of tuning the thread count follows below.
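For illustration, the thread count can be raised through the controller's configmap; a minimal sketch, assuming the `nbthread` configuration key from the haproxy-ingress docs and a hypothetical configmap name and namespace:

```yaml
# Minimal sketch: raise haproxy's thread count via the controller
# configmap. Name and namespace are assumptions; "nbthread" is the
# configuration key documented by haproxy-ingress for thread tuning.
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-ingress
  namespace: ingress-controller
data:
  nbthread: "4"   # keep at or below the core count of the ingress nodes
```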
@jcmoraisjr I will check this out and get back to you.
@jcmoraisjr After making the above changes, performance has been consistent and optimal, and I haven't seen any of the previous errors. However, today I observed that requests were not forwarded to a newly spawned running pod in a deployment (of 2 replicas), which led to increased request timeouts for some time. I am not exactly sure what caused it, and I am not able to replicate it. Thanks for all your help.
I am using the HAProxy IC with k3s, but I am seeing unexpected shutdowns/exits of the ingress controller. I have increased the verbosity level to see if there is any resource-level error, but I didn't find anything.
K3S version: v1.18.2+k3s1 (without traefik)
Ingress controller manifest:
Exiting logs:
@jcmoraisjr Can you help me out on this? I am not sure how to debug this either.