From 145dc50e5e86cb708ac5b6f3022bd48fd8b6acbe Mon Sep 17 00:00:00 2001
From: Fernando Diaz
Date: Tue, 28 Aug 2018 13:00:19 -0500
Subject: [PATCH] Enhance Troubleshooting Documentation

Enhances the troubleshooting documentation by adding a whole list
of basic ways to troubleshoot. Also cleans up previous sections and
adds information about the gdb.

Fixes #2952
---
 docs/troubleshooting.md | 200 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 176 insertions(+), 24 deletions(-)

diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 4001ac3442..65e4d009fc 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -6,41 +6,114 @@ Do not move it without providing redirects.
 -----------------------------------------------
 -->
 
-# Debug & Troubleshooting
+# Troubleshooting
 
-## Debug
+## Ingress-Controller Logs and Events
 
-Using the flag `--v=XX` it is possible to increase the level of logging.
-In particular:
+There are many ways to troubleshoot the ingress-controller. The following are basic troubleshooting
+methods to obtain more information.
 
-- `--v=2` shows details using `diff` about the changes in the configuration in nginx
+Check the Ingress Resource Events
+```
+$ kubectl get ing -n <namespace>
+NAME           HOSTS      ADDRESS     PORTS   AGE
+cafe-ingress   cafe.com   10.0.2.15   80      25s
+
+$ kubectl describe ing -n <namespace>
+Name:             cafe-ingress
+Namespace:        default
+Address:          10.0.2.15
+Default backend:  default-http-backend:80 (172.17.0.5:8080)
+Rules:
+  Host      Path  Backends
+  ----      ----  --------
+  cafe.com
+            /tea      tea-svc:80 (<none>)
+            /coffee   coffee-svc:80 (<none>)
+Annotations:
+  kubectl.kubernetes.io/last-applied-configuration:  {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{},"name":"cafe-ingress","namespace":"default","selfLink":"/apis/extensions/v1beta1/namespaces/default/ingresses/cafe-ingress"},"spec":{"rules":[{"host":"cafe.com","http":{"paths":[{"backend":{"serviceName":"tea-svc","servicePort":80},"path":"/tea"},{"backend":{"serviceName":"coffee-svc","servicePort":80},"path":"/coffee"}]}}]},"status":{"loadBalancer":{"ingress":[{"ip":"169.48.142.110"}]}}}
+
+Events:
+  Type    Reason  Age   From                      Message
+  ----    ------  ----  ----                      -------
+  Normal  CREATE  1m    nginx-ingress-controller  Ingress default/cafe-ingress
+  Normal  UPDATE  58s   nginx-ingress-controller  Ingress default/cafe-ingress
+```
 
-```console
-I0316 12:24:37.581267 1 utils.go:148] NGINX configuration diff a//etc/nginx/nginx.conf b//etc/nginx/nginx.conf
-I0316 12:24:37.581356 1 utils.go:149] --- /tmp/922554809 2016-03-16 12:24:37.000000000 +0000
-+++ /tmp/079811012 2016-03-16 12:24:37.000000000 +0000
-@@ -235,7 +235,6 @@
+Check the Ingress Controller Logs
+```
+$ kubectl get pods -n <namespace>
+NAME                                        READY   STATUS    RESTARTS   AGE
+nginx-ingress-controller-67956bf89d-fv58j   1/1     Running   0          1m
+
+$ kubectl logs -n <namespace> nginx-ingress-controller-67956bf89d-fv58j
+-------------------------------------------------------------------------------
+NGINX Ingress controller
+  Release:    0.14.0
+  Build:      git-734361d
+  Repository: https://github.com/kubernetes/ingress-nginx
+-------------------------------------------------------------------------------
+....
+```
 
-     upstream default-http-svcx {
-         least_conn;
--        server 10.2.112.124:5000;
-         server 10.2.208.50:5000;
+Check the Nginx Configuration
+```
+$ kubectl get pods -n <namespace>
+NAME                                        READY   STATUS    RESTARTS   AGE
+nginx-ingress-controller-67956bf89d-fv58j   1/1     Running   0          1m
+
+$ kubectl exec -it -n <namespace> nginx-ingress-controller-67956bf89d-fv58j -- cat /etc/nginx/nginx.conf
+daemon off;
+worker_processes 2;
+pid /run/nginx.pid;
+worker_rlimit_nofile 523264;
+worker_shutdown_timeout 10s;
+events {
+    multi_accept        on;
+    worker_connections  16384;
+    use                 epoll;
+}
+http {
+....
+```
 
-     }
-I0316 12:24:37.610073 1 command.go:69] change in configuration detected. Reloading...
+Check if used Services Exist
+```
+$ kubectl get svc --all-namespaces
+NAMESPACE     NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
+default       coffee-svc             ClusterIP   10.106.154.35    <none>        80/TCP          18m
+default       kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP         30m
+default       tea-svc                ClusterIP   10.104.172.12    <none>        80/TCP          18m
+kube-system   default-http-backend   NodePort    10.108.189.236   <none>        80:30001/TCP    30m
+kube-system   kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP   30m
+kube-system   kubernetes-dashboard   NodePort    10.103.128.17    <none>        80:30000/TCP    30m
 ```
 
-- `--v=3` shows details about the service, Ingress rule, endpoint changes and it dumps the nginx configuration in JSON format
-- `--v=5` configures NGINX in [debug mode](http://nginx.org/en/docs/debugging_log.html)
+## Debug Logging
 
-## Troubleshooting
+Using the flag `--v=XX` it is possible to increase the level of logging. This is performed by editing
+the deployment.
+```
+$ kubectl get deploy -n <namespace>
+NAME                       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
+default-http-backend       1         1         1            1           35m
+nginx-ingress-controller   1         1         1            1           35m
+
+$ kubectl edit deploy -n <namespace> nginx-ingress-controller
+# Add --v=X to "- args", where X is an integer
+```
 
-### Authentication to the Kubernetes API Server
+- `--v=2` shows details using `diff` about the changes in the configuration in nginx
+- `--v=3` shows details about the service, Ingress rule, endpoint changes and it dumps the nginx configuration in JSON format
+- `--v=5` configures NGINX in [debug mode](http://nginx.org/en/docs/debugging_log.html)
+
+## Authentication to the Kubernetes API Server
 
 A number of components are involved in the authentication process and the first step is to narrow
-down the source of the problem, namely whether it is a problem with service authentication or with the kubeconfig file.
+down the source of the problem, namely whether it is a problem with service authentication or
+with the kubeconfig file.
+
 Both authentications must work:
 
 ```
@@ -88,6 +161,7 @@ Kubernetes Workstation
 ```
 
 ### Service Account
+
 If using a service account to connect to the API server, Dashboard expects the file
 `/var/run/secrets/kubernetes.io/serviceaccount/token` to be present. It provides a secret
 token that is required to authenticate with the API server.
@@ -177,6 +251,84 @@ More information:
 * [User Guide: Service Accounts](http://kubernetes.io/docs/user-guide/service-accounts/)
 * [Cluster Administrator Guide: Managing Service Accounts](http://kubernetes.io/docs/admin/service-accounts-admin/)
 
-### Kubeconfig
-If you want to use a kubeconfig file for authentication, follow the deploy procedure and
-add the flag `--kubeconfig=/etc/kubernetes/kubeconfig.yaml` to the deployment
+## Kube-Config
+
+If you want to use a kubeconfig file for authentication, follow the [deploy procedure](../docs/deploy/index.md) and
+add the flag `--kubeconfig=/etc/kubernetes/kubeconfig.yaml` to the args section of the deployment.
+
+## Using GDB with Nginx
+
+[Gdb](https://www.gnu.org/software/gdb/) can be used with nginx to perform a configuration
+dump. This allows us to see which configuration is being used, as well as older configurations.
+
+Before starting, make sure that nginx was built with `--with-debug`. See the `Debug Logging` section above.
+Note: The below is based on the nginx [documentation](https://docs.nginx.com/nginx/admin-guide/monitoring/debugging/#dumping-nginx-configuration-from-a-running-process).
+
+1. SSH into the worker
+    ```
+    $ ssh user@workerIP
+    ```
+
+2. Obtain the Docker Container Running nginx
+    ```
+    $ docker ps | grep nginx-ingress-controller
+    CONTAINER ID   IMAGE                                                             COMMAND                  CREATED          STATUS          PORTS   NAMES
+    d9e1d243156a   quay.io/kubernetes-ingress-controller/nginx-ingress-controller   "/usr/bin/dumb-init …"   19 minutes ago   Up 19 minutes           k8s_nginx-ingress-controller_nginx-ingress-controller-67956bf89d-mqxzt_kube-system_079f31ec-aa37-11e8-ad39-080027a227db_0
+    ```
+
+3. Exec into the container
+    ```
+    $ docker exec -it --user=0 --privileged d9e1d243156a bash
+    ```
+
+4. Make sure nginx was built with `--with-debug`
+    ```
+    $ nginx -V 2>&1 | grep -- '--with-debug'
+    ```
+
+5. Install gdb
+    ```
+    $ apt-get update; apt-get install gdb -y
+    ```
+
+6. Get list of processes running on container
+    ```
+    $ ps -ef
+    UID        PID  PPID  C STIME TTY          TIME CMD
+    root         1     0  0 20:23 ?        00:00:00 /usr/bin/dumb-init /nginx-ingres
+    root         5     1  0 20:23 ?        00:00:05 /nginx-ingress-controller --defa
+    root        21     5  0 20:23 ?        00:00:00 nginx: master process /usr/sbin/
+    nobody     106    21  0 20:23 ?        00:00:00 nginx: worker process
+    nobody     107    21  0 20:23 ?        00:00:00 nginx: worker process
+    root       172     0  0 20:43 pts/0    00:00:00 bash
+    ```
+
+7. Attach gdb to the nginx master process
+    ```
+    $ gdb -p 21
+    ....
+    Attaching to process 21
+    Reading symbols from /usr/sbin/nginx...done.
+    ....
+    (gdb)
+    ```
+
+8. Copy and paste the following:
+    ```
+    set $cd = ngx_cycle->config_dump
+    set $nelts = $cd.nelts
+    set $elts = (ngx_conf_dump_t*)($cd.elts)
+    while ($nelts-- > 0)
+    set $name = $elts[$nelts]->name.data
+    printf "Dumping %s to nginx_conf.txt\n", $name
+    append memory nginx_conf.txt \
+            $elts[$nelts]->buffer.start $elts[$nelts]->buffer.end
+    end
+    ```
+
+9. Quit GDB by pressing CTRL+D
+
+10. Open nginx_conf.txt
+    ```
+    cat nginx_conf.txt
+    ```
\ No newline at end of file
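
The Debug Logging section in this patch raises verbosity by opening the deployment in `kubectl edit` and adding `--v=X` by hand. The same change could probably be scripted with `kubectl patch` and a JSON Patch. The sketch below is only an illustration and is not part of the patch: `build_verbosity_patch` is a hypothetical helper name, and it assumes the controller is the first container in the pod template and already has an `args` list.

```shell
#!/bin/sh
# Hypothetical helper (not part of ingress-nginx): prints a JSON Patch that
# appends --v=<level> to the args of the first container in the pod template.
# Assumption: the ingress controller is container index 0 and has an args list.
build_verbosity_patch() {
  level="$1"
  # Reject empty or non-numeric levels so a malformed flag is never applied.
  case "$level" in
    ''|*[!0-9]*)
      echo "verbosity level must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--v=%s"}]\n' "$level"
}

# Usage (needs cluster access; set NAMESPACE to the controller's namespace):
#   kubectl patch deploy nginx-ingress-controller -n "$NAMESPACE" \
#     --type=json -p "$(build_verbosity_patch 2)"
```

The patch only appends, so if the deployment already carries a `--v` argument the flag would appear twice; in that case editing the existing entry with `kubectl edit` is the simpler route.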