Ingress-GCE has a nil pointer exception #471
Comments
/kind bug |
Is there a way to verify whether this is occurring in a GKE cluster? Since I don't think we have access to the logs, we can't check for the exception listed in #434, but we'd like to make sure this is the issue before recreating Ingresses. |
Unfortunately, no. If you are able to update your Ingress and see the changes reflected in GCP, then you should be fine. Otherwise, you are most likely hitting this issue. Also note that this is only happening in GKE clusters above version 1.10.6. |
I'm pulling my hair out trying to figure out why our ingresses suddenly stopped being fulfilled by the ingress controllers. Normally I've found a very reasonable explanation (quotas, etc.), but this time I'm relatively sure we're running into this bug. Our Kubernetes master version is 1.10.6-gke.2. We've tried deleting every ingress and recreating them, to no avail. Is there a time period I should wait before recreating the ingresses? I waited roughly 5 minutes this first time. |
@poor-bob Email me your project name, cluster name and location of the cluster and I'll take a look. If you deleted and recreated every ingress I would think that you would not be running into this specific issue. |
@rramkumar1 I have disabled the default GKE loadbalancer-controller and installed this version JUST to see logs. Indeed I am experiencing this issue:
If I delete and recreate each and every ingress from my project, can I expect to get past this nil pointer dereference issue? |
@addisonbair Theoretically yes. Since you installed another instance to get logs, you should be able to find out if that indeed works for you. |
Not related to the nil pointer issue, but my default backend disappears (both the service and deployment) without a trace:
Is there any way I can debug this? |
@addisonbair Can you file a separate issue for that and explain how exactly you are using the script in deploy/glbc? |
Will do. 👍 I believe I have a fix and in the process uncovered a possible bug with the yaml manifests. Since I don't have access to the masters (GKE) I can't be completely sure, but it appears there is a conflict between the Addon-manager running on the master and the annotations on the objects within |
After deleting all my Ingresses, I am unfortunately still seeing the NPE. Is there a known working pre-release image that I can use? Thank you!! |
@addisonbair We are in the process of building a patch with the fix and pushing it out. This will enable you to start testing the fix. Keep in mind that this does not mean it is released in GKE. You will still have to wait for an official GKE rollout and upgrade your cluster to get the fix. Will let you know when the patch is ready to pull down. |
Awesome. Thank you! |
Just a quick update: I managed to build an image from I'm happy to test a more formal image when it is ready. Thanks so much for the help! |
@addisonbair Thanks, that's great to hear! We just pushed v1.3.3 so please let us know if that works as well. This would be the version that would officially be rolled out as part of a new GKE version. |
Built, pushed and deployed v1.3.3 on my Thanks so much! |
Quick update for all tracking this issue. Hopefully the GKE rollout for the fix ends this week. I will ping this thread with the GKE version everyone should upgrade to once rollout is complete. |
Figured I would comment to clarify, in case anyone else is new to k8s, since I was and it wasn't clear: you need to delete the LB as well as the services associated with it. Once I did that and recreated the services and ingress, I was able to work around this bug. Definitely not a great long-term solution, but it works until the patch is ready. Thanks @rramkumar1 for your assistance and confirmation of my issue! |
The GKE team released a new version, |
@laupow Thanks for the update. The fix should be rolled out as part of 1.10.7-gke.2 and 1.11.2-gke.4. /close |
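The two version thresholds named in the comment above can be checked mechanically. The following is a minimal sketch, not an official tool: it assumes GKE master versions follow the `MAJOR.MINOR.PATCH-gke.BUILD` pattern seen in this thread, and the names `FIXED_IN` and `includes_fix` are hypothetical. It returns `None` for release lines the thread does not mention rather than guessing, and it says nothing about whether a given cluster was affected in the first place.

```python
import re
from typing import Optional

# First GKE versions reported in this thread as carrying the fix,
# keyed by (major, minor) release line -> (patch, gke build).
FIXED_IN = {
    (1, 10): (7, 2),  # 1.10.7-gke.2
    (1, 11): (2, 4),  # 1.11.2-gke.4
}

# Assumed version format, e.g. "1.10.6-gke.2".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def includes_fix(version: str) -> Optional[bool]:
    """Return True/False for release lines named in the thread, None otherwise."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"not a GKE version string: {version!r}")
    major, minor, patch, build = (int(part) for part in match.groups())
    threshold = FIXED_IN.get((major, minor))
    if threshold is None:
        return None  # the thread gives no information for this release line
    # Tuple comparison: patch number first, then the gke build number.
    return (patch, build) >= threshold
```

For example, `includes_fix("1.10.6-gke.2")` is `False` and `includes_fix("1.10.7-gke.2")` is `True`.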
@rramkumar1: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@rramkumar1 Are you sure this fix is live with 1.10.7-gke.2? I got this response from GCP Support last week:
|
@ericuldall I'm not sure why GCP support told you that. They may have gotten confused about something else. Do you not see 1.10.7-gke.2 as a viable version? |
I see it available, just unclear if the fix is actually deployed to that version or not. |
Yes, the fix is available in that version. |
Yes, I deployed it and my ingress was updated :D thanks for confirming! |
@rramkumar1 I have a cluster running on 1.10.6-gke.2 and I replaced one of the ingresses, then it got stuck in 'creating ingress'. I just found this thread this morning, and accordingly, deleted and then re-created the ingress but it's still showing up as 'creating ingress' in the GCP dashboard. Any ideas? |
@bschwartz757 You can upgrade to 1.10.7-gke.2. See above discussion |
@rramkumar1 ok..... anything that doesn't involve upgrading? |
@bschwartz757 Upgrading is the only supported way to get these kinds of fixes. If you don't want to upgrade, you can also run the script we have in deploy/. Note that this script is somewhat dangerous to run in production (and as a result, we don't officially support it), but it does allow you to modify the version of your ingress-gce controller without having to depend on GKE for upgrades. |
@rramkumar1 Just letting you know that I am also running into this issue on 1.11.5-gke.5. The ingress is stuck in 'creating ingress', and I have deleted and recreated it. Should I delete the nodes and cluster and recreate them at 1.10.7-gke.2? |
@Arconapalus This issue is already fixed for that version so you might be running into a separate issue. Can you please file a separate issue for this? |
@rramkumar1 Yes I can. |
@rramkumar1 I am also experiencing this issue on 1.11.6-gke.2. When I look at events of ingress details, there is no message at all. My ingress file is pretty simple as below.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
  namespace: build
spec:
  rules:
  - http:
      paths:
      - path: /jenkins
        backend:
          serviceName: jenkins-ui
          servicePort: 8080
      - path: /nexus
        backend:
          serviceName: nexus-ui
          servicePort: 8081
|
@rramkumar1 I'm having a similar issue to the one you have described in #605. I have sent you an e-mail with my setup but I'm also happy to continue the conversation online. |
We are aware of a nil pointer issue in v1.3.2. This bug was actually fixed in #434 but did not make it into the 1.3 release branch. Since this nil pointer crashes the controller, the issue is not surfaced to users other than Ingresses not being synced.
The current workaround is to delete the Ingress that is not being synced and recreate it. A fix will be coming in the next 1.3 patch release (v1.3.3).
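The delete-and-recreate workaround could be scripted roughly as follows. This is only a sketch under assumptions: it presumes the Ingress was created from a manifest file you still have (the path `ingress.yaml` in the usage note is hypothetical), that `kubectl` is on the PATH and pointed at the right cluster, and the function names are made up for illustration. It defaults to a dry run that only prints the commands. Commenters in the thread report mixed results with this workaround, and one found that the load balancer and its associated services also had to be deleted.

```python
import subprocess
from typing import List


def recreate_commands(manifest_path: str) -> List[List[str]]:
    """Build the kubectl invocations that delete and then recreate the Ingress."""
    return [
        ["kubectl", "delete", "-f", manifest_path],
        ["kubectl", "apply", "-f", manifest_path],
    ]


def recreate(manifest_path: str, dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them in order, stopping on failure."""
    commands = recreate_commands(manifest_path)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError if kubectl exits non-zero.
            subprocess.run(argv, check=True)
    return commands
```

Calling `recreate("ingress.yaml")` prints the two commands without touching the cluster; pass `dry_run=False` to actually run them.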