Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCE instance groups are not updated when nodes are added/removed #1012

Closed
nikhiljindal opened this issue Jul 25, 2017 · 10 comments · Fixed by #1024
Closed

GCE instance groups are not updated when nodes are added/removed #1012

nikhiljindal opened this issue Jul 25, 2017 · 10 comments · Fixed by #1024
Assignees

Comments

@nikhiljindal
Copy link
Contributor

Looking at the code (https://github.com/kubernetes/ingress/blob/a58b80017170eecbe8b2d6573b66192cafe0d32a/controllers/gce/controller/controller.go#L177), it looks like we are not doing anything when nodes are added and deleted. We should be updating the instance group when that happens.

Am I missing something?

@nikhiljindal
Copy link
Contributor Author

@nicksardo pointed out that the logic was removed in c7c2a56#diff-9141c651905f3492033cf255f8e12fd7L176.

@aledbf Was that intentional?

@nicksardo
Copy link
Contributor

Quick FYI: the instance groups are eventually synced when the resync period occurs for any ingress. This may explain why nobody has noticed this in practice.

@nicksardo
Copy link
Contributor

This still occurs despite having the fix. I'm guessing that the handler is called when the node is added; however, it's NotReady state prevents the instance group from being updated. I don't know why the handler isn't called again when the node becomes Ready.

@aledbf
Copy link
Member

aledbf commented Aug 1, 2017

@nicksardo maybe because the handler does not have an UpdateFunc?

@nicksardo
Copy link
Contributor

Ah, that would certainly do it. According to the comment, we don't want to have an UpdateFunc due to the frequency. @nikhiljindal and I need to look more closely at the instance sync logic.

@nikhiljindal
Copy link
Contributor Author

If there was a way to find out the update, we could add an UpdateFunc that adds the node to queue only if there has been a "relevant change" in node. Node's heartbeat being updated, for example, is not a "relevant change".

@nicksardo
Copy link
Contributor

Right, not a quick change though.

@nicksardo
Copy link
Contributor

Looks like removing nodes from instances groups has never worked.

I0803 00:20:41.239780       5 instances.go:234] Syncing nodes [gke-testingress-default-pool-9818f010-zcxl gke-testingress-default-pool-9818f010-h3b8 gke-testingress-default-pool-9818f010-jk3f]
I0803 00:20:41.367945       5 instances.go:265] Removing nodes from IG: [gke-testingress-default-pool-9818f010-6znn gke-testingress-default-pool-9818f010-dz9r]
E0803 00:20:41.368941       5 instances.go:191] Failed to get zones for gke-testingress-default-pool-9818f010-6znn: node not found gke-testingress-default-pool-9818f010-6znn, skipping
E0803 00:20:41.369292       5 instances.go:191] Failed to get zones for gke-testingress-default-pool-9818f010-dz9r: node not found gke-testingress-default-pool-9818f010-dz9r, skipping
I0803 00:20:41.369599       5 utils.go:212] Syncing gke-testingress-default-pool-9818f010-8ffx
I0803 00:20:41.369912       5 instances.go:234] Syncing nodes [gke-testingress-default-pool-9818f010-zcxl gke-testingress-default-pool-9818f010-h3b8 gke-testingress-default-pool-9818f010-jk3f]
I0803 00:20:41.479824       5 instances.go:265] Removing nodes from IG: [gke-testingress-default-pool-9818f010-6znn gke-testingress-default-pool-9818f010-dz9r]
E0803 00:20:41.480328       5 instances.go:191] Failed to get zones for gke-testingress-default-pool-9818f010-6znn: node not found gke-testingress-default-pool-9818f010-6znn, skipping
E0803 00:20:41.481218       5 instances.go:191] Failed to get zones for gke-testingress-default-pool-9818f010-dz9r: node not found gke-testingress-default-pool-9818f010-dz9r, skipping
I0803 00:20:46.453664       5 utils.go:212] Syncing gke-testingress-default-pool-9818f010-dz9r
I0803 00:20:46.454195       5 instances.go:234] Syncing nodes [gke-testingress-default-pool-9818f010-h3b8 gke-testingress-default-pool-9818f010-jk3f gke-testingress-default-pool-9818f010-zcxl]
I0803 00:20:46.545582       5 utils.go:212] Syncing gke-testingress-default-pool-9818f010-6znn
I0803 00:20:46.546366       5 instances.go:234] Syncing nodes [gke-testingress-default-pool-9818f010-h3b8 gke-testingress-default-pool-9818f010-jk3f gke-testingress-default-pool-9818f010-zcxl]

@nicksardo nicksardo mentioned this issue Aug 3, 2017
9 tasks
@nicksardo
Copy link
Contributor

@bowei
Copy link
Member

bowei commented Oct 11, 2017

This issue was moved to kubernetes/ingress-gce#50

@bowei bowei closed this as completed Oct 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants