-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not Delete LB in Case of Security Group Reconciliation Errors #743
Do not Delete LB in Case of Security Group Reconciliation Errors #743
Conversation
Welcome @multi-io! |
Hi @multi-io. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Build succeeded.
|
/ok-to-test |
Build succeeded.
|
Build failed.
|
Build succeeded.
|
Build succeeded.
|
Build failed.
|
Build failed.
|
Build failed.
|
Build failed.
|
Build succeeded.
|
Build succeeded.
|
afd44da
to
0942580
Compare
force-pushed commit message typo fix |
Build failed.
|
Build succeeded.
|
Build succeeded.
|
Build succeeded.
|
Build failed.
|
Build succeeded.
|
Build succeeded.
|
Build succeeded.
|
0942580
to
f365c4e
Compare
Build failed.
|
Build succeeded.
|
force-pushed better error message. I'll stop pushing now unless asked to do it. |
Build succeeded.
|
Build succeeded.
|
Build succeeded.
|
Build failed.
|
Build failed.
|
Build failed.
|
Build failed.
|
Build succeeded.
|
Build succeeded.
|
Build succeeded.
|
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adisky The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Yes please, I don't think the CI job failure is related to this PR. |
BTW, is there someone working on fixing the CI? |
@lingxiankong for conformance we are working, openlab team is looking on octavia job |
…e_lb_on_errors Do not Delete LB in Case of Security Group Reconciliation Errors
What this PR does / why we need it:
This fixes the lbaas control loop's EnsureLoadBalancer() function so it no longer deletes the LB if something went wrong when reconciling the LB's security groups. With the current master, if you have an LB service and associated LB already up and running and working fine, and then during a reconcile loop (which shouldn't change anything) e.g. the OpenStack API is down temporarily at the wrong moment (i.e. if it's still up during the LB and listener reconciliation, but then down during the SG reconciliation), then the whole LB will be deleted. We saw this exact thing happen in a real world customer application, which went offline because of if (the LB is recreated shortly after, but likely with a different floating IP).
Deleting the LB in case of errors in a "reconcile" (rather than "create") function seems just wrong, and all the other parts of EnsureLoadBalancer() don't do it either: E.g. if a transient error occurs when creating a listener, we just return it and leave the LB in a half-created state (
cloud-provider-openstack/pkg/cloudprovider/providers/openstack/openstack_loadbalancer.go
Lines 1033 to 1036 in 8f386af
This PR just fixes the SG reconciliation to follow the same pattern. It seems to me that the current "delete LB in case an an error" approach was originally not part of a "reconcile" function but of a "create" function, where it would've made more sense.
The same bug is present in the legacy in-tree openstack cloud provider; I've submitted a corresponding PR there (kubernetes/kubernetes#82264).
Which issue this PR fixes
kubernetes/kubernetes#35056
Special notes for your reviewer:
Release note: