-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix networking resources deletion of child resources during adoption #3136
Fix networking resources deletion of child resources during adoption #3136
Conversation
if err != nil { | ||
return errors.Wrap(err, "failed during comparison for embeddedInboundNatRuleJSON and inboundNatRuleJSON") | ||
} | ||
// TODO: Can't do a trivial fuzzyEqualityComparison here because we don't know which fields are readonly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my least favorite part of this change. I think it's pretty safe though because before we were using the child resource shape and the parent resource shape and the assertion here was effectively "are they actually the same".
Now we're using the parent resource shape (from GET) and putting that into the parent resource shape (for PUT). Since it's the same resource w/ the same API version, the properties are guaranteed to line-up structurally. Properties that exist on GET but not on PUT are read-only, so it's OK for them to be omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. A few comments, including a possibility to simplify the code using reflection.
func getRawInboundNatRulesField(lb map[string]any) ([]any, error) { | ||
props, ok := lb["properties"] | ||
if !ok { | ||
return nil, errors.Errorf("couldn't find properties field on LoadBalancer") | ||
} | ||
|
||
propsMap, ok := props.(map[string]any) | ||
if !ok { | ||
return nil, errors.Errorf("properties field wasn't a map") | ||
} | ||
|
||
inboundNatRules, ok := propsMap["inboundNatRules"] | ||
if !ok { | ||
return nil, errors.Errorf("couldn't find inboundNatRules field on LoadBalancer") | ||
} | ||
|
||
inboundNatRulesSlice, ok := inboundNatRules.([]any) | ||
if !ok { | ||
return nil, errors.Errorf("inboundNatRules field wasn't a slice") | ||
} | ||
|
||
return inboundNatRulesSlice, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid the repetition here? What about defining a dynamicObject
as an alias for map[string]any
as a container for some helper methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored things to share a bunch more code.
LoadBalancer is a bit different because it needs to merge rather than overwrite, and that got more complicated with this refactor because now we're looking at Azure state which means that we need to deal with conflicts consistently (previously we merged state from LB in ASO and Routes in ASO, now we're merging LB from ASO w/ LB from Azure, and since LB from Azure will have the routes defined on LB in ASO we have to make sure we choose the ASO ones). So specifically for loadbalancer it doesn't use all of the shared logic it has a bit of its own for that, but the rest of the examples have significantly reduced duplication and even LB makes use of a good bit of that.
if !foundExistingRule { | ||
resultSlice = reflect.Append(resultSlice, inboundNatRule) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is order significant? This puts all the extras at the end of the slice, so any that are known by ASO will be promoted in front of the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good question. Let me check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order doesn't seem to matter. The networking RP seems to return the collection in order of creation time, so the oldest NatRules are first and the newer ones are last. Whatever order you supply rules to it in, that doesn't change.
For our purposes because we're doing a merge here based on name, it doesn't make a different to us I don't think, so the order we end up with here should be OK.
617f0f1
to
44fc044
Compare
/ok-to-test sha=44fc044 |
Networking resources have a structure where a PUT on the parent also can delete child resources (or add them). ASO attempts to allow users to manage these networking child resources independent of the owner. In order to do that, when building the PUT payload for the parent resource we used the
ownerReferences.uid
field to look up the child resources owned by the parent resource we were about to send and include them all in the PUT payload of the owner so that they are not deleted.The issue is that with adoption, parent + child already exist in Azure, but
ownerReferences.uid
isn't set yet in Kubernetes, which means the PUT of the parent doesn't include any reference to its children and so PUT parent ends up deleting the children.This issue impacts the following resources:
We investigated using
.spec.owner
instead ofownerReferences
, which works, but inconsistently as there is still a race in Kubernetes client caching when the parent is created during adoption. Sometimes it realizes that it has children and sometimes it doesn't.Instead, we rely on Azure itself, issuing a GET parent while building the PUT payload. We use the exact contents of the GET response as the PUT payload for the child resource field (
subnets
for VNET,rules
for LoadBalancer, etc). This ensures that we don't ever delete any child resource when PUT-ing the parent, even if we're adopting that parent for the first time.Note that technically GET + PUT is subject to a dual-writer race, which we could use ETag and
If-Match
headers to prevent. We didn't implement that here because:Given the above it didn't make sense to bother with optimistic concurrency.
Closes #3077
If applicable: