-
Notifications
You must be signed in to change notification settings - Fork 172
Cherrypick: Handle source object overwriting conflict #1115
Conversation
For source objects with a conflicting source in the ancestors, treat them as propagated objects and overwrite them with the source in the ancestor. See http://bit.ly/hnc-propagation-conflict Tested by `make test` with a new integration test case.
Update object webhook to deny creating an object if an object with the same name and type already exits in the descendants, and the type has 'Propagate' syncMode. Add new a new test case. Tested manually on GKE cluster and by 'make test'. The webhook output is like this: Error from server (Conflict): admission webhook "objects.hnc.x-k8s.io" denied the request: Cannot create /v1, Kind=Secret 'my-creds' in namespace 'acme-org' because it would overwrite objects in the following descendant namespace(s): [team-a team-b]. Choose a different name for the object, or remove the conflicting objects from these namespaces.
Update the 5-millisecond sleep to 500-millisecond before detecting object overwriting for object reconciler test. This should fix the flaky test on Prow presubmit. Tested by 'make test'. It was not flaky before and after the change locally but may help for Prow.
Add an additional rule to make sure there's no potential object overwriting in hierarchy changes. Format the webhook message to: " Cannot update hierarchy because it would overwrite the following object(s): * Namespace "team-b": my-creds (/v1, Kind=Secret) * Namespace "team-b": my-creds2 (/v1, Kind=Secret) To fix this, please rename or remove the conflicting objects first. " Format the object webhook message to: " Cannot create "my-creds2" (/v1, Kind=Secret) in namespace "acme-org" because it would overwrite objects in the following descendant namespace(s): * team-a * team-b To fix this, choose a different name for the object, or remove the conflicting objects from the above namespaces. " Tested manually on GKE cluster and by 'make test' with new test cases.
/assign @adrianludwin This is the cherry-pick of the Object and HierarchyConfig webhooks to HNC v0.5, to prevent creating conflicting objects and changing parents that would cause this conflict. |
/retest |
This has four commits, but I thought it was only supposed to have two. Is that intended? |
Yes, besides the object & hc webhook changes, there are also one for object reconciler behavior change to overwriting conflicting objects (this is the first commit), and another one to make the test less flaky with a longer sleep time for the new tests included. |
Given that this is a change onto a stable branch, can you please do a bit more testing on it? It would be nice to see the full suite of e2e tests at least. Thanks! Other than that this lgtm. @rjbez17 , what do you think of checking this into v0.5.3? It solves most of the problem that we identified that you can have silently overridden policies while not being so dangerous that they all get overwritten. The one dangerous part of this change is that if you turn on propagation for a new object type, any conflicting objects will get swept away; we'll fix that in v0.6 but it's too hard to fix in v0.5.x. But given the overall stability of HNC, I think that's an acceptable compromise. |
Sure, will do. BTW just confirming that we don't have e2e test checked in in v0.5, I will have to manually add the e2e test files on my local branch to test, right? @adrianludwin |
As discussed offline, at least some of them are there:
https://github.com/kubernetes-sigs/multi-tenancy/tree/hnc-v0.5/incubator/hnc/test/e2e.
You may need to say `make e2e-test` instead of `make test-e2e` IIRC.
…On Fri, Sep 18, 2020 at 9:12 AM Yiqi Gao ***@***.***> wrote:
Given that this is a change onto a stable branch, can you please do a bit
more testing on it? It would be nice to see the full suite of e2e tests at
least. Thanks!
Sure, will do. BTW just confirming that we don't have e2e test checked in
in v0.5, I will have to manually add the e2e test files on my local branch
to test, right? @adrianludwin <https://github.com/adrianludwin>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1115 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE43PZFA5V22X6KN5HJ3HVLSGNMB3ANCNFSM4RQVLATQ>
.
|
Update: among 24 e2e tests, 22 passed and 2 failed:
They both failed on webhook internal error on
Please note other HC changes such as |
INTERNAL_ERROR sounds like it could be a real error - check the logs? All
tests were passing robustly on 0.5.2.
…On Fri, Sep 18, 2020 at 2:42 PM Yiqi Gao ***@***.***> wrote:
Update: among 24 e2e tests, 22 passed and 2 failed:
[Fail] Issues [It] Should remove obsolete conditions CannotPropagateObject and CannotUpdateObject - issue #328
[Fail] Issues [It] Should clear CannotUpdate conditions in descendants when a hierarchy changes - issue #605
They both failed on webhook internal error on kubectl hns set sub-child
--root:
Running: [kubectl hns set sub-child --root]
Output: Unsetting the parent of sub-child (was previously child)
Could not update the hierarchical configuration of sub-child.
Reason: Internal error occurred: failed calling webhook "hierarchyconfigurations.hnc.x-k8s.io": Post https://hnc-webhook-service.hnc-system.svc:443/validate-hnc-x-k8s-io-v1alpha1-hierarchyconfigurations?timeout=30s: stream error: stream ID 13139; INTERNAL_ERROR
Please note other HC changes such as kubectl hns set sub-child --parent
child all passed. I'm looking into the issue now.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1115 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE43PZEOAVR5D62I3SRPQWLSGOS2TANCNFSM4RQVLATQ>
.
|
The bug was found when running e2e tests that the webhook got INTERNAL_ERROR when setting a namespace as root. Fix it by checking if the new parent is nil. If yes, early exit since it's impossible to introduce new naming conflicts. Add a new unit test case for it. The test failed before and passed after the change. Tested by make test and manually.
Update: the e2e test error is fixed and cherry-picked into this PR too. I reran the e2e tests focusing ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
/hold
/assign @rjbez17
This is a larger change (both LoC and behaviour) than I'd usually want to put on a branch, but I think it makes sense. Our current behaviour re conflicting objects is clearly broken and this seems like a reasonable approach. Ideally a couple of us will play around with this before we release 0.5.3 to see if any other issues crop up with it.
@rjbez17 , wdyt?
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adrianludwin, yiqigao217 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
See issue at #1076
The fix should include 4 changes, but we only include the first 2 fixes in this cherry-pick into HNC v0.5.
Tested by
make test