-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding GEP-709: Cross namespace references from Routes #711
Adding GEP-709: Cross namespace references from Routes #711
Conversation
@robscott: GitHub didn't allow me to request PR reviews from the following users: stevesloka. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: robscott The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
b8f0e24
to
044535b
Compare
name: bar | ||
namespace: bar | ||
spec: | ||
from: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend adding a spec.rules[]
, with each rule having a from
and to
.
That allows for defining multiple policies in a single resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think due to the additive nature of this API I'd rather have multiple resources than one large resource, but open to ideas. I'm concerned that a list of lists would become hard to manage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to encourage people to have more, smaller objects rather than less, larger objects as well.
I note that the from
and two
stanzas are slices here though, but we don't have any discussion on how they should be combined. I'll add more here on the structs themselves.
|
||
* Conceptually similar to NetworkPolicy. | ||
* A separate resource enables admins to restrict who can allow cross namespace | ||
references. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No action required.
Do we need any specific commentary on our Security model that might be useful in future when we look back on this GEP? I don't think so. Food for thought for you and others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do update anything, it's probably only to point out something like "the purpose of this is to allow the person who controls the referent object the requirement to accept that reference, explicitly, with the intent of making cross-namespace references more secure by default."
044535b
to
8d7c454
Compare
} | ||
|
||
// ReferencePolicyTo describes trusted kinds. | ||
type ReferencePolicyTo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a policy per-service will likely be a bottleneck in some scenarios. For example, if we have an egress proxy, and have an external-namespace
. I probably just want to say from: egress-gateway, to: any service in external-namespace
. I am not sure that is re-presentable here, without 1 ReferencePolicy per service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly would be from: anywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both to
and from
structs don't have names. This just represents trust from resources of type a in namespace foo to resources of type b within the same namespace as the ReferencePolicy. So in this case I think the to: any service in external-namespace
is quite straightforward.
On the other hand from: anywhere
is intentionally not allowed. I'm not entirely convinced it's a use case we want to encourage, but if it were, I think using a namespace selector instead of a string would be the appropriate way to do it. That does feel like a potential extension point, but I'm pretty hesitant to start with it unless there are compelling use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think one of the key parts about this proposal is that it allows access from a Group/Kind and namespace to a Group/Kind and namespace (where the to
namespace is always "the namespace the object is in".) I think that if we want to make it more flexible than "named namespace", it should be a label selector across namespaces only.
As it stands, we can do "Allow from a group of things in a list of namespaces" using these structs, since from
is a list.
### Benefits | ||
|
||
* Conceptually similar to NetworkPolicy. | ||
* A separate resource enables admins to restrict who can allow cross namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example of when this would actually be useful? I am having a hard time thinking of a scenario where I would want to disallow a namespace from exposing themselves to other namespaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is conceptually similar to organizations who want to restrict the namespaces that can create Services of type LB and use quotas to achieve that. It's possible that you want users to be able to create Services without allowing them to expose them externally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use case for not allowing things to expose themselves to other namespaces is any cluster where a namespace admin for one namespace is not totally trusted. In that case, allowing inbound references is a critical step in privilege escalation (another reason to only allow this interaction in one direction).
For example, I've worked before on CI clusters, where each namespace ran a step in a CI pipeline. Now, CI pipelines are essentaily remote-code-execution-as-a-service, so those workloads are very untrusted. Would someone here be able to create a Service record? Possibly, and if so, it's almost certainly good to ensure that they cannot create cross-namespace references to their namespace.
This is a little contrived, I know, but I think the principle of "relatively untrusted namespace owner" is a pretty sound usecase for this overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I am not sure I understand. Can you let me know where I am going wrong? This is my understanding:
Lets say we have a namespace ingress
, which contains a Gateway. We have a namespace untrusted
, which has a Service.
I think the statement here is we are trying to avoid letting untrusted
expose their service through ingress
' LoadBalancer. We are NOT trying to restrict untrusted
from creating an HTTPRoute though (that is independent of this and driven by Gateway/Route selection agreement); rather, we are trying to restrict ingress
from creating an HTTPRoute referencing untrusted
.
So by restricting untrusted
from creating a ReferencePolicy, we think we have achieved this. In reality, the ingress
namespace can just point to something in its own namespace that forwards to untrusted
(there are many ways to do this, for simplicity lets just say they set up a dumb TCP proxy that forwards all requests to untrusted
). As a result, I would say that the use of restrictive RBAC policy on ReferencePolicy does not achieve the goal of not allowing a Service to expose itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of the ReferencePolicy is to require the owner of the untrusted
namespace to accept incoming references. If they can't create ReferencePolicy objects, then they can't accept incoming references, and so say, the admin can create a more-open HTTPRoute that may reference a lot of Services, but only allow some to complete the two-way handshake.
I agree that it's a bit convoluted, but the idea here is to put the control over incoming references into the hands of the person who wons the referent. The main advantage of doing this with a separate object is that it allows us to apply this pattern to things whose spec we can't change (like Service), while still keeping that control in the hands of the object's owner. I think that the RBAC advantage of having a separate object is a minor one at best compared to that, but it may prove useful to people who are extremely careful with their RBAC policies.
I probably overemphasised this usecase previously, sorry about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of the ReferencePolicy is to require the owner of the untrusted namespace to accept incoming references. If they can't create ReferencePolicy objects, then they can't accept incoming references, and so say, the admin can create a more-open HTTPRoute that may reference a lot of Services, but only allow some to complete the two-way handshake.
I think I understand what you are discussing now, but I don't think its a good idea. The fact its possible to do this with the current design is no problem - obviously we cannot stop users from doing obscure things, but I wouldn't want to recommend this.
Instead of the admin "handshaking" every namespace, but then restricting their ability to handshake back, it seems like they should just not "handshake" in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be getting a bit side tracked here though 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I think you make a fair point, that it's a pretty unlikely scenario that we already provide betters tools to deal with. But I think that having the ability to RBAC the ReferencePolicy object separately is an advantage, just not that big of one.
site-src/geps/gep-709.md
Outdated
cross-namespace references. | ||
* The implementation clearly documents that ReferencePolicy is not honored. | ||
|
||
This exception is very unlikely to apply to any ingress implementations of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally agree, but this could be plausible in a single-tenant ingress. For example ingress
namespace has all Gateways and Routes for the ingress. Some routes reference other namespace. There is no security risk here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is there no security risk in that scenario? Would it be possible the implementation to know that it was always going to be deployed in that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing this a bit more, I think there are some instances where ingress implementations deployed in this way could potentially deserve an exception. Fundamentally the distinction is if the implementation would be subject to some other cross-namespace restrictions such as NetworkPolicy. I'll update this section to be a bit more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested a wording update here, I think that if we make that change, it covers this usecase already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely reworded this section, PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this API, and think it's a great evolution of the earlier work.
name: bar | ||
namespace: bar | ||
spec: | ||
from: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to encourage people to have more, smaller objects rather than less, larger objects as well.
I note that the from
and two
stanzas are slices here though, but we don't have any discussion on how they should be combined. I'll add more here on the structs themselves.
} | ||
|
||
// ReferencePolicyTo describes trusted kinds. | ||
type ReferencePolicyTo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think one of the key parts about this proposal is that it allows access from a Group/Kind and namespace to a Group/Kind and namespace (where the to
namespace is always "the namespace the object is in".) I think that if we want to make it more flexible than "named namespace", it should be a label selector across namespaces only.
As it stands, we can do "Allow from a group of things in a list of namespaces" using these structs, since from
is a list.
// | ||
// +kubebuilder:validation:MinLength=1 | ||
// +kubebuilder:validation:MaxLength=253 | ||
Group string `json:"group"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because all three of these values are required, each from
reference must fully specify a group, kind, and namespace. This is nice for implementers because it's very straightforward, but will this be flexible enough, or should we consider label selectors here?
The downside of including label selectors is that you then need two sets of fields that are both optional, so the zero values of the fields and slices of this struct really matter. This way is much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been tempted to introduce selectors here but so far have avoided it due to the increased complexity they add. In v1alpha1 I think we have had a few too many selectors and it became difficult to keep track of everything. Direct references can still be confusing, but I think they're generally easier to understand. One of my primary concerns with selectors is that their default is so open. Interpreted literally, an empty selector would accept references from all namespaces.
So all of that to say that I agree with you. If we really need selectors, we can always consider adding them in the future.
### Benefits | ||
|
||
* Conceptually similar to NetworkPolicy. | ||
* A separate resource enables admins to restrict who can allow cross namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use case for not allowing things to expose themselves to other namespaces is any cluster where a namespace admin for one namespace is not totally trusted. In that case, allowing inbound references is a critical step in privilege escalation (another reason to only allow this interaction in one direction).
For example, I've worked before on CI clusters, where each namespace ran a step in a CI pipeline. Now, CI pipelines are essentaily remote-code-execution-as-a-service, so those workloads are very untrusted. Would someone here be able to create a Service record? Possibly, and if so, it's almost certainly good to ensure that they cannot create cross-namespace references to their namespace.
This is a little contrived, I know, but I think the principle of "relatively untrusted namespace owner" is a pretty sound usecase for this overall.
|
||
* Conceptually similar to NetworkPolicy. | ||
* A separate resource enables admins to restrict who can allow cross namespace | ||
references. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do update anything, it's probably only to point out something like "the purpose of this is to allow the person who controls the referent object the requirement to accept that reference, explicitly, with the intent of making cross-namespace references more secure by default."
site-src/geps/gep-709.md
Outdated
ReferencePolicy. This should only be done if: | ||
* Other mechanisms like NetworkPolicy can be used to effectively limit | ||
cross-namespace references. | ||
* The implementation clearly documents that ReferencePolicy is not honored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small change to make this clearer:
ReferencePolicy. This should only be done if: | |
* Other mechanisms like NetworkPolicy can be used to effectively limit | |
cross-namespace references. | |
* The implementation clearly documents that ReferencePolicy is not honored. | |
ReferencePolicy. This may only be done if: | |
* Other mechanisms like NetworkPolicy are used to effectively limit | |
cross-namespace references. | |
* The implementation must clearly document that ReferencePolicy is not honored. |
I think that "may" is okay here because we are following it up with clarifying "must" clauses straight afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAY, MUST, SHOULD should be capitalized so it's obvious we are using them in LOUD RFC TALK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a great way to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the CVE is live, I've reworked this entire section, including "LOUD RFC TALK" :), PTAL.
site-src/geps/gep-709.md
Outdated
cross-namespace references. | ||
* The implementation clearly documents that ReferencePolicy is not honored. | ||
|
||
This exception is very unlikely to apply to any ingress implementations of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested a wording update here, I think that if we make that change, it covers this usecase already.
site-src/geps/gep-709.md
Outdated
|
||
## TLDR | ||
|
||
This GEP attempts to tackle both cross namespace forwarding and route inclusion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like we are actually defining route inclusion in this GEP.
We should probably say that this is: prepare the API for route inclusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW if there are references, it would be good to include to define what route inclusion means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: bar | ||
namespace: bar | ||
--- | ||
kind: ReferencePolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to call this "InboundReferencePolicy" or something make the direction obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. Any thoughts on this name change @youngnick @hbagdi @jpeach @howardjohn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with it, although it's a little long. I had discounted it before because I don't see us adding an OutboundReferencePolicy
at any point, so I didn't think the disambiguation was necessary. But I don't object if others think it makes things clearer, I'm a little deep into this design to be able to have a good idea of how the name looks at first glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ReferencePolicy is fine. (per discussion)
* Issue URL: https://github.com/kubernetes-sigs/gateway-api/issues/709 | ||
* Status: Implementable | ||
|
||
## TLDR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we say very explicitly what fields and kinds are intended to be covered as part of "core" behavior in this proposal?
Also do we say that implementations may cover more Kinds and fields, but that is Custom behavior and there is no conformance there. (is this a good idea?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this to the proposed spec. I think non-core kinds would fall under extended conformance since the expected behavior is clear, it's just a difference of which types we expect all implementations to support.
Thanks for all the great feedback on this! I think I've responded to all the feedback now, let me know if I missed anything. I just pushed a round of updates that should get this closer. It also includes a reference to the relevant CVE now: kubernetes/kubernetes#103675. |
|
||
## ReferencePolicy | ||
|
||
Anytime we allow crossing a namespace boundary, we need to be very cautious. To |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[very cautious](cite CVE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I'll leave it up to others to make the call as to if InboundReferencePolicy
or ReferencePolicy
is clearer. I don't have a strong enough opinion to have it affect my call. :)
uh, I didn't think my lgtm would be enough to merge this, oops. Sorry. |
doh - rob you forgot to hold it |
No worries, that's on me, meant to add a hold on these, will open another one with next round of changes. |
I think we were just down to the name of the object anyway, pretty much. |
Yeah agreed, think this one was pretty close |
What type of PR is this?
/kind gep
What this PR does / why we need it:
This adds GEP #709 as the culmination of discussion around ReferencePolicy, cross namespace forwarding, and route inclusion.
Does this PR introduce a user-facing change?:
/cc @youngnick @stevesloka @hbagdi @jpeach @danehans