-
Notifications
You must be signed in to change notification settings - Fork 490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIG-NETWORK v1alpha2 Review #780
SIG-NETWORK v1alpha2 Review #780
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: robscott The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
// | ||
// +kubebuilder:validation:MinLength=1 | ||
// +kubebuilder:validation:MaxLength=253 | ||
GatewayClassName string `json:"gatewayClassName"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a direct copy from Ingress API. Since GatewayClass is a cluster-scoped resource, the name should theoretically be unique within a cluster so I don't think duplication would be an issue. What would object name look like here?
// Support: Implementation-specific | ||
// | ||
// +optional | ||
Options map[string]string `json:"options,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the need for this. But we need to be very clear on how this field is validated. max key count. max key length. max value length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added max key count in #772, I think the only way to accomplish max key or value length is to use a custom type (which we probably need to do).
// | ||
// +optional | ||
// +kubebuilder:validation:MaxItems=8 | ||
Kinds []RouteGroupKind `json:"kinds,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So all routes defined in the target namespace granted that matches the namespace specification same/selector etc
will be automatically bound to this listener?
Is this how we expect users to compartmentalize routes.. by namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Gateway side we're just describing the kinds/namespaces of Routes of that may be attached to a Gateway. If nothing is specified, Routes in the same namespace with a matching protocol are trusted to attach to the Gateway. So this is all about defining where Routes can attach from.
The primary attachment mechanism is from Routes - they directly reference the Gateways they want to attach to. A Route cannot be attached to a Gateway without directly referencing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is functional. It is not what usually people expect. The current un-written-rules is using selectors (and netpol introduced namespace selectors). usually the parent selects children. Not the other way around. So this needs to be super crisp to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, agree with that. We did start with a selector from Gateways to Routes. This is a major change in v1alpha2. The fundamental issue with selectors is that they gave Gateway admins a false sense of control. Even when they specified a Route selector, it was still the Route owners that had control of Gateways they were attached to, simply by adjusting the labels on their Routes. The only true filtering control Gateway admins had is still in place namespaces, route kinds, and hostnames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's more info/rationale in the GEP that proposed this change: https://gateway-api.sigs.k8s.io/geps/gep-724/.
// Description helps describe a GatewayClass with more details. | ||
// | ||
// | ||
// +kubebuilder:validation:MaxLength=64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more details in just 64 char?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this could likely be longer. We wanted it to be something that could at least theoretically be displayed with kubectl wide table output which limits the possible length. Most decisions around validation length have so far been taken with the idea that we can increase minimums in the future much easier than we can decrease them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any similar precedent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @jpeach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any similar precedent?
Not that I know of. It would of course be trivial to port back to IngressClass if there were interest.
// same port, subject to the Listener compatibility rules. | ||
// | ||
// Support: Core | ||
Port PortNumber `json:"port"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm that (along with .HostName) limits the entire stack to L7 style proto. Is this a non-goal?
What if i want a gateway for IP load balancing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, are you suggesting that we should have the ability to listen on all ports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am asking if the scope of the api is L7 protocols only?
Let us say i have one of those industry standards protos. But they are not as common as HTTP. maybe MQTT or similar. I want to build an ingress/reverse proxy for. Or to put more simply. I want to write some sort of exotic packet processing and route based on some logic. will i be able to use this apis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're hoping that yes, this should be extendable for other APIs, but we are focussing right now on L7 HTTP use cases, with some thought put to L4 TCP and UDP use cases, along with the does-not-fit-cleanly-into-OSI-layer TLS use cases.
I'm in the process of doing an update to the concepts doc that may make this clearer, here is the relevant bit that's going to end up in there:
- HTTPRoute is for multiplexing HTTP or terminated HTTPS connections onto a single port or set of ports. It's intended for use in cases where you want to inspect the HTTP stream and use HTTP-level data for either routing or modification. (An example here is using HTTP Headers for routing, or modifying them in-flight).
- TLSRoute is for multiplexing TLS connections, discriminated via SNI, onto a single port or set or ports. It's intended for where you want to use the SNI as the main routing method, and are not interested in properties of the underlying connection (such as whether or not it is HTTP).
- TCPRoute (and UDPRoute) are intended for use as a mapping between a single port or set of ports and a single backend. In this case, there is no discriminator you can use to choose different backends on the same port, so each TCPRoute really needs a different port on the listener (in general, anyway).
This is what my thinking is here, anyway. In effect, in the order listed, the objects are different in terms of how deeply into the OSI layers of the connection they look for routing information. TCPRoute and UDPRoute consider only basic OSI layer 4 information, TLSRoute uses properties of the TLS connection only, and HTTPRoute uses all the panoply of properties that HTTP has available.
While I can totally see that a Gateway could be used to describe an IPRoute, GRETunnelRoute, or something like that, we have put zero thought into the design for those things so far.
// conformance. | ||
type ProtocolType string | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably why it is so confusing. To me, that is. We are mixing L7 Protos with protocol semantics such as HTTP vs HTTPs vs TLS
which is a semantic for both TCP and HTTP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed at community meeting, @bowei will work on clarifying docs here. Maybe ListenerType makes more sense than ProtocolType here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ProtocolType
here isn't strictly layered (if you go down that path, you end up with a lot of complexity and semantic wrangling). Think of it as a name for protocol at the layer that you want to deal with it on. For example, "HTTPS", instead of "IPV{4,6} + {TCP,UDP} + {HTTP/1.1,HTTP/2,HTTP/3}".
But because ProtocolType
is not strict, implementations are also free to add their own types, with few constraints about what those types ought to mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This covers validation part only. Turning my attention to apis, again. Stay tuned should finish today.
apis/v1alpha2/tcproute_types.go
Outdated
// If ties still exist across multiple Routes, matching precedence MUST be | ||
// determined in order of the following criteria, continuing on ties: | ||
// | ||
// * The oldest Route based on creation timestamp. For example, a Route with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So a user who want to change the precdeance must delete the current older
then create a new rule then add the older
deleted rule? if so, then this is not a good a UX. Let us not infer precedence. User can add weight (if spec then we need to resolve a situation when two rules have the same weight) or even a map[weight] match
where duplicate weight can not happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is really not a situation we want to ever end up in. The only reason you should end up in this situation is because of some kind of unfortunate mistake. In that case, we were left with 2 options:
- Try to determine the Route that was attached to the Gateway first. Although this would be ideal, it requires statefulness and this knowledge could easily be lost if a controller restarted.
- Consider this an error state and not support either Route until it was resolved.
- Assume the older Route was attached first. This is not always going to be true, but the behavior is consistent, and it generally fulfills the goal of "prefer not to break things that are working".
If we were to let a user add weight here, how would we do that? Would it have to be centralized on the Gateway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason you should end up in this situation is because of some kind of unfortunate mistake
Is it guaranteed to be a mistake rather than an attack? ie, is it required that Gateways not be shared among users who do not trust each other? (If so, that should be documented as part of ListenerRoutes
.)
If you have to deal with attacks, then I think you need tighter semantics.
Try to determine the Route that was attached to the Gateway first. Although this would be ideal, it requires statefulness and this knowledge could easily be lost if a controller restarted.
ugh, yeah, I missed the fact that "oldest Route" doesn't necessarily mean "first-attached Route" before...
I think "oldest Route" is ok for the "mistake" case, but not for the "attack" case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is a good point, we need to document this better. I think the most common scenario for multi namespace Gateways is still going to involve single-namespace listeners. So a foo.com listener trusts routes from the foo namespace and a bar.com listener trusts routes from the bar namespace.
// Value is the value of HTTP Header to be matched. | ||
// | ||
// +kubebuilder:validation:MinLength=1 | ||
// +kubebuilder:validation:MaxLength=4096 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heads up: HTTP spec does not define max length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this was a rather arbitrary value based on the goal of having no unbounded fields. I'm hoping this is not going to be an issue for >99.9% of users, but it's always easier to relax validation in the future than make it stricter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On another note, 4096 seems excessive. Note that we only do matching based on these headers. 4096 seems to shoot way past the practical limit if not the technical limit.
// same port, subject to the Listener compatibility rules. | ||
// | ||
// Support: Core | ||
Port PortNumber `json:"port"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am asking if the scope of the api is L7 protocols only?
Let us say i have one of those industry standards protos. But they are not as common as HTTP. maybe MQTT or similar. I want to build an ingress/reverse proxy for. Or to put more simply. I want to write some sort of exotic packet processing and route based on some logic. will i be able to use this apis?
// group of Listeners into a single Listener if the implementation | ||
// determines that the Listeners in the group are "compatible". An | ||
// implementation MAY also group together and collapse compatible | ||
// Listeners belonging to different Gateways. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation of Listeners
is very confusing. For instance, you introduce the possibility of grouping Listeners by Port here, but you don't ever explicitly state why an implementation would or wouldn't want to do that. (You never actually say that you can't have two Listeners on the same Port if you don't do grouping.)
Then you talk about "collaps[ing] each group of Listeners into a single Listener", which, the first time I read it, sounded like you meant that the implementation might patch the Gateway
object itself. Except that you can't actually collapse multiple Listener
s into a single Listener
anyway, because a single Listener
can only have one Name
... it's mixing up API concepts with abstract model/internal implementation concepts...
The line "Each listener in a Gateway must have a unique combination of Hostname, Port, and Protocol" from the Listener
documentation belongs here, not there, because it's a fact about the Listeners
field, not a fact about individual Listener
objects.
Then having said that, you can go on to explain the conditions under which you're allowed to have two Listeners with the same Port, and what that means / how it's interpreted. ie, you can only have multiple listeners on the same Port
if it would be possible for the implementation to distinguish them by Name
and/or Protocol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle, the conditions under which listeners can be combined into a single proxy configuration are implementation-defined. We do give guidance just below here on how to determine listener compatibility:
Either each Listener within the group specifies the “HTTP” Protocol or each Listener within the group specifies either the “HTTPS” or “TLS” Protocol.
Each Listener within the group specifies a Hostname that is unique within the group.
As a special case, one Listener within a group may omit Hostname, in which case this Listener matches when no other Listener matches.
Multiple listeners on the same port must be distinguished by hostname (the mechanism depends on the protocol).
If the implementation does collapse compatible Listeners, the hostname provided in the incoming client request MUST be matched to a Listener to find the correct set of Routes. The incoming hostname MUST be matched using the Hostname field for each Listener in order of most to least specific. That is, exact matches must be processed before wildcard matches.
// filters are specified that would result in a response being sent, the | ||
// underlying implementation must actively reject request attempts to this | ||
// backend, by rejecting the connection or returning a 503 status code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, copied from HTTP routes and not rewritten: "filters", "503 status code".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rationale was the TLSRoute could be used in conjunction with TCP or HTTP here, thus "rejecting the connection" or "returning a 503 status code" respectively. @howardjohn can probably confirm if I got that rationale correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review
// same port, subject to the Listener compatibility rules. | ||
// | ||
// Support: Core | ||
Port PortNumber `json:"port"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed a bit:
We need to allow some form of "all ports" support. Suggest trying this as
Port Ports
type Ports struct {
// One of...
Port *PortNum
AllPorts *bool
}
Just leave room for all ports. Or ranges. Or who knows what else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - should this be required? Why can't I leave it blank and let the impl fill it in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The port could default for "HTTP" and "HTTPS" protocols, but not for others. Would it be strange if this was only defaulted sometimes?
// Support: Extended | ||
// | ||
// +optional | ||
// +kubebuilder:default=IPAddress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the value not need quotes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial. I made it through gateway_types.go
I'm keen to see how LITTLE we can demand people specify. The listeners seem easy to infer for many (e.g. cloud) LB impls.
// | ||
// Known condition types are: | ||
// | ||
// * "Scheduled" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not Pending? That seems to be used in other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll defer to @jpeach on this one. To clarify, are you suggesting we add a new "Pending" condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the original proposal this was "Pending" with negative polarity. The subsequent WG discussion reached consensus on Scheduled with positive polarity for this one.
// same port, subject to the Listener compatibility rules. | ||
// | ||
// Support: Core | ||
Port PortNumber `json:"port"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - should this be required? Why can't I leave it blank and let the impl fill it in?
// applied at both the TLS and HTTP protocol layers. | ||
// | ||
// Support: Core | ||
Protocol ProtocolType `json:"protocol"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't I leave this blank and let the impl fill it in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most implementations will support multiple protocols, and in many cases it will be ambiguous which protocol should be used if it is not user-specified. For example, should the interpreted protocol start as HTTP if certRef is empty and then change to HTTPS when a cert is provided? For L7 it's likely sufficient to assume protocols if port is 80 or 443, but what if it's not? For L4 implementations, I think it would be especially difficult to determine the appropriate protocol. Requiring protocol to be specified also enables us to provide much clearer validation.
I think Gateway is already a pretty simple resource (only name, protocol, and port are required for each listener), I'm not sure there's sufficient value to make this optional.
// Controllers may raise this condition with other reasons, | ||
// but should prefer to use the reasons listed above to improve | ||
// interoperability. | ||
ListenerConditionConflicted ListenerConditionType = "Conflicted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Conflicted" sort of anthropomorphizes the gateway. Should it go to prom with Alice or with Becky? It's so conflicted...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha condition names are tough. I think this is an important condition to represent, but we couldn't find a better/clearer name to represent this state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made it thru gatewayclass_types.go
// Description helps describe a GatewayClass with more details. | ||
// | ||
// | ||
// +kubebuilder:validation:MaxLength=64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any similar precedent?
apis/v1alpha2/gatewayclass_types.go
Outdated
// | ||
// This condition defaults to False, and MUST be set by a controller when it | ||
// sees a GatewayClass using its controller string. The status of this | ||
// condition MUST be set to true if the controller will support provisioning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this FOR? Can I really make any decisions based on this? What happens if the controller is unloaded from the cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that this is only really useful when the controller is running. I think that same argument is true for most of status fields though. I do think it's useful to know that a controller has actually picked up a GatewayClass you've created.
Variety of fixes in response to feedback on #780
apis/v1alpha2/gateway_types.go
Outdated
// Support: Implementation-specific (Other resource types) | ||
// | ||
// +optional | ||
CertificateRef *SecretObjectReference `json:"certificateRef,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked following on slack but wanted to add here too: Moving TLS certificate configuration from HTTPRoute
to Gateway
seems to disallow the self-service use case, where application developer manages the TLS certificates. This use case that existed in v1alpha1
, and exists also in Ingress
. It would imply that defining a secured virtualhost becomes a split responsibility between application developer and cluster operator.
Over at slack, we discussed that there could be a controller that automatically constructs listeners
with TLS secret references from application developer, but I guess the self-service use case would not be part of the Gateway API then anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the end of the day, including TLS details in HTTPRoute was always just a shortcut way of bringing the TLS config to the Listener anyway, and it introduced a lot of complexity into every implementation of the API. We discussed this on the community meeting and agreed that it's better to separate the complexity of ensuring that the right secrets are bound to the right hostnames on the right ports into a separate bundle of code (the controller you mentioned).
This means that yes, there will not be full support for the self-service use case in the base API. But we really think that doing it with a separate controller allows for:
- keeping complexity separate and only for people who need it
- the security concerns about dynamically mapping certificates to a host can be managed in that controller as well (and there are quite a few).
Does this mean that it is not and will never be part of the API? No. If there's enough demand for this, could the dynamic controller be developed here? Yes. Do I think that's likely? No.
My personal opinion (not that of the project) is that the self-service use case is too magic, and a problem waiting to happen, just like it is in Ingress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @youngnick, I can understand it is cleaner like this and it matches the reality of implementations. But operational-wise, I would have expected the self-service use case to be a common one for multi-tenant clusters and imagined that involving cluster-admin (to secure virtualhost provisioning) could be a problem. I'd be curious to know how the certificate provisioning would be handled in e.g. OpenShift if it were to adopt Gateway API for multi-tenant scenario, maybe @danehans has insights?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this PR is already massive and slow, I'd suggest redirecting this discussion to our related tracking issue: #763. I'll continue this thread there.
5bb78f1
to
1fd1953
Compare
I think sig-network review is effectively complete at this point, so I'm going to close this out. Let me know if I've missed anything. |
What type of PR is this?
/kind api-change
What this PR does / why we need it:
This is to facilitate the sig-network review of v1alpha2 (#716). It represents a review of the entirety of the
apis/v1alpha2
directory from master at this point in time. Updates to the master branch will require manual updates to this PR as well. I've tried alternatives like .gitignore and .gitattributes that would not require rebasing (robscott#1), but none have worked well for me.References:
/hold
/cc @andrewsykim @danwinship @khenidak @thockin