Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand and clarify Listeners definition #2288

Merged
merged 9 commits into from
Sep 25, 2023

Conversation

rainest
Copy link
Contributor

@rainest rainest commented Aug 11, 2023

What type of PR is this?

/kind documentation

What this PR does / why we need it:

From the attached issue, Listeners was unclear regarding what "compatible" meant, in particular that it was more whether the implementation could serve that Listener set on the same addresses while being able to match inbound requests to a single Listener only.

Updates the GatewaySpec.Listeners documentation to:

  • Separate the definition of and examples for "compatible". Previously "compatible" wasn't clearly defined: we said what implmentations could do with compatible Listeners and provided an example that might be compatible depending on implmentation capabilities. The definition wasn't formal, and was only implied by the surrounding language.
  • Move the "same port, different hostname" rule into the definition of "compatible".
  • Reference the Addresses MUST in the "compatible" definition.
  • Explicitly state that the examples are not normative and may not be compatible depending on implementation capabilities.
  • Split the HTTP and HTTPS/TLS examples and add a mixed TCP/UDP example.
  • Reword the Hostname matching MUST language to be specific to Hostname matching. Previously, it applied to all same-Port Listeners, and a narrow and pedantic interpretation would preclude TCP and UDP Listeners on the same port unless they used hostname matching, which they can't (because their routes don't have hostnames).
  • Expand the Condition usage to apply to all compatibility cases (instead of just same port) and add language about the Gateway-level Condition.
  • Use "collapsing" language only in reference to using the same address set across Gateways. Collapsing Listeners within a Gateway didn't really make sense, as those Listeners remained distinct in configuration and status. For example, an HTTPS Listener and TLS Listener on the same port weren't collapsed in any meaningful sense--HTTPRoutes and TLSRoutes would still contribute to the appropriate Listener's attachedRoutes separately, not to a hypothetical single combined Listener.

There were two remaining areas of confusion that I didn't have perfect answers to. I've covered those in a separate comment.

Which issue(s) this PR fixes:

Fixes #2274

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/documentation Categorizes issue or PR as related to documentation. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 11, 2023
@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 11, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @rainest. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rainest
Copy link
Contributor Author

rainest commented Aug 11, 2023

I found two areas during self-review that probably warrant some further discussion:

Partial acceptance decision authority is unclear

Current language for ListenersNotValid indicates that implementations can accept Gateways with problem Listeners. This implies implementations can accept a subset of Listeners that do not have issues or reject the Gateway entirely, but does not indicate which Listener subset implementations should accept.

The changes in this PR indicate that implementations can accept Gateways with some incompatible Listeners, but do not say which Listeners to accept, leaving it open to "partial Listener set that contains no incompatible Listeners".

This feels intentionally ambiguous to a degree. I'm not sure if we'd want to clarify further which cases should result in total rejection. If we don't, should we explicitly state that the decision of whether to fully or partially reject is up to the implementation?

New language broadens compatible Listeners in weird ways

The original language was generally narrower and could be read as "Listeners can only share ports if they use separate hostnames", with an implied separation between TCP and UDP ports.

The new language still mentions same-Port Hostname separation, with the caveat that this is allowed if the implementation can still successfully discriminate. This tries to avoid a generic broad "if you can consistently match a request to exactly one Listener, Listeners are compatible" rule, since in practice we do expect most same Port (and same transport, although that's not explicit anywhere) Listeners to use Hostname to match requests, as AFAIK there aren't other Listener fields you can use for that purpose.

That language does allow for HTTP, HTTPS, TCP, and TLS Listeners with the same Port and Hostname if you can successfully discriminate between requests to each, which the old language arguably did not. DPI-based implementations can probably do this, and while I'd probably advise against it in practice, I don't know that we'd have a reason for the spec to limit such configurations--maybe that it would allow Gateways that are only fully acceptable to a very narrow set of implementations.

@shaneutt shaneutt added this to the v1.0.0 milestone Aug 11, 2023
@shaneutt shaneutt added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 11, 2023
Explicitly define Listener compatibility as definitively routable and
allowed on the same address by the implementation.

Narrow language about collapsing compatible Listeners to mean assigning
multiple Gateways the same addresses if their Listeners are compatible,

Expand Listeners examples to include mixed TCP and UDP and indicate that
they are not compatible if an implementation does not support that
combination.

Add language about Gateway-level conditions for incompatible Listeners.
// For example, the following Listener scenarios may be compatible
// depending on implementation capabilities:
//
// 1. Multiple Listeners with the same Port that all use the "HTTP"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this example is a core feature

I wonder if "may be compatible" is applicable to it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compatibility is weird because it ends up being about implementation capabilities rather than rules in the spec. If your implementation isn't compatible with something the core requires, it just can't implement the spec.

We could bring this further out of core by making it something outside the language above and mentioning specific ports, e.g. "Multiple Listeners with Port "9999" that all use the "HTTP" protocol (and similar for the other examples, just to make them consistent).

We could also separate examples that are within core or outside it, but it wouldn't be great for brevity.

Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rainest!

Comment on lines 88 to 95
// A Gateway's Listeners are considered "compatible" if:
//
// 1. The implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned
// addresses.
// 2. No Listeners sharing the same Port share the same Hostname value,
// including the empty value, if this would prevent the implementation
// from matching an inbound request to a specific Listener.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One advantage of the previous wording here was that it was an example of what an implementation might consider compatible. In this case, the wording might prevent support for multiple protocols on the same port, which at least some vendors can support. (I don't think this explicitly prevents it, but it's not obvious if it would be compatible).

Copy link
Contributor Author

@rainest rainest Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not think the examples below cover that? They do intentionally include that case and indicate these are compatible if you can support it.

From the second point in my earlier self-review, would we want to instead use broader, but more direct language? "The implementation can unambiguously match a request to a single Listener." instead, for example. That's ultimately what we're going for, but quite abstract, so maybe harder to understand.

There is Hostname-specific language later, so we can omit that here if we want:

Implementations using the Hostname value to select between same-Port Listeners MUST match inbound request hostnames from the most specific to least specific...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading over this again, it may be better to introduce the concept of distinct first, and then use that as the basis for compatible - I think that it makes what we're doing easier to understand.

Suggested change
// A Gateway's Listeners are considered "compatible" if:
//
// 1. The implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned
// addresses.
// 2. No Listeners sharing the same Port share the same Hostname value,
// including the empty value, if this would prevent the implementation
// from matching an inbound request to a specific Listener.
// A Gateway's Listeners must be _distinct_.
//
// The key information that is used to determine if a set of Listeners is distinct
// differs based on if the protocol considers the `hostname` field relevant.
//
// For protocols `HTTP`, `HTTPS`, and `TLS`, Listeners must be distinct in terms of
// having a unique combination of `port`, `protocol`, and `hostname`. The empty
// hostname is also a unique value, so there can only be one Listener with an
// empty hostname in a set of Listeners when using these protocols.
// (This Listener then functions as a fallback, where any undefined hostname will
// match).
//
// For protocols `TCP` and `UDP`, Listeners must be distinct in terms of having
// a unique combination of `port` and `protocol` only. Hostname fields in these
// Listeners must be ignored.
//
// Implementation-specific protocols MAY choose if `hostname` is relevant to their
// protocol or not, and use either of these rules on that basis.
//
// A Gateway's Listeners are considered "compatible" if all the Listeners are distinct
// _and_ the implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned addresses.

I think that this sort of language will make the later changes easier as well, in that it makes it more clear why combinations may be incompatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we indeed want to codify specific compatibility cases or do we want to leave most of the extended space open to implementation capabilities? If we do want to codify the hostname/port/protocol relationships, we'd use something like @youngnick's suggestion, dispense with the separate examples, and just add "unless the implementation cannot serve otherwise compatible Listeners on the same addresses" as a final caveat.

I like the broad "what the implementation can route" version for its brevity, and think the select non-normative examples after ground that rule in more practical terms, without having to exhaustively list all cases.

Exhaustively listing compatible cases feels like it'd be more difficult to expand. We have to cover an N-dimensional matrix with complex rules (sometimes a dimension isn't relevant), so adding a new Listener field would create lots of cases.

We're hindered somewhat by needing to place the spec in struct comments--the more detailed rules would probably fit better in a breakout page, but AFAIK we don't use those in the reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I think this is a case where it's better to be more specific for the protocols defined in the spec - I tried to leave some space for the use of implementation-specific protocols, but I don't anticipate us adding many more protocol options, or other options like hostname.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to write an exhaustive set of rules from the angle of Listener properties is proving quite tricky. For example,

For protocols HTTP, HTTPS, and TLS, Listeners must be distinct in terms of having a unique combination of port, protocol, and hostname.

probably breaks when you have HTTP:443:example.com and HTTPS:443:example.com. Those are unique (they differ by protocol) but I expect most implementations couldn't actually handle it, and it'd violate a rule (insofar as the example is halfway a rule) from the original:

Either each Listener [using the same port] specifies the “HTTP” Protocol or each Listener within the group specifies either the “HTTPS” or “TLS” Protocol.

Trying to hew closer to the original and focus around sharing Port values gets confusing because transport (or the use of TLS) is implied but not part of the spec.

Ultimately, do we have goals for compatibility other than ensuring that an implementation can serve Listeners on the same address and match a request to exactly one Listener? If we do, and expect that compatibility will vary across implementations anyway, do we have a reason not to make that the actual rule?

Again, I understand the value of having practical examples--that's why this revision still has them, after this section. Do those not provide the clarity around common cases we expect to see?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the example you gave makes sense, but I think that in the suggestion I made, that comes under the "implementations couldn't actually serve this" rule.

You're right that, at its heart, the rule is "implementations can serve all Listeners on the same address, and match a request to exactly one Listener". We can make that the rule, but I think that without something like what I wrote, we will end up with big incompatibilities between implementations. Just having examples is, in my experience, not enough.

To put this another way, for the included Protocols, I think that we should be as specific as we can manage, with a rule like "if you can't serve a distinct set of listeners, then it's not compatible" to cover the other edge cases. The idea behind adding "distinct" as a thing is to try to make it clear what we mean by "match a request to only one Listener" - because it really needs to be "match a request to only one Listener, with no re-entry to processing if that Listener's Routes don't work out." (Since that's what started this conversation to begin with).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Status quo is that we already only have examples (and effectively only one at that).

What incompatibilities do we expect in practice, and how bad will those be?

I grant that lack of upper bounds may allow some implementation to say "we do deep packet inspection, you can have HTTP, HTTPS, and TLS Listeners with the same Hostname, and a no-Hostname TCP Listener all on the same Port", and that this won't be portable. I think that's fine so long as there's a defined way for an implementation to report that they can't support that configuration, and expect that demand for it isn't so great that mixed support isn't going to break the ecosystem.

If there are areas where we're concerned about compatibility, we should be adding more SHOULD lower bounds after the existing MUSTs.

I could write something to the effect of:

+       // HTTPS and TLS Listeners MAY be compatible with each other if they share
+       // the same Port, They MUST NOT be compatible with each other if they share
+       // the same Port and same Hostname.
+       //
+       // HTTP Listeners MUST NOT be compatible with TLS, HTTPS, or TCP Listeners
+       // with the same Port. They MUST be compatible with HTTP Listeners that
+       // share the same port if they do not share Hostname values.
+       //
+       // HTTP, HTTPS, TLS, and TCP Listeners MAY be compatible with UDP Listeners
+       // with the same Port.
+       //
+       // Implementations MUST NOT consider the Hostname field for TCP and UDP
+       // Listeners when evaluating compatibility.
+
+      ...

but it's still likely to be incomplete and long despite, and I do think trying to limit the spec complexity is valuable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I thought that's what I was trying to do by saying "Listeners must be distinct", and defining rules for distinct.

I think that this is now at the point where we need to talk about it in the meeting though - since I don't understand how the suggestion I supplied doesn't make the spec clearer while managing the spec complexity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dd27869 rewords the second rule.

5af8b65 adds cascade matching language as a SHOULD elsewhere. IMO this would be a breaking change if it's a MUST. I'd need to review further to see if there was existing (if unclear) language that implied that, but I know there are at least no existing compatibility tests for it (we'd have failed it if so).

AFAIK we don't have much information from other implementations re their ability to cascade or not--I know it isn't Kong's routing behavior and isn't easily avoided.

The latter also can't be a compatibility rule since it doesn't block accepting both Listeners.

apis/v1beta1/gateway_types.go Outdated Show resolved Hide resolved
Comment on lines +105 to +109
// Implementations MAY choose to still accept a Gateway with conflicted
// Listeners if they accept a partial Listener set that contains no
// incompatible Listeners. They MUST set a "ListenersNotValid" condition
// the Gateway Status when the Gateway contains incompatible Listeners
// whether or not they accept the Gateway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reason we'd want to allow this is if a Gateway was already programmed with compatible listeners and then invalid/incompatible one(s) were added. It gets really tricky to represent this state. I may be remembering wrong here, but I think on GKE we will leave "Programmed" set to "True" with the last generation it was Programmed, but I don't think we have any clear guidance for these kinds of partially valid states in the spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//
// Implementations MAY choose to still accept a Gateway with conflicted
// Listeners if they accept a partial Listener set that contains no
// incompatible Listeners. They MUST set a "ListenersNotValid" condition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: ListenersNotValid is a reason that can be used with "Accepted", generally to set the condition to "False". I'm not sure what we'd recommend in the case where Listeners were not valid and the Gateway was accepted.

Copy link
Contributor Author

@rainest rainest Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We say it can be used when Accepted is True. From the first point in my self-review, this is indeed a confusing case, but probably one that's intentionally ambiguous because we expect it to vary so much between implementations.

If we want to leave that as-is, the change here is just to say (here) that implementations must set this, whereas previously it wasn't a strict requirement, and wasn't obvious unless you looked through the reason comments. Even if it changes, I think we should mention something here.

Changing the current vague guidelines to something more formal would be a significant breaking change. Between that and the expected variance across vendors, my vote would be to defer it to post GA, when we have more practical experience regarding which approaches are in use and their pros and cons.

// depending on implementation capabilities:
//
// 1. Multiple Listeners with the same Port that all use the "HTTP"
// Protocol that all have unique Hostname values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just an example, so it may not be critical here, but it might be helpful to define whether *.example.com and foo.example.com would be considered unique here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8a9acfb adds this to the following Hostname precedence section. Do you think that works? It's not explicitly saying they are considered distinct, but it is using them as distinct values in the context where it matters.

apis/v1beta1/gateway_types.go Outdated Show resolved Hide resolved
@rainest rainest requested a review from robscott August 15, 2023 23:18
Copy link
Contributor

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, these changes ended up being pretty extensive, but I think if we change the focus to defining "distinct" Listeners, it will be both more clear, and closer to the original intent.

Comment on lines 88 to 95
// A Gateway's Listeners are considered "compatible" if:
//
// 1. The implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned
// addresses.
// 2. No Listeners sharing the same Port share the same Hostname value,
// including the empty value, if this would prevent the implementation
// from matching an inbound request to a specific Listener.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading over this again, it may be better to introduce the concept of distinct first, and then use that as the basis for compatible - I think that it makes what we're doing easier to understand.

Suggested change
// A Gateway's Listeners are considered "compatible" if:
//
// 1. The implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned
// addresses.
// 2. No Listeners sharing the same Port share the same Hostname value,
// including the empty value, if this would prevent the implementation
// from matching an inbound request to a specific Listener.
// A Gateway's Listeners must be _distinct_.
//
// The key information that is used to determine if a set of Listeners is distinct
// differs based on if the protocol considers the `hostname` field relevant.
//
// For protocols `HTTP`, `HTTPS`, and `TLS`, Listeners must be distinct in terms of
// having a unique combination of `port`, `protocol`, and `hostname`. The empty
// hostname is also a unique value, so there can only be one Listener with an
// empty hostname in a set of Listeners when using these protocols.
// (This Listener then functions as a fallback, where any undefined hostname will
// match).
//
// For protocols `TCP` and `UDP`, Listeners must be distinct in terms of having
// a unique combination of `port` and `protocol` only. Hostname fields in these
// Listeners must be ignored.
//
// Implementation-specific protocols MAY choose if `hostname` is relevant to their
// protocol or not, and use either of these rules on that basis.
//
// A Gateway's Listeners are considered "compatible" if all the Listeners are distinct
// _and_ the implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned addresses.

I think that this sort of language will make the later changes easier as well, in that it makes it more clear why combinations may be incompatible.

apis/v1beta1/gateway_types.go Outdated Show resolved Hide resolved
@shaneutt
Copy link
Member

From the community sync today: we seemed to be agreed that we should star working on robust conformance tests which express/test the rules we expect are needed as the next step, and then use what we learn during that development/review cycle to feed back into the specification to hopefully provide a good summary that users can understand without needing to go read conformance tests.

@rainest
Copy link
Contributor Author

rainest commented Aug 29, 2023

robust conformance tests which express/test the rules we expect are needed

@shaneutt more profiles (https://gateway-api.sigs.k8s.io/geps/gep-1709/) than just tests. The idea is that some implementations will fail certain Listener combination tests and that's okay, because they're extended combinations the implementation has opted not to support.

For example, for the cases that came up here, you'd have MixedTransport (for TCP and UDP on the same Gateway) and MixedApplicationSamePort (for same Port HTTPS and TLS). Those could express both positive and negative requirements, such as confirming that you do accept HTTPS and TLS on the same port, but not HTTP and HTTPS on the same port.

I think those lend themselves better to specific combinations of Listener. They're scoped to a limited area of capabilities, so the prose descriptions can focus on that area alone, rather than trying to write a generic set of rules for every possible combination.

That would, however, be at odds with [the expressed desire to focus on route types, though I think the expected implementation variance here is good enough reason to include Listener-focused profiles. It could also potentially result in a lot of fairly narrow profiles.

That says we're looking to revisit it in February 2024. Was there anything we were waiting on before expanding it? It looks like the SIG Arch thread has finished, and I'd expect we can still write sets of tests even if we haven't fully decided on the reporting format.

@shaneutt
Copy link
Member

/cc

@rainest
Copy link
Contributor Author

rainest commented Aug 31, 2023

I had an additional thought while working with the v0.8.0 Listener validation CEL: should we permit otherwise identical Listeners with non-overlapping AllowedRoutes? Those maybe are a case for merging Listeners within a single Gateway (which this PR removes from the language). You could have, for example:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-web
spec:
  gatewayClassName: acme-lb
  listeners:
  - protocol: HTTP
    port: 80
    name: prod-web-gw-1
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            example: prod-1
  - protocol: HTTP
    port: 80
    name: prod-web-gw-2
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            example: prod-2

I'm fairly sure this shouldn't be allowed and you need to instead ensure all namespaces you want match a single selector:

  • If our criteria require matching based on the inbound request and Listener only, there's no way to know what namespace a matching route may be in using the narrow definition of that rule (you cannot extrapolate "routes the Listener matches" from "Listener").
  • This would allow identical HTTPRoutes in separate namespaces that both match the merged Listener, with no way to discriminate between them.

@youngnick
Copy link
Contributor

yes, I agree that you should have all the listeners have distinct sets of matchLabels if they are present - and by "distinct" here I mean that they must have different values in the keys of the matchLabels map.

// matches. For example, `"foo.example.com"` takes precedence over
// `"*.example.com"`, and `"*.example.com"` takes precedence over `""`.
//
// Implementations SHOULD NOT match requests to less specific Listeners if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cascading listener matches make sense for some cases.

Let's consider an example below:

Gateway:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: example
    port: 80
    protocol: HTTP
    hostname: "*.example.com"
  - name: cafe-http
    port: 80
    protocol: HTTP
    hostname: "cafe.example.com"

HTTPRoutes:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: gateway
    sectionName: example
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    backendRefs:
    - name: coffee
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: tea
spec:
  parentRefs:
  - name: gateway
    sectionName: cafe-http
  hostnames:
  - "*.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /tea
    backendRefs:
    - name: tea
      port: 80

The way we interpret this in NKG is two rules:

  • Hostname: cafe.example.com Path: /tea
  • Hostname: cafe.example.com Path: /coffee

As you can see, although those rules come from HTTPRoutes belonging to different listeners, they end up sharing the same hostname cafe.example.com

However, this goes against "For example, a request for subdomain.example.com/path SHOULD NOT match an HTTPRouteattached to a Listener with Hostname *.example.com if there is a Listener with with Hostname subdomain.example.com, even if no routes on the second Listener match". In my example, a request cafe.example.com/coffee will fail to match the tea HTTPRoute on more specific listener cafe.example.com but will succeed to match on the coffee HTTPRoute of the less specific listener *.example.com.

Perhaps the problem here is that in Gateway API we allow hostname of one listener (like *.example.com) to include the hostname of another listener (like cafe.example.com).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually have the same question for @rainest since I think both NKG and Kong may have similar behavior here.

Copy link
Contributor

@pleshakov pleshakov Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I further tried the example #2288 (comment) on the following implementations (I put the files into the a gist -- https://gist.github.com/pleshakov/9e890e359ee47c4be908f13590ee9377 ):

Implementation curl http://cafe.example.com/coffee curl http://cafe.example.com/tea
Contour ghcr.io/projectcontour/contour:v1.26.0 200 from coffee 200 from tea
Envoy Gateway docker.io/envoyproxy/gateway:latest 200 from coffee 404
GKE gke-l7-global-external-managed, v1.27.3-gke.100 200 from coffee 200 from tea

latest = docker.io/envoyproxy/gateway-dev@sha256:db1e923970985d8d30183e9647c6dbb0fcf93674d1e12d5269c7cc2540154d12

As you can see, in addition to NGINX Kubernetes Gateway, listener cascading also happens on Contour and GKE, so it doesn't not only appear in NGINX-based implementations.

Based on our discussion in the Gateway meeting from Sep 18, it was mentioned that Envoy has one level host header (virtual host) matching, which sounds similar to NGINX, which makes me believe that the cascading behavior is the consequence of configuration method used, rather than data-plane specific behavior (of Envoy or NGINX)

This makes me wonder if we need to define those expectations as MUSTs not SHOULDs and have the conformance tests defined too, providing other data planes could also support that.

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

@robscott I will also do some HTTPS / SNI testing shortly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kong selects certificates independent of any HTTP routing. We bind certificates to a set of hostnames and choose whichever cert is associated with the most specific match for the handshake SNI value. HTTP routing happens after based on the HTTP host header. We don't really have an equivalent of the NGINX server_name directive that effectively selects both a certificate and a set of possible HTTP routes at once.

Our HTTP routes can optionally specify a SNI value to also consider the SNI value for HTTP route selection, but this isn't commonly used--usually only if you have configuration that must evaluate before HTTP request data is available, namely the client cert enforcement mechanism.

Most non-certificate TLS configuration is global. Choosing, for example, a separate set of cipher suites generally requires a separate instance.

I'm unsure if we can support both passthrough and terminate on the same IP+port; if we would be able to, for example, cascade from a Terminate Listener for foo.example.com:9999 to a Passthrough for *.example.com:9999. I think we can request it via configuration, but offhand I don't know that the routing engine would handle it properly.

Copy link
Contributor

@arkodg arkodg Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @pleshakov are you sure you have typed out the sectionName correctly in the HTTPRoute ?
sidenote - i tested this on Envoy Gateway with the latest code and both cases (curl http://cafe.example.com/coffee & http://cafe.example.com/tea) 404

however, here is what you get if you swap the sectionName

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: gateway
    sectionName: cafe-http
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    backendRefs:
    - name: backend
      port: 3000
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: tea
spec:
  parentRefs:
  - name: gateway
    sectionName: example 
  hostnames:
  - "*.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /tea
    backendRefs:
    - name: backend 
      port: 3000
curl -I --verbose --header "Host: cafe.example.com" http://localhost:80/tea
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> HEAD /tea HTTP/1.1
> Host: cafe.example.com
> User-Agent: curl/7.86.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
< date: Tue, 19 Sep 2023 23:53:32 GMT
date: Tue, 19 Sep 2023 23:53:32 GMT
< server: envoy
server: envoy
< transfer-encoding: chunked
transfer-encoding: chunked

< 
* Connection #0 to host localhost left intact
$ curl -I --verbose --header "Host: cafe.example.com" http://localhost:80/coffee
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> HEAD /coffee HTTP/1.1
> Host: cafe.example.com
> User-Agent: curl/7.86.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< content-type: application/json
content-type: application/json
< x-content-type-options: nosniff
x-content-type-options: nosniff
< date: Tue, 19 Sep 2023 23:53:44 GMT
date: Tue, 19 Sep 2023 23:53:44 GMT
< content-length: 516
content-length: 516
< x-envoy-upstream-service-time: 0
x-envoy-upstream-service-time: 0
< server: envoy
server: envoy

< 
* Connection #0 to host localhost left intact

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arkodg

hey @pleshakov are you sure you have typed out the sectionName correctly in the HTTPRoute ?
sidenote -

yep. the idea of this example is for the /coffee HTTPRoute with cafe.example.com hostname to attach to the *.example.com listener and for the /tea HTTPRoute with *.example.com hostname to attach to the cafe.example.com listener.

Most implementations I tested (GKE, Contour, NGINX Gateway) end up creating two routing rules:

  • Hostname: cafe.example.com Path: /tea
  • Hostname: cafe.example.com Path: /coffee

So although those rules come from HTTPRoutes attached to different listeners, they end up sharing the same hostname cafe.example.com of the listener "cafe.example.com"

Envoy Gateway creates only one routing rule:

  • Hostname: cafe.example.com Path: /tea

Copy link
Contributor

@pleshakov pleshakov Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robscott

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

I extended the example to include TLS termination using two certs: CN *.example.com for example listener and CNcafe.example.com for cafe listener -- https://gist.github.com/pleshakov/607bec3a9e617435fce3d9574806a7c4

Below are the results:

Implementation Coffee curl Tea curl Coffee mismatch host header and SNI curl Tea mismatch host header and SNI curl
Contour ghcr.io/projectcontour/contour:v1.26.0 200 from coffee, CN=cafe.example.com 200 from tea, CN=cafe.example.com error: Connection reset by peer error: Connection reset by peer
Envoy Gateway docker.io/envoyproxy/gateway:latest 404, CN=cafe.example.com 200 from tea, CN=cafe.example.com 200 from coffee, CN=*.example.com 404, CN=*.example.com
GKE gke-l7-global-external-managed, v1.27.3-gke.100 200 from coffee, CN=cafe.example.com 200 from tea, CN=cafe.example.com 200 from coffee, CN=*.example.com 200 from tea, CN=*.example.com
NGINX Gateway Fabric, edge* 200 from coffee, CN=cafe.example.com 200 from tea, CN=cafe.example.com error: routines:ST_CONNECT:tlsv1 unrecognized name error: routines:ST_CONNECT:tlsv1 unrecognized name

edge = sha256:c8040a0d968911496123fd75087a42606f2bdc00b7ec4bf27be280e6bf1a3347
latest = docker.io/envoyproxy/gateway-dev@sha256:db1e923970985d8d30183e9647c6dbb0fcf93674d1e12d5269c7cc2540154d12

Note: CN=... means the CN of the TLS cert used by the data plane, as reported by curl

Coffee curl

curl --resolve cafe.example.com:$GW_HTTPS_PORT:$GW_IP https://cafe.example.com:$GW_HTTPS_PORT/coffee --insecure -v

Tea curl

curl --resolve cafe.example.com:$GW_HTTPS_PORT:$GW_IP https://cafe.example.com:$GW_HTTPS_PORT/tea --insecure -v

Coffee mismatch host header and SNI curl

curl --resolve some.example.com:$GW_HTTPS_PORT:$GW_IP https://some.example.com:$GW_HTTPS_PORT/coffee --insecure -v -H "host: cafe.example.com"

Tea mismatch host header and SNI curl

curl --resolve some.example.com:$GW_HTTPS_PORT:$GW_IP https://some.example.com:$GW_HTTPS_PORT/tea --insecure -v -H "host: cafe.example.com"

Looking at the results, the examined implementations at least have same behavior in one column - Tea curl :)

@rainest
Copy link
Contributor Author

rainest commented Sep 13, 2023

@youngnick @robscott is there anything outstanding that should still need changes here? I'd responded to, but not resolved conversations with either edits or explanations as to why I didn't think they were needed; I usually leave resolution up to the asker unless it's something simple like a typo.

@youngnick
Copy link
Contributor

I've been thinking about the right way to respond here, not sure if this is it, but it's all I've got.

I've basically got two things that I don't think are resolved yet.

Firstly, I think that I didn't explain why I wanted to introduce the concept of distinct in my suggested change - it actually comes down to a similar reason to something that @rainest said above:

Compatibility is weird because it ends up being about implementation capabilities rather than rules in the spec. If your implementation isn't compatible with something the core requires, it just can't implement the spec.

I would to see us try to go as far as we can with putting rules into the spec for what constitutes distinct Listeners, that we can then use to talk more easily about what compatible means. distinct is then the spec-level rules about what the spec considers makes Listeners different, then compatible is what an implementation of the spec uses to determine if configuration for any set of Listeners is mergable into a single ruleset, where a flow will be merged against every rule in the set.

That concept is important because, for better or for worse, it has always been intended that the traffic-matching between Listeners and Routes is a two stage process. First you match a Listener, then you look for matching routes. In the case that there's a wildcard hostname, it was definitely not intended that a request could match both (for example, the cafe.example.com and *.example.com Listeners.) That is, having different values at all for hostname has been intended to make the Listeners distinct and unmergeable. We certainly have not done enough to explain that this is the intention, and I'm sorry for that.

The other reason for the filtering to work like this is TLS. The hostname is not only used for header matching once the traffic stream is inspectable, it's also used for SNI matching. This means that allowing cascading hostname matching means that the hostname field will work differently when TLS is used vs when it is not - because the SNI match must be done before any HTTP headers are present (since the TLS handshake is not complete).

What happens when hostnames can be cascade matched and you have a more specific certificate? This is extremely important once we add the ability to require a client certificate to Listeners, because of the following scenario:

Imagine, like in @pleshakov's example, we have two listeners, one with a precise hostname, and one with a wildcard. The precise hostname entry also requires a client certificate because the specific hostname service requires greater security.

If we allow cascading hostname matches, then you can end up inadvertently allowing bypass of the client certificate, because paths that match HTTPRoutes attached to the *.example.com Listener can match cafe.example.com, even though you haven't done a TLS transaction for cafe.example.com using the rules that the cafe.example.com Listener expects. This means that, although the cafe.example.com Listener is requesting client-certificate-secured traffic, it's possible to send a request to cafe.example.com that does not use a client certificate. That seems like it could really, really bite people who are expecting "enable client certificate" to mean "everything matching this hostname requires client certificate".

Okay, I've argued enough here. How can we move forward with this?

Practically, it seems clear that it's much easier for some data planes to implement one way or the other. We don't want to make conformance difficult or impossible for existing implementations at this point, but we also need to try to avoid exactly this sort of subtle-but-important difference in behavior between implementations.

I think that, for now, the best that we can do is to be very explicit that about how traffic matching SHOULD work. I'm sorry to be inflexible about this, but I feel pretty strongly that the two-phase matching mechanism that precludes cascading hostname matches is a much better pattern in the long term, because of the arguments I've made above. So, I believe that we should record explicitly that implementations SHOULD design their traffic matching so that hostname matching is performed in a separate step to other HTTP matching (path, other headers, and so on). This means that implementations SHOULD NOT allow cascading between distinct Listeners (where, again, "distinct" is intended to capture that the Listeners are different at the spec level"). These requirements are only SHOULD rather than MUST because some implementations are already doing this, and we failed to make this clear enough earlier.

With respect to the definition of "distinct", I think that we should actually do this in code - with a helper in the API definitions. This helper would compare two given Listeners and return a boolean that indicates if they are distinct or not. That way, we can make this as specific as we need to, and be confident that there is a very clear definition of what "distinct" means, even if that definition gets long because there are a lot of edge cases. The spec itself can just say "distinct means whatever is encoded in the helper", and not list all of the edge cases in the spec itself.

Again, I'm really sorry to be taking such a hard line stance here, but I am very worried about how heavily this could impact portability between different data planes unless we handle it very carefully.

@shaneutt shaneutt added the release-blocker MUST be completed to complete the milestone label Sep 15, 2023
@shaneutt
Copy link
Member

We discussed this one in the community sync today: this seems to be one of the greater risk factors to releasing v1.0.0-RC1 in the next few weeks, so it seems moving forward with the SHOULD documentation for the HTTP situation might be the best way forward, but we also discussed that we may need to add conformance tests for the HTTPS case.

@rainest
Copy link
Contributor Author

rainest commented Sep 19, 2023

Are there points of concern other than cascade handling/overlapping hostnames? I would move to strike that addition here and move it to a separate issue. It's outside the scope of what I originally wanted to address and can be written independent of it.

I don't think any other changes here would preclude either cascade behavior. If we do know of conflicts, I can revise those points. I'm unfortunately not interested in writing the cascade language--I don't feel I have sufficient information about the existing landscape and broader implications to make a good call and write it well--and will close this PR if that's considered inseparable from the compatibility revisions.

My primary goal here was to clarify how and why implementations should mark Listeners Accepted or not, with a secondary goal of reorganizing content that'd been added piecemeal over time.

Existing language does support overlapping hostnames:

The incoming hostname MUST be matched using the Hostname field for each Listener in order of most to least specific. That is, exact matches must be processed before wildcard matches.

Cascade behavior will not change whether affected Listeners can both be Accepted on the same Gateway. Both Listeners must be Accepted for it to be relevant at all. It does matter if you collapse separate Gateways, but this PR doesn't meaningfully change the language around collapsing Gateways. Additional restrictions there could easily be written as separate rules ("to merge Gateways, the union of their Listeners must satisfy all rules applicable to a single Gateway AND [other rules beyond those] ..." or similar).

@mlavacca
Copy link
Member

After a chat with @shaneutt and having read the PR and all the reviews' threads, I'd like to give my 2 cents: I think @pleshakov is right when they say:

Based on our discussion in the Gateway meeting from Sep 18, it was mentioned that Envoy has one level host header (virtual host) matching, which sounds similar to NGINX, which makes me believe that the cascading behavior is the consequence of configuration method used, rather than data-plane specific behavior (of Envoy or NGINX)

In the example above, the incorrect match is a consequence of a misconfiguration from the controlplane, because in case the HTTPRoute references a specific listener but the listener defines the *.example.com hostname, and the HTTPRoute wants to match cafe.example.com, the dataplane entry should not exist, and the match shouldn't happen because of all the reasons written in the thread above. In case the listener defines cafe.example.com, and the HTTPRoute wants to match *.example.com, the dataplane entry should be created by using the listener's hostname (cafe.example.com). In this scenario, cafe.example.com/coffee returns 404 as there is no entry for such a match in the dataplane, while cafe.example.com/tea returns 200 (hits listener cafe-http).

In case there is no explicit listener reference through the sectionName, cafe.example.com/coffee should hit listener cafe-http and return 200, as well as cafe.example.com/tea.

If I captured the current situation, I don't see it as a dataplane problem, but instead, as an implementations' problem that should not hinder too much the 1.0 release. Am I missing something maybe? 🤔

@pleshakov
Copy link
Contributor

@mlavacca

In the #2288 (comment), the incorrect match is a consequence of a misconfiguration from the controlplane, because in case the HTTPRoute references a specific listener but the listener defines the *.example.com hostname, and the HTTPRoute wants to match cafe.example.com, the dataplane entry should not exist, and the match shouldn't happen because of all the reasons written in the thread above

I believe the spec here allows this, specifically "A Listener with *.example.com... "

// If a hostname is specified by both the Listener and HTTPRoute, there
// must be at least one intersecting hostname for the HTTPRoute to be
// attached to the Listener. For example:
//
// * A Listener with `test.example.com` as the hostname matches HTTPRoutes
// that have either not specified any hostnames, or have specified at
// least one of `test.example.com` or `*.example.com`.
// * A Listener with `*.example.com` as the hostname matches HTTPRoutes
// that have either not specified any hostnames or have specified at least
// one hostname that matches the Listener hostname. For example,
// `*.example.com`, `test.example.com`, and `foo.test.example.com` would
// all match. On the other hand, `example.com` and `test.example.net` would
// not match.

@shaneutt shaneutt added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Sep 25, 2023
Copy link
Member

@shaneutt shaneutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some deliberating, and some spin-off issues (see #2416) @youngnick, @robscott and myself discussed wanting to merge this one as-is as we feel it is a net positive improvement, and anything additional can be added in a follow-up or separate PR as this change is complete within itself.

Thank you @rainest!

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 25, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rainest, shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 25, 2023
@k8s-ci-robot k8s-ci-robot merged commit 800024e into kubernetes-sigs:main Sep 25, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-blocker MUST be completed to complete the milestone release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Clarify language around listener compatibility
8 participants