From 9be878bcf6a4c10c2d1732d5afd7adb2907c0f23 Mon Sep 17 00:00:00 2001 From: Nick Young Date: Tue, 29 Nov 2022 04:43:09 +0000 Subject: [PATCH] Update Policy Attachment GEP with details about metaresources. Also distinguish between Direct and Inherited Policy Attachment. Signed-off-by: Nick Young --- geps/gep-713.md | 611 +++++++++++++++++++++-- site-src/references/policy-attachment.md | 567 ++++++++++++++++++--- 2 files changed, 1062 insertions(+), 116 deletions(-) diff --git a/geps/gep-713.md b/geps/gep-713.md index f8a4efd0f8..e635866674 100644 --- a/geps/gep-713.md +++ b/geps/gep-713.md @@ -1,29 +1,49 @@ -# GEP-713: Policy Attachment +# GEP-713: Metaresources and Policy Attachment * Issue: [#713](https://github.com/kubernetes-sigs/gateway-api/issues/713) * Status: Experimental ## TLDR -This GEP aims to standardize policy attachment to resources associated with -Gateway API by establishing a pattern which defines how `Policy` API types can -have their relevant effects applied to network traffic. Individual policy APIs -(e.g. `TimeoutPolicy`, `RetryPolicy`, etc) will include a common `TargetRef` -field in their specification to identify how and where to apply that policy. -This will be important for providing a consistent experience across -implementations of the API, even for configuration details that may not be fully -portable. +This GEP aims to standardize terminology and processes around using one Kubernetes +object to modify the functions of one or more other objects. + +This GEP defines some terms, firstly: _Metaresource_. + +A Kubernetes object that _augments_ the behavior of an object +in a standard way is called a _Metaresource_. + +This document proposes controlling the creation of configuration in the underlying +Gateway data plane using two types of Policy Attachment. 
+A "Policy Attachment" is a specific type of _metaresource_ that can affect specific +settings across either one object (this is "Direct Policy Attachment"), or objects +in a hierarchy (this is "Inherited Policy Attachment"). + +Individual policy APIs: +- must be their own CRDs (e.g. `TimeoutPolicy`, `RetryPolicy` etc), +- can be included in the Gateway API API group and installation or be defined by + implementations +- and must include a common `TargetRef` struct in their specification to identify + how and where to apply that policy. +- _may_ include either a `defaults` section, an `overrides` section, or both. If + these are included, the Policy is an Inherited Policy, and should use the + inheritance rules defined in this document. + +For Inherited Policies, this GEP also describes a set of expected behaviors +for how settings can flow across a defined hierarchy. + ## Goals -* Establish a pattern for policy attachment which will be used for any policies +* Establish a pattern for Policy resources which will be used for any policies included in the Gateway API spec -* Establish a pattern for policy attachment which should be used for any - implementation specific policies used with Gateway API resources +* Establish a pattern for Policy attachment, whether Direct or Inherited, + which must be used for any implementation specific policies used with + Gateway API resources * Provide a way to distinguish between required and default values for all policy API implementations * Enable policy attachment at all relevant scopes, including Gateways, Routes, - Backends + Backends, along with how values should flow across a hierarchy if necessary * Ensure the policy attachment specification is generic and forward thinking enough that it could be easily adapted to other grouping mechanisms like Namespaces in the future @@ -38,24 +58,210 @@ portable. 
* Define all potential policies that may be attached to resources * Design the full structure and configuration of policies +## Background and concepts + +When designing Gateway API, one of the things we’ve found is that we often need to be +able to change the behavior of objects without being able to make changes to the spec +of those objects. Sometimes, this is because we can’t change the spec of the object +to hold the information we need (ReferenceGrant, from +[GEP-709](https://gateway-api.sigs.k8s.io/geps/gep-709/), affecting Secrets +and Services is an example, as is Direct Policy Attachment), and sometimes it’s +because we want the behavior change to flow across multiple objects +(this is what Inherited Policy Attachment is for). + +To put this another way, sometimes we need ways to be able to affect how an object +is interpreted in the API, without representing the description of those effects +inside the spec of the object. + +This document describes the ways we design objects to meet these two use cases, +and why you might choose one or the other. + +We use the term “metaresource” to describe the class of objects that _only_ augment +the behavior of another Kubernetes object, regardless of what they are targeting. + +“Meta” here is used in its Greek sense of “more comprehensive” +or “transcending”, and “resource” rather than “object” because “metaresource” +is more pronounceable than “metaobject”. Additionally, a single word is better +than a phrase like “wrapper object” or “wrapper resource” overall, although both +of those terms are effectively synonymous with “metaresource”. + +A "Policy Attachment" is a metaresource that affects the fields in existing objects +(like Gateway or Routes), or influences the configuration that's generated in an +underlying data plane. + +"Direct Policy Attachment" is when a Policy object references a single object _only_, +and only modifies the fields of or the configuration associated with that object. 
+ +"Inherited Policy Attachment" is when a Policy object references a single object +_and any child objects of that object_ (according to some defined hierarchy), and +modifies fields of the child objects, or configuration associated with the child +objects. + +In either case, a Policy may either affect an object by controlling the value +of one of the existing _fields_ in the `spec` of an object, or it may add +additional fields that are _not_ in the `spec` of the object. + +### Direct Policy Attachment + +A Direct Policy Attachment is tightly bound to one instance of a particular +Kind within a single namespace (or to an instance of a single Kind at cluster scope), +and only modifies the behavior of the object that matches its binding. + +As an example, one use case that Gateway API currently does not support is how +to configure details of the TLS required to connect to a backend (in other words, +if the process running inside the backend workload expects TLS, not that some +automated infrastructure layer is provisioning TLS as in the Mesh case). + +A hypothetical TLSConnectionPolicy that targets a Service could be used for this, +using the functionality of the Service as describing a set of endpoints. (It +should also be noted this is not the only way to solve this problem, just an +example to illustrate Direct Policy Attachment.) 
+ +The TLSConnectionPolicy would look something like this: + +```yaml +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: TLSConnectionPolicy +metadata: + name: tlsport8443 + namespace: foo +spec: + targetRef: # This struct is defined as part of Gateway API + group: "" # Empty string means core - this is a standard convention + kind: Service + name: fooService + tls: + certificateAuthorityRefs: + - name: CAcert + port: 8443 + +``` + +All this does is tell an implementation, that for connecting to port `8443` on the +Service `fooService`, it should assume that the connection is TLS, and expect the +service's certificate to be validated by the chain in the `CAcert` Secret. + +Importantly, this would apply to _every_ usage of that Service across any HTTPRoutes +in that namespace, which could be useful for a Service that is reused in a lot of +HTTPRoutes. + +With these two examples in mind, here are some guidelines for when to consider +using Direct Policy Attachment: + +* The number or scope of objects to be modified is limited or singular. Direct + Policy Attachments must target one specific object. +* The modifications to be made to the objects don’t have any transitive information - + that is, the modifications only affect the single object that the targeted + metaresource is bound to, and don’t have ramifications that flow beyond that + object. +* In terms of status, it should be reasonably easy for a user to understand that + everything is working - basically, as long as the targeted object exists, and + the modifications are valid, the metaresource is valid, and this should be + straightforward to communicate in one or two Conditions. Note that at the time + of writing, this is *not* completed. +* Direct Policy Attachment _should_ only be used to target objects in the same + namespace as the Policy object. Allowing cross-namespace references brings in + significant security concerns, and/or difficulties about merging cross-namespace + policy objects. 
Notably, Mesh use cases may need to do something like this for + consumer policies, but in general, Policy objects that modify the behavior of + things outside their own namespace should be avoided unless they use a handshake + of some sort, where the things outside the namespace can opt out of the behavior. + (Notably, this is the design that we used for ReferenceGrant). + +### Inherited Policy Attachment: It's all about the defaults and overrides + +Because an Inherited Policy is a metaresource, it targets some other resource +and _augments_ its behavior. + +But why have this distinct from other types of metaresource? Because Inherited +Policy resources are designed to have a way for settings to flow down a hierarchy. + +Defaults set the default value for something, and can be overridden by the +“lower” objects (like a connection timeout default policy on a Gateway being +overridable inside a HTTPRoute), and Overrides cannot be overridden by “lower” +objects (like setting a maximum client timeout to some non-infinite value at the +Gateway level to stop HTTPRoute owners from leaking connections over time). + +Here are some guidelines for when to consider using an Inherited Policy object: + +* The settings or configuration are bound to one containing object, but affect + other objects attached to that one (for example, affecting HTTPRoutes attached + to a single Gateway, or all HTTPRoutes in a GatewayClass). +* The settings need to be able to be defaulted, but can be overridden on a per-object + basis. +* The settings must be enforced by one persona, and not modifiable or removable + by a lesser-privileged persona. (The owner of a GatewayClass may want to restrict + something about all Gateways in a GatewayClass, regardless of who owns the Gateway, + or a Gateway owner may want to enforce some setting across all attached HTTPRoutes). 
+* In terms of status, a good accounting for how to record that the Policy is + attached is easy, but recording what resources the Policy is being applied to + is not, and needs to be carefully designed to avoid fanout apiserver load. + (This is not built at all in the current design either). + +When multiple Inherited Policies are used, they can interact in various ways, +which are governed by the following rules, which will be expanded on later in +this document. + +* If a Policy does not affect an object's fields directly, then the resultant + Policy should be the set of all distinct fields inside the relevant Policy objects, + as set out by the rules below. +* For Policies that affect an object's existing fields, multiple instances of the + same Policy Kind affecting an object's fields will be evaluated as + though only a single Policy "wins" the right to affect each field. This operation + is performed on a _per-distinct-field_ basis. +* Settings in `overrides` stanzas will win over the same setting in a `defaults` + stanza. +* `overrides` settings operate in a "less specific beats more specific" fashion - + Policies attached _higher_ up the hierarchy will beat the same type of Policy + attached further down the hierarchy. +* `defaults` settings operate in a "more specific beats less specific" fashion - + Policies attached _lower down_ the hierarchy will beat the same type of Policy + attached further _up_ the hierarchy. +* For `defaults`, the _most specific_ value is the one _inside the object_ that + the Policy applies to; that is, if a Policy specifies a `default`, and an object + specifies a value, the _object's_ value will win. +* Policies interact with the fields they are controlling in a "replace value" + fashion. + * Fields where the value is a scalar (like a string or a number) + should have their value _replaced_ by the value in the Policy if it wins. 
+ Notably, this means that a `default` will only ever replace an empty or unset + value in an object. + * For fields where the value is an object, the Policy should include the fields + in the object in its definition, so that the replacement can be on simple fields + rather than complex ones. + * For fields where the final value is non-scalar, but is not an _object_ with + fields of its own, the value should be entirely replaced, _not_ merged. This + means that lists of strings or lists of ints specified in a Policy will overwrite + the empty list (in the case of a `default`) or any specified list (in the case + of an `override`). The same applies to `map[string]string` fields. An example + here would be a field that stores a map of annotations - specifying a Policy + that overrides annotations will mean that a final object specifying those + annotations will have its value _entirely replaced_ by an `override` setting. +* In the case that two Policies of the same type specify different fields, then + _all_ of the specified fields should take effect on the affected object. + +Examples to further illustrate these rules are given below. + ## API This approach is building on concepts from all of the alternatives discussed -below. This is very similar to the existing BackendPolicy resource in the API, +below. This is very similar to the (now removed) BackendPolicy resource in the API, but also borrows some concepts from the [ServicePolicy proposal](https://github.com/kubernetes-sigs/gateway-api/issues/611). ### Policy Attachment for Ingress -Attaching policy to Gateway resources for ingress use cases is relatively -straightforward. A policy can reference the resource it wants to apply to. +Attaching a Directly Attached Policy to Gateway resources for ingress use cases +is relatively straightforward. A policy can reference the resource it wants to +apply to. 
+ Access is granted with RBAC - anyone that has access to create a RetryPolicy in a given namespace can attach it to any resource within that namespace. ![Simple Ingress Example](images/713-ingress-simple.png) -To build on that example, it’s possible to attach policies to more resources. -Each policy applies to the referenced resource and everything below it in terms -of hierarchy. Although this example is likely more complex than many real world +An Inherited Policy can attach to a parent resource, and then each policy +applies to the referenced resource and everything below it in terms of hierarchy. +Although this example is likely more complex than many real world use cases, it helps demonstrate how policy attachment can work across namespaces. @@ -204,7 +410,7 @@ precedence over Routes and Services below it. On the other hand, an app owner may want to set a default timeout for their Service. That would have precedence over defaults attached at higher levels such as Route or Gateway. -If using defaults and overrides, each policy resource MUST include 2 structs +If using defaults _and_ overrides, each policy resource MUST include 2 structs within the spec. One with override values and the other with default values. In the following example, the policy attached to the Gateway requires cdn to @@ -212,7 +418,7 @@ be enabled and provides some default configuration for that. The policy attached to the Route changes the value for one of those fields (includeQueryString). ```yaml -kind: GKEServicePolicy # Example of implementation specific policy name +kind: CDNCachingPolicy # Example of implementation specific policy name spec: override: cdn: @@ -227,13 +433,14 @@ spec: kind: Gateway name: example --- -kind: GKEServicePolicy +kind: CDNCachingPolicy spec: default: cdn: cachePolicy: includeQueryString: false targetRef: + type: direct kind: HTTPRoute name: example ``` @@ -243,7 +450,10 @@ precedence over the default drainTimeout value attached to the Route. 
At the same time, we can see that the default connectionTimeout attached to the Route has precedence over the default attached to the Gateway. -![Hierarchical Policy Example](images/713-policy-hierarchy.png) +Also note how the different resources interact - fields that are not common across +objects _may_ both end up affecting the final object. + +![Inherited Policy Example](images/713-policy-hierarchy.png) #### Supported Resources It is important to note that not every implementation will be able to support @@ -266,7 +476,7 @@ used to set defaults and requirements for an entire GatewayClass. ### Targeting External Services In some cases (likely limited to mesh) we may want to apply policies to requests to external services. To accomplish this, implementations can choose to support -a refernce to a virtual resource type: +a reference to a virtual resource type: ```yaml apiVersion: networking.acme.io/v1alpha1 @@ -282,13 +492,59 @@ spec: name: foo.com ``` +### Merging into existing `spec` fields + +It's possible (even likely) that configuration in a Policy may need to be merged +into an existing object's fields somehow, particularly for Inherited policies. + +When merging into existing fields inside an object, Policy objects should +merge values at a scalar level, not at a struct or object level. + +For example, in the `CDNCachingPolicy` example above, the `cdn` struct contains +a `cachePolicy` struct that contains fields. If an implementation was merging +this configuration into an existing object that contained the same fields, it +should merge the fields at a scalar level, with the `includeHost`, +`includeProtocol`, and `includeQueryString` values being defaulted if they were +not specified in the object being controlled. Similarly, for `overrides`, the +values of the innermost scalar fields should overwrite the scalar fields in the +affected object. 
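
The scalar-level merge described above can be sketched roughly as follows. This is only an illustration, not part of the GEP or of any implementation: the function names `merge_defaults` and `merge_overrides`, the use of plain dicts to stand in for structs, and the boolean values chosen for the `cachePolicy` fields are all assumptions made for the example. Only nested structs and scalar leaves are modelled here; list and map values are not handled specially.

```python
def merge_defaults(obj, defaults):
    """Fill unset scalar leaves of obj from defaults, one field at a time.

    Nested dicts stand in for structs and are walked recursively, so a struct
    from the Policy is never copied wholesale into the object.
    """
    result = dict(obj)
    for key, value in defaults.items():
        if isinstance(value, dict):
            result[key] = merge_defaults(result.get(key) or {}, value)
        elif result.get(key) is None:
            # The object did not set this field, so the default applies.
            result[key] = value
    return result


def merge_overrides(obj, overrides):
    """Overwrite scalar leaves of obj with the values from overrides."""
    result = dict(obj)
    for key, value in overrides.items():
        if isinstance(value, dict):
            result[key] = merge_overrides(result.get(key) or {}, value)
        else:
            # Overrides win regardless of what the object specified.
            result[key] = value
    return result


# Hypothetical values, chosen only to show the per-field behavior.
route_config = {"cdn": {"cachePolicy": {"includeQueryString": False}}}
policy_defaults = {
    "cdn": {
        "cachePolicy": {
            "includeHost": True,
            "includeProtocol": True,
            "includeQueryString": True,
        }
    }
}
policy_overrides = {"cdn": {"cachePolicy": {"includeHost": False}}}

defaulted = merge_defaults(route_config, policy_defaults)
overridden = merge_overrides(defaulted, policy_overrides)
print(defaulted)
print(overridden)
```

In this sketch the object's own `includeQueryString` survives defaulting, while the override replaces only `includeHost` and leaves the sibling fields alone.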
+ +Implementations should not copy any structs from the Policy object directly into the +affected object, any fields that _are_ overridden should be overridden on a per-field +basis. + +In the case that the field in the Policy affects a struct that is a member of a list, +each existing item in the list in the affected object should have each of its +fields compared to the corresponding fields in the Policy. + +For non-scalar field _values_, like a list of strings, or a `map[string]string` +value, the _entire value_ must be overwritten by the value from the Policy. No +merging should take place. This mainly applies to `overrides`, since for +`defaults`, there should be no value present in a field on the final object. + +This table shows how this works for various types: + +|Type|Object config|Override Policy config|Result| +|----|-------------|----------------------|------| +|string| `key: "foo"` | `key: "bar"` | `key: "bar"` | +|list| `key: ["a","b"]` | `key: ["c","d"]` | `key: ["c","d"]` | +|`map[string]string`| `key: {"foo": "a", "bar": "b"}` | `key: {"foo": "c", "bar": "d"}` | `key: {"foo": "c", "bar": "d"}` | + + ### Conflict Resolution -It is possible for multiple policies to target the same resource. When this -happens, merging is the preferred outcome. If multiple policy resources target +It is possible for multiple policies to target the same object _and_ the same +fields inside that object. If multiple policy resources target the same resource _and_ have an identical field specified with different values, precedence MUST be determined in order of the following criteria, continuing on ties: +* Direct Policies override Inherited Policies. If preventing settings from + being overwritten is important, implementations should only use Inherited + Policies, and the `override` stanza that implies. Note also that it's not + intended that Direct and Inherited Policies should overlap, so this should + only come up in exceptional circumstances. 
+* Inside Inherited Policies, the same setting in `overrides` beats the one in + `defaults`. * The oldest Policy based on creation timestamp. For example, a Policy with a creation timestamp of "2021-07-15 01:02:03" is given precedence over a Policy with a creation timestamp of "2021-07-15 01:02:04". @@ -308,28 +564,37 @@ the API structure defined above and add a `gateway.networking.k8s.io/policy: true` label to the CRD. ### Status -In the future, we may consider adding a new `Policies` field to status on -Gateways and Routes. This would be a list of `PolicyTargetReference` structs -with the fields instead used to refer to the Policy resource that has been -applied. -Unfortunately, this may create more confusion than it is worth, here are some of -the key concerns: +In the current iteration of this GEP, metaresources and Policy objects don't +have any standard way to record what they're attaching to, or applying settings +to in the case of Policy Attachment. There are some recommended Condition types +defined below, but further work on the status design is required to ensure that +some problems are resolved: * When multiple controllers are implementing the same Route and recognize a - policy, it would be difficult to determine which controller should be - responsible for adding that policy reference to status. -* For this to be somewhat scalable, we'd need to limit the status entries to - policies that had been directly applied to the resource. This could get - confusing as it would not provide any insight into policies attached above or - below. + policy, it must be possible to determine which controller was + responsible for adding that policy reference to status. Adding Conditions to + status on the Policy instead can be helpful here, but we're still lacking a way + for the Route or Gateway owner to find _all_ the Policies that are influencing + their object. 
+* For this to be somewhat scalable, we must limit the number of status updates + that can result from a metaresource update. * Since we only control some of the resources a policy might be attached to, - adding policies to status would only be possible on Gateway API resources, not - Services or other kinds of backends. + adding policies to status would only be possible on the policy objects themselves + or on Gateway API resources, not Services or other kinds of backends. + +Previous experience in the Kubernetes API has made it clear that having a single +object that can cause status updates to occur across many other objects can have +a big performance impact, so the status design must be very carefully done to +avoid these kind of fanout problems. -Although these concerns are not unsolvable, they lead to the conclusion that -a Kubectl plugin should be our primary approach to providing visibility here, -with a possibility of adding policies to status at a later point. +However, the whole purpose of having a standardized Policy API structure and +patterns is intended to make this problem solvable both for human users and with +tooling. + +This is currently a _very_ open question. A discussion is ongoing at +[#1531](https://github.com/kubernetes-sigs/gateway-api/discussions/1531), and this +GEP will be updated with any outcomes. ### Conditions Controllers using the Gateway API policy attachment model SHOULD populate the @@ -396,14 +661,14 @@ controller implementation: 2. Although it's possible that arbitrary fields could be supported by custom policy, custom route filters, and core/extended fields concurrently, it is - strongly recommended that implementations not use multiple mechanisms for - representing the same fields. A given field should only be supported through - a single extension method. An example of potential conflict is policy - precedence and structured hierarchy, which only applies to custom policies. 
- Allowing a field to exist in custom policies and also other areas of the API, - which are not part of the structured hierarchy, breaks the precedence model. - Note that this guidance may change in the future as we gain a better - understanding for extension mechanisms of the Gateway API can interoperate. + recommended that implementations only use multiple mechanisms for + representing the same fields when those fields really _need_ the defaulting + and/or overriding behavior that Policy Attachment provides. For example, a + custom filter that allowed the configuration of Authentication inside a + HTTPRoute object might also have an associated Policy resource that allowed + the filter's settings to be defaulted or overridden. It should be noted that + doing this in the absence of a solution to the status problem is likely to + be *very* difficult to troubleshoot. ### Conformance Level This policy attachment pattern is associated with an "EXTENDED" conformance @@ -525,13 +790,251 @@ type RouteRule struct { ### Disadvantages * May be difficult to understand which policies apply to a request +## Examples + +This section provides some examples of various types of Policy objects, and how +merging, `defaults`, `overrides`, and other interactions work. + +### Direct Policy Attachment + +The following Policy sets the minimum TLS version required on a Gateway Listener: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + minimumTLSVersion: 1.2 + targetRef: + name: internet + group: gateway.networking.k8s.io + kind: Gateway +``` + +Note that because there is no version controlling the minimum TLS version in the +Gateway `spec`, this is an example of a non-field Policy. + +### Inherited Policy Attachment + +It also could be useful to be able to _default_ the `minimumTLSVersion` setting +across multiple Gateways. 
+ +This version of the above Policy allows this: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + defaults: + minimumTLSVersion: 1.2 + targetRef: + name: appns + group: "" + kind: Namespace +``` + +This Inherited Policy is using the implicit hierarchy that all resources belong +to a namespace, so attaching a Policy to a namespace means affecting all possible +resources in a namespace. Multiple hierarchies are possible, even within Gateway +API, for example Gateway -> Route, Gateway -> Route -> Backend, Gateway -> Route +-> Service. GAMMA Policies could conceivably use a hierarchy of Service -> Route +as well. + +Note that this will not be very discoverable for Gateway owners in the absence of +a solution to the Policy status problem. This is being worked on and this GEP will +be updated once we have a design. + +Conceivably, a security or admin team may want to _force_ Gateways to have at least +a minimum TLS version of `1.2` - that would be a job for `overrides`, like so: + +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + overrides: + minimumTLSVersion: 1.2 + targetRef: + name: appns + group: "" + kind: Namespace +``` + +This will make it so that _all Gateways_ in the `appns` namespace _must_ use +a minimum TLS version of `1.2`, and this _cannot_ be changed by Gateway owners. +Only the Policy owner can change this Policy. + +### Handling non-scalar values + +In this example, we will assume that at some future point, HTTPRoute has grown +fields to configure retries, including a field called `retryOn` that reflects +the HTTP status codes that should be retried. The _value_ of this field is a +list of strings, being the HTTP codes that must be retried. 
The `retryOn` field +has no defaults in the field definitions (which is probably a bad design, but we +need to show this interaction somehow!) + +We also assume that an Inherited `RetryOnPolicy` exists that allows both +defaulting and overriding of the `retryOn` field. + +A full `RetryOnPolicy` to default the field to the codes `501`, `502`, and `503` +would look like this: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + defaults: + retryOn: + - "501" + - "502" + - "503" + targetRef: + kind: Gateway + group: gateway.networking.k8s.io + name: we-love-retries +``` + +This means that, for HTTPRoutes that do _NOT_ explicitly set this field to something +else, (in other words, they contain an empty list), then the field will be set to +a list containing `501`, `502`, and `503`. (Notably, because of Go zero values, this +would also occur if the user explicitly set the value to the empty list.) + +However, if a HTTPRoute owner sets any value other than the empty list, then that +value will remain, and the Policy will have _no effect_. These values are _not_ +merged. + +If the Policy used `overrides` instead: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + overrides: + retryOn: + - "501" + - "502" + - "503" + targetRef: + kind: Gateway + group: gateway.networking.k8s.io + name: you-must-retry +``` + +Then no matter what the value is in the HTTPRoute, it will be set to `501`, `502`, +`503` by the Policy override. + +### Interactions between defaults, overrides, and field values + +All HTTPRoutes that attach to the `you-must-retry` Gateway will have any value +_overwritten_ by this policy. The empty list, or any number of values, will all +be replaced with `501`, `502`, and `503`. 
+ +Now, let's also assume that we use the Namespace -> Gateway hierarchy on top of +the Gateway -> HTTPRoute hierarchy, and allow attaching a `RetryOnPolicy` to a +_namespace_. The expectation here is that this will affect all Gateways in a namespace +and all HTTPRoutes that attach to those Gateways. (Note that the HTTPRoutes +themselves may not necessarily be in the same namespace though.) + +If we apply the default policy from earlier to the namespace: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + defaults: + retryOn: + - "501" + - "502" + - "503" + targetRef: + kind: Namespace + group: "" + name: appns +``` + +Then this will have the same effect as applying that Policy to every Gateway in +the `appns` namespace - namely that every HTTPRoute that attaches to every +Gateway will have its `retryOn` field set to `501`, `502`, `503`, _if_ there is no +other setting in the HTTPRoute itself. + +With two layers in the hierarchy, we have a more complicated set of interactions +possible. + +Let's look at some tables for a particular HTTPRoute, assuming that it does _not_ +configure the `retryOn` field, for various types of Policy at different levels. 
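
As a rough illustration of the precedence rules for this Namespace -> Gateway -> HTTPRoute hierarchy, the resolution of the final `retryOn` value could be sketched as below. The function name `effective_retry_on`, the tuple representation of a Policy, and the sample status codes are assumptions made for this example only. Tie-breaking between Policies attached at the same level (creation timestamp, then name) is deliberately left out.

```python
# Levels ordered from least specific to most specific.
HIERARCHY = ["Namespace", "Gateway", "HTTPRoute"]


def effective_retry_on(route_value, policies):
    """Return the retryOn list that would apply to a single HTTPRoute.

    route_value: the list set in the HTTPRoute itself (empty means unset).
    policies: (level, stanza, value) tuples, where stanza is "overrides"
    or "defaults" and level is one of HIERARCHY.
    """
    # Overrides beat everything, and the least specific level wins.
    for level in HIERARCHY:
        for policy_level, stanza, value in policies:
            if policy_level == level and stanza == "overrides":
                return value
    # With no override, a value set in the object beats any default.
    if route_value:
        return route_value
    # Otherwise the most specific default wins.
    for level in reversed(HIERARCHY):
        for policy_level, stanza, value in policies:
            if policy_level == level and stanza == "defaults":
                return value
    return []


namespace_default = ("Namespace", "defaults", ["501", "502", "503"])
gateway_default = ("Gateway", "defaults", ["502"])
namespace_override = ("Namespace", "overrides", ["503"])
route_override = ("HTTPRoute", "overrides", ["500"])

print(effective_retry_on([], [namespace_default, gateway_default]))
print(effective_retry_on(["504"], [namespace_default]))
print(effective_retry_on(["504"], [route_override, namespace_override]))
```

The three calls cover a more specific default beating a less specific one, an explicit HTTPRoute value beating a default, and a less specific override beating both a more specific override and the HTTPRoute's own value.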
+ +#### Overrides interacting with defaults for RetryOnPolicy, empty list in HTTPRoute + +||None|Namespace override|Gateway override|HTTPRoute override| +|----|-----|-----|----|----| +|No default|Empty list|Namespace override| Gateway override| HTTPRoute override| +|Namespace default| Namespace default| Namespace override | Gateway override | HTTPRoute override | +|Gateway default| Gateway default | Namespace override | Gateway override | HTTPRoute override | +|HTTPRoute default| HTTPRoute default | Namespace override | Gateway override | HTTPRoute override| + +#### Overrides interacting with other overrides for RetryOnPolicy, empty list in HTTPRoute +||No override|Namespace override A|Gateway override A|HTTPRoute override A| +|----|-----|-----|----|----| +|No override|Empty list|Namespace override A| Gateway override A| HTTPRoute override A| +|Namespace override B| Namespace override B| Namespace override<br />first created wins<br />otherwise first alphabetically | Namespace override B | Namespace override B| +|Gateway override B| Gateway override B | Namespace override A| Gateway override<br />first created wins<br />otherwise first alphabetically | Gateway override B| +|HTTPRoute override B| HTTPRoute override B | Namespace override A| Gateway override A| HTTPRoute override<br />first created wins<br />otherwise first alphabetically| + +#### Defaults interacting with other defaults for RetryOnPolicy, empty list in HTTPRoute +||No default|Namespace default A|Gateway default A|HTTPRoute default A| +|----|-----|-----|----|----| +|No default|Empty list|Namespace default A| Gateway default A| HTTPRoute default A| +|Namespace default B| Namespace default B| Namespace default<br />first created wins<br />otherwise first alphabetically | Gateway default A | HTTPRoute default A| +|Gateway default B| Gateway default B| Gateway default B| Gateway default<br />first created wins<br />otherwise first alphabetically | HTTPRoute default A| +|HTTPRoute default B| HTTPRoute default B| HTTPRoute default B| HTTPRoute default B| HTTPRoute default<br />first created wins<br />otherwise first alphabetically| + + +Now, if the HTTPRoute _does_ specify a `retryOn` value, +it's a bit easier, because we can basically disregard all defaults: + +#### Overrides interacting with defaults for RetryOnPolicy, value in HTTPRoute + +||None|Namespace override|Gateway override|HTTPRoute override| +|----|-----|-----|----|----| +|No default| Value in HTTPRoute|Namespace override| Gateway override | HTTPRoute override| +|Namespace default| Value in HTTPRoute| Namespace override | Gateway override | HTTPRoute override | +|Gateway default| Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override | +|HTTPRoute default| Value in HTTPRoute | Namespace override | Gateway override | HTTPRoute override| + +#### Overrides interacting with other overrides for RetryOnPolicy, value in HTTPRoute +||No override|Namespace override A|Gateway override A|HTTPRoute override A| +|----|-----|-----|----|----| +|No override|Value in HTTPRoute|Namespace override A| Gateway override A| HTTPRoute override A| +|Namespace override B| Namespace override B| Namespace override<br />first created wins<br />otherwise first alphabetically | Namespace override B| Namespace override B| +|Gateway override B| Gateway override B| Namespace override A| Gateway override<br />first created wins<br />otherwise first alphabetically | Gateway override B| +|HTTPRoute override B| HTTPRoute override B | Namespace override A| Gateway override A| HTTPRoute override<br />first created wins<br />otherwise first alphabetically| + +#### Defaults interacting with other defaults for RetryOnPolicy, value in HTTPRoute +||No default|Namespace default A|Gateway default A|HTTPRoute default A| +|----|-----|-----|----|----| +|No default|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute| +|Namespace default B|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute| +|Gateway default B|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute| +|HTTPRoute default B|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute|Value in HTTPRoute| + + ## Removing BackendPolicy -BackendPolicy represents the initial attempt to cover policy attachment for +BackendPolicy represented the initial attempt to cover policy attachment for Gateway API. Although this proposal ended up with a similar structure to BackendPolicy, it is not clear that we ever found sufficient value or use cases for BackendPolicy. Given that this proposal provides more powerful ways to -attach policy, it makes sense to remove BackendPolicy until we have a better -alternative. +attach policy, BackendPolicy was removed. ## Alternatives diff --git a/site-src/references/policy-attachment.md b/site-src/references/policy-attachment.md index 740a9d717e..b3d1ce48d7 100644 --- a/site-src/references/policy-attachment.md +++ b/site-src/references/policy-attachment.md @@ -1,25 +1,176 @@ -# Policy Attachment +# Metaresources and Policy Attachment -While features like timeouts, retries, and custom health checks are present in -most implementations, their details vary since there are no standards (RFCs) -around them. This makes these features less portable. So instead of pulling -these into the API, we offer a middle ground: a standard way to plug these -features in the API and offer a uniform UX across implementations.
This standard -approach for policy attachment allows implementations to create their own custom -policy resources that can essentially extend Gateway API. +The Gateway API defines a Kubernetes object that _augments_ the behavior of an object +in a standard way as a _Metaresource_. ReferenceGrant +is an example of this general type of metaresource, but it is far from the only +one. -Policies attached to Gateway API resources and implementations must use the -following approach to ensure consistency across implementations of the API. -There are three primary components of this pattern: +This document also defines a concept called _Policy Attachment_, which augments +the behavior of an object to add additional settings that can't be described +within the spec of that object. -* A standardized means of attaching policy to resources. -* Support for configuring both default and override values within policy - resources. -* A hierarchy to illustrate how default and override values should interact. +Why have this class of attachment? Well, while features like timeouts, retries, +and custom health checks are present in most implementations, their details vary +since there are no standards (RFCs) around them. This makes these features less +portable. So instead of pulling these into the API, we offer a middle ground: +a standard way to plug these features in the API and offer a uniform UX across +implementations. This standard approach for policy attachment allows +implementations to create their own custom policy resources that can essentially +extend Gateway API, and have those settings flow across multiple resources (like +attaching a Policy to a Gateway and having the settings affect all HTTPRoutes +attached to that Gateway, for example). -This kind of standardization not only enables consistent patterns, it allows -future tooling such as kubectl plugins to be able to visualize all policies that -have been applied to a given resource. 
+This document defines how we control the creation of configuration in the underlying +Gateway data plane using two types of Policy Attachment. + +A "Policy Attachment" is a specific type of _metaresource_ that can affect specific +settings across either one object (this is "Direct Policy Attachment"), or objects +in a hierarchy (this is "Inherited Policy Attachment"). + +In either case, a Policy may either affect an object by controlling the value +of one of the existing _fields_ in the `spec` of an object, or it may add +additional fields that are _not_ in the `spec` of the object. + +### Direct Policy Attachment + +A Direct Policy Attachment is tightly bound to one instance of a particular +Kind within a single namespace (or to an instance of a single Kind at cluster scope), +and only modifies the behavior of the object that matches its binding. + +As an example, one use case that Gateway API currently does not support is how +to configure details of the TLS required to connect to a backend (in other words, +if the process running inside the backend workload expects TLS, not that some +automated infrastructure layer is provisioning TLS as in the Mesh case). + +A hypothetical TLSConnectionPolicy that targets a Service could be used for this, +using the functionality of the Service as describing a set of endpoints. (It +should also be noted this is not the only way to solve this problem, just an +example to illustrate Direct Policy Attachment.) 
+ +The TLSConnectionPolicy would look something like this: + +```yaml +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: TLSConnectionPolicy +metadata: + name: tlsport8443 + namespace: foo +spec: + targetRef: # This struct is defined as part of Gateway API + group: "" # Empty string means core - this is a standard convention + kind: Service + name: fooService + tls: + certificateAuthorityRefs: + - name: CAcert + port: 8443 + +``` + +All this does is tell an implementation, that for connecting to port `8443` on the +Service `fooService`, it should assume that the connection is TLS, and expect the +service's certificate to be validated by the chain in the `CAcert` Secret. + +Importantly, this would apply to _every_ usage of that Service across any HTTPRoutes +in that namespace, which could be useful for a Service that is reused in a lot of +HTTPRoutes. + +With these two examples in mind, here are some guidelines for when to consider +using Direct Policy Attachment: + +* The number or scope of objects to be modified is limited or singular. Direct + Policy Attachments must target one specific object. +* The modifications to be made to the objects don’t have any transitive information - + that is, the modifications only affect the single object that the targeted + metaresource is bound to, and don’t have ramifications that flow beyond that + object. +* In terms of status, it should be reasonably easy for a user to understand that + everything is working - basically, as long as the targeted object exists, and + the modifications are valid, the metaresource is valid, and this should be + straightforward to communicate in one or two Conditions. Note that at the time + of writing, this is *not* completed. +* Direct Policy Attachment _should_ only be used to target objects in the same + namespace as the Policy object. Allowing cross-namespace references brings in + significant security concerns, and/or difficulties about merging cross-namespace + policy objects. 
Notably, Mesh use cases may need to do something like this for + consumer policies, but in general, Policy objects that modify the behavior of + things outside their own namespace should be avoided unless they use a handshake + of some sort, where the things outside the namespace can opt out of the behavior. + (Notably, this is the design that we used for ReferenceGrant). + +### Inherited Policy Attachment: It's all about the defaults and overrides + +Because an Inherited Policy is a metaresource, it targets some other resource +and _augments_ its behavior. + +But why have this distinct from other types of metaresource? Because Inherited +Policy resources are designed to have a way for settings to flow down a hierarchy. + +Defaults set the default value for something, and can be overridden by the +“lower” objects (like a connection timeout default policy on a Gateway being +overridable inside an HTTPRoute), and Overrides cannot be overridden by “lower” +objects (like setting a maximum client timeout to some non-infinite value at the +Gateway level to stop HTTPRoute owners from leaking connections over time). + +Here are some guidelines for when to consider using an Inherited Policy object: + +* The settings or configuration are bound to one containing object, but affect + other objects attached to that one (for example, affecting HTTPRoutes attached + to a single Gateway, or all HTTPRoutes in a GatewayClass). +* The settings need to be able to be defaulted, but can be overridden on a per-object + basis. +* The settings must be enforced by one persona, and not modifiable or removable + by a lesser-privileged persona. (The owner of a GatewayClass may want to restrict + something about all Gateways in a GatewayClass, regardless of who owns the Gateway, + or a Gateway owner may want to enforce some setting across all attached HTTPRoutes).
+* In terms of status, recording that the Policy is attached is easy, but + recording what resources the Policy is being applied to + is not, and needs to be carefully designed to avoid fanout apiserver load. + (This is not built at all in the current design either). + +When multiple Inherited Policies are used, they can interact in various ways, +which are governed by the following rules, which will be expanded on later +in this document. + +* If a Policy does not affect an object's fields directly, then the resultant + Policy should be the set of all distinct fields inside the relevant Policy objects, + as set out by the rules below. +* For Policies that affect an object's existing fields, multiple instances of the + same Policy Kind affecting an object's fields will be evaluated as + though only a single Policy "wins" the right to affect each field. This operation + is performed on a _per-distinct-field_ basis. +* Settings in `overrides` stanzas will win over the same setting in a `defaults` + stanza. +* `overrides` settings operate in a "less specific beats more specific" fashion - + Policies attached _higher_ up the hierarchy will beat the same type of Policy + attached further down the hierarchy. +* `defaults` settings operate in a "more specific beats less specific" fashion - + Policies attached _lower down_ the hierarchy will beat the same type of Policy + attached further _up_ the hierarchy. +* For `defaults`, the _most specific_ value is the one _inside the object_ that + the Policy applies to; that is, if a Policy specifies a `default`, and an object + specifies a value, the _object's_ value will win. +* Policies interact with the fields they are controlling in a "replace value" + fashion. + * Fields where the value is a scalar (like a string or a number) + should have their value _replaced_ by the value in the Policy if it wins.
+ Notably, this means that a `default` will only ever replace an empty or unset + value in an object. + * For fields where the value is an object, the Policy should include the fields + in the object in its definition, so that the replacement can be on simple fields + rather than complex ones. + * For fields where the final value is non-scalar, but is not an _object_ with + fields of its own, the value should be entirely replaced, _not_ merged. This + means that lists of strings or lists of ints specified in a Policy will overwrite + the empty list (in the case of a `default`) or any specified list (in the case + of an `override`). The same applies to `map[string]string` fields. An example + here would be a field that stores a map of annotations - specifying a Policy + that overrides annotations will mean that a final object specifying those + annotations will have its value _entirely replaced_ by an `override` setting. +* In the case that two Policies of the same type specify different fields, then + _all_ of the specified fields should take effect on the affected object. + +Examples to further illustrate these rules are given below. ## Policy Attachment for Ingress Attaching policy to Gateway resources for ingress use cases is relatively @@ -69,9 +220,10 @@ struct included in the Gateway API. Where possible, it is recommended to use that struct directly instead of duplicating the type. ### Policy Boilerplate -The following structure MUST be used as for any Policy resource using this API -pattern. Within the spec, policy resources may omit `Override` or `Default` -fields, but at least one of them MUST be present. +The following (or something like it) SHOULD be used for any Policy resource using this API +pattern. Within the spec, policy resources that omit both `Override` and `Default` +fields are defined as Direct Policy Attachment, and Inherited Policy Attachment must include +one or both.
```go // ACMEServicePolicy provides a way to apply Service policy configuration with @@ -120,7 +272,7 @@ type ACMEServicePolicyStatus struct { ``` ### Hierarchy -Each policy MAY include default or override values. Overrides enable admins to +Each Inherited policy MUST include default and/or override values. Overrides enable admins to enforce policy from the top down. Defaults enable app owners to provide default values from the bottom up for each individual application. @@ -140,7 +292,7 @@ precedence over Routes and Services below it. On the other hand, an app owner may want to set a default timeout for their Service. That would have precedence over defaults attached at higher levels such as Route or Gateway. -If using defaults and overrides, each policy resource MUST include 2 structs +If using defaults _and_ overrides, each policy resource MUST include 2 structs within the spec. One with override values and the other with default values. In the following example, the policy attached to the Gateway requires cdn to @@ -179,7 +331,16 @@ precedence over the default `drainTimeout` value attached to the Route. At the same time, we can see that the default `connectionTimeout` attached to the Route has precedence over the default attached to the Gateway. -![Hierarchical Policy Example](images/policy-hierarchy.png) +Also note how the different resources interact - fields that are not common across +objects _may_ both end up affecting the final object. + +![Inherited Policy Example](images/policy-hierarchy.png) + +#### Supported Resources +It is important to note that not every implementation will be able to support +policy attachment to each resource described in the hierarchy above. When that +is the case, implementations MUST clearly document which resources a policy may +be attached to. #### Attaching Policy to GatewayClass GatewayClass may be the trickiest resource to attach policy to. 
Policy @@ -204,60 +365,104 @@ kind: RetryPolicy metadata: name: foo spec: - default: - maxRetries: 5 + maxRetries: 5 targetRef: group: networking.example.net kind: ExternalService name: foo.com ``` +Because this CRD does _not_ have a `defaults` or `overrides` section, it is +a Direct Policy Attachment. + +### Merging into existing `spec` fields + +It's possible (even likely) that configuration in a Policy may need to be merged +into an existing object's fields somehow, particularly for Inherited policies. + +When merging into existing fields inside an object, Policy objects should +merge values at a scalar level, not at a struct or object level. + +For example, in the `CDNCachingPolicy` example above, the `cdn` struct contains +a `cachePolicy` struct that contains fields. If an implementation were merging +this configuration into an existing object that contained the same fields, it +should merge the fields at a scalar level, with the `includeHost`, +`includeProtocol`, and `includeQueryString` values being defaulted if they were +not specified in the object being controlled. Similarly, for `overrides`, the +values of the innermost scalar fields should overwrite the scalar fields in the +affected object. + +Implementations should not copy any structs from the Policy object directly into the +affected object; any fields that _are_ overridden should be overridden on a per-field +basis. + +In the case that the field in the Policy affects a struct that is a member of a list, +each existing item in the list in the affected object should have each of its +fields compared to the corresponding fields in the Policy. + +For non-scalar field _values_, like a list of strings, or a `map[string]string` +value, the _entire value_ must be overwritten by the value from the Policy. No +merging should take place. This mainly applies to `overrides`, since for +`defaults`, there should be no value present in a field on the final object.
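The "replace, do not merge" rule for non-scalar values can be sketched in Go as follows. This is a minimal illustration, not part of the proposed API: the function names `applyDefault` and `applyOverride` are hypothetical, and the sketch assumes the field value is a plain list of strings.

```go
package main

import "fmt"

// applyDefault is a hypothetical helper: it returns a copy of the Policy's
// default list only when the object's own list is empty or unset. A populated
// list on the object is kept untouched; the two lists are never merged.
func applyDefault(objectValue, policyDefault []string) []string {
	if len(objectValue) == 0 {
		return append([]string(nil), policyDefault...)
	}
	return objectValue
}

// applyOverride is a hypothetical helper: it always replaces the object's
// list with a copy of the Policy's list, whatever the object contained.
func applyOverride(objectValue, policyOverride []string) []string {
	return append([]string(nil), policyOverride...)
}

func main() {
	fmt.Println(applyOverride([]string{"a", "b"}, []string{"c", "d"}))
	fmt.Println(applyDefault([]string{"a", "b"}, []string{"c", "d"}))
	fmt.Println(applyDefault(nil, []string{"c", "d"}))
}
```

The same whole-value replacement would apply to a `map[string]string` field; only scalar fields nested inside structs are handled field by field.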
+ +This table shows how this works for various types: + +|Type|Object config|Override Policy config|Result| +|----|-------------|----------------------|------| +|string| `key: "foo"` | `key: "bar"` | `key: "bar"` | +|list| `key: ["a","b"]` | `key: ["c","d"]` | `key: ["c","d"]` | +|`map[string]string`| `key: {"foo": "a", "bar": "b"}` | `key: {"foo": "c", "bar": "d"}` | `key: {"foo": "c", "bar": "d"}` | + + ### Conflict Resolution -It is possible for multiple policies to target the same resource. When this -happens, merging is the preferred outcome. If multiple policy resources target +It is possible for multiple policies to target the same object _and_ the same +fields inside that object. If multiple policy resources target the same resource _and_ have an identical field specified with different values, precedence MUST be determined in order of the following criteria, continuing on ties: +* Direct Policies should never overlap Inherited Policies. If preventing settings from + being overwritten is important, implementations should only use Inherited + Policies, and the `override` stanza that implies. +* Inside Inherited Policies, the same setting in `overrides` beats the one in + `defaults`. * The oldest Policy based on creation timestamp. For example, a Policy with a creation timestamp of "2021-07-15 01:02:03" is given precedence over a Policy with a creation timestamp of "2021-07-15 01:02:04". -* The Policy appearing first in alphabetical order by "{namespace}/{name}". For +* The Policy appearing first in alphabetical order by `{namespace}/{name}`. For example, foo/bar is given precedence over foo/baz. -### Kubectl Plugin -To help improve UX and standardization, a kubectl plugin will be developed that -will be capable of describing the computed sum of policy that applies to a given -resource, including policies applied to parent resources. 
- -Each Policy CRD that wants to be supported by this plugin will need to follow -the API structure defined above and add a -`gateway.networking.k8s.io/policy-attachment: ""` label to the CRD. +For a better user experience, a validating webhook can be implemented to prevent +these kinds of conflicts altogether. ### Status -In the future, we may consider adding a new `Policies` field to status on -Gateways and Routes. This would be a list of `PolicyTargetReference` structs -with the fields instead used to refer to the Policy resource that has been -applied. -Unfortunately, this may create more confusion than it is worth, here are some of -the key concerns: +In the current iteration of this design, metaresources and Policy objects don't +have any standard way to record what they're attaching to, or applying settings +to in the case of Policy Attachment. Previous experience in the Kubernetes API +has made it clear that having a single object that can cause status updates to +occur across many other objects can have a big performance impact, so the status +design must be very carefully done to avoid this kind of fanout problem. + +However, the whole purpose of having a standardized Policy API structure and +patterns is to make this problem solvable both for human users and with +tooling. + +This is currently a _very_ open question. A discussion is ongoing at +[#1531](https://github.com/kubernetes-sigs/gateway-api/discussions/1531), and this +GEP will be updated with any outcomes. + +Some key concerns that we need to solve for status: * When multiple controllers are implementing the same Route and recognize a - policy, it would be difficult to determine which controller should be + policy, it must be possible to determine which controller was responsible for adding that policy reference to status. -* For this to be somewhat scalable, we'd need to limit the status entries to - policies that had been directly applied to the resource.
This could get - confusing as it would not provide any insight into policies attached above or - below. +* For this to be somewhat scalable, we must limit the number of status updates + that can result from a metaresource update. * Since we only control some of the resources a policy might be attached to, adding policies to status would only be possible on Gateway API resources, not Services or other kinds of backends. -Although these concerns are not unsolvable, they lead to the conclusion that -a Kubectl plugin should be our primary approach to providing visibility here, -with a possibility of adding policies to status at a later point. - ### Interaction with Custom Route Filters Both Policy attachment and custom Route filters provide ways to extend Gateway API. Although similar in nature, they have slightly different purposes. @@ -267,7 +472,7 @@ middleware embedded inside Route rules or backend references. Policy attachment is more broad in scope. In contrast with filters, policies can be attached to a wide variety of Gateway API resources, and include a concept of -hierarchical defaulting and overrides. Although Policy attachment can be used to +inherited defaulting and overrides. Although Policy attachment can be used to target an entire Route or Backend, it cannot currently be used to target specific Route rules or backend references. If there are sufficient use cases for this, policy attachment may be expanded in the future to support this fine @@ -284,17 +489,17 @@ standardization and, over time, to absorb capabilities into the API as first class fields, which offer a more streamlined UX than custom policy attachment. -#### 2. Custom filters and policies should not overlap +#### 2. 
Custom filters and policies should only overlap if necessary Although it's possible that arbitrary fields could be supported by custom policy, custom route filters, and core/extended fields concurrently, it is -strongly recommended that implementations not use multiple mechanisms for -representing the same fields. A given field should only be supported through a -single extension method. An example of potential conflict is policy precedence -and structured hierarchy, which only applies to custom policies. Allowing a -field to exist in custom policies and also other areas of the API, which are not -part of the structured hierarchy, breaks the precedence model. Note that this -guidance may change in the future as we gain a better understanding of how -extension mechanisms of the Gateway API can interoperate. +recommended that implementations only use multiple mechanisms for +representing the same fields when those fields really _need_ the defaulting +and/or overriding behavior that Policy Attachment provides. For example, a +custom filter that allowed the configuration of Authentication inside a +HTTPRoute object might also have an associated Policy resource that allowed +the filter's settings to be defaulted or overridden. It should be noted that +doing this in the absence of a solution to the status problem is likely to +be *very* difficult to troubleshoot. ### Conformance Level This policy attachment pattern is associated with an "EXTENDED" conformance @@ -303,3 +508,241 @@ the same behavior and semantics, although they may not be able to support attachment of all types of policy at all potential attachment points. When that is the case, implementations MUST clearly document which resources a policy may be attached to. + +## Examples + +This section provides some examples of various types of Policy objects, and how +merging, `defaults`, `overrides`, and other interactions work. 
+ +### Direct Policy Attachment + +The following Policy sets the minimum TLS version required on a Gateway Listener: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + minimumTLSVersion: 1.2 + targetRef: + name: internet + group: gateway.networking.k8s.io + kind: Gateway +``` + +Note that because there is no field controlling the minimum TLS version in the +Gateway `spec`, this is an example of a Policy that affects fields that aren't +represented in the object. + +### Inherited Policy Attachment + +It could also be useful to be able to _default_ the `minimumTLSVersion` setting +across multiple Gateways. + +This version of the above Policy allows this: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + defaults: + minimumTLSVersion: 1.2 + targetRef: + name: appns + group: "" + kind: Namespace +``` + +This Inherited Policy is using the implicit hierarchy that all resources belong +to a namespace, so attaching a Policy to a namespace means affecting all possible +resources in a namespace. Multiple hierarchies are possible, even within Gateway +API, for example Gateway -> Route, Gateway -> Route -> Backend, Gateway -> Route +-> Service. GAMMA Policies could conceivably use a hierarchy of Service -> Route +as well. + +Note that this will not be very discoverable for Gateway owners in the absence of +a solution to the Policy status problem.
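The difference between the two `TLSMinimumVersionPolicy` variants comes down to whether the spec carries a `defaults` or `overrides` stanza. The following Go sketch shows one way a controller might classify a Policy on that basis. The `Settings` and `PolicySpec` types and the `attachmentType` function are hypothetical simplifications for illustration, not types defined by Gateway API.

```go
package main

import "fmt"

// Settings is a hypothetical stand-in for the fields a Policy carries.
type Settings struct {
	MinimumTLSVersion string
}

// PolicySpec is a hypothetical, simplified Policy spec: settings may sit at
// the top level, or inside a defaults and/or overrides stanza.
type PolicySpec struct {
	Settings
	Defaults  *Settings
	Overrides *Settings
}

// attachmentType reports "Inherited" when either stanza is present and
// "Direct" when both are omitted.
func attachmentType(spec PolicySpec) string {
	if spec.Defaults != nil || spec.Overrides != nil {
		return "Inherited"
	}
	return "Direct"
}

func main() {
	direct := PolicySpec{Settings: Settings{MinimumTLSVersion: "1.2"}}
	inherited := PolicySpec{Defaults: &Settings{MinimumTLSVersion: "1.2"}}
	fmt.Println(attachmentType(direct))
	fmt.Println(attachmentType(inherited))
}
```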
+ +Conceivably, a security or admin team may want to _force_ Gateways to have at least +a minimum TLS version of `1.2` - that would be a job for `overrides`, like so: + +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: TLSMinimumVersionPolicy +metadata: + name: minimum12 + namespace: appns +spec: + overrides: + minimumTLSVersion: 1.2 + targetRef: + name: appns + group: "" + kind: Namespace +``` + +This will make it so that _all Gateways_ in the `appns` namespace _must_ use +a minimum TLS version of `1.2`, and this _cannot_ be changed by Gateway owners. +Only the Policy owner can change this Policy. + +### Handling non-scalar values + +In this example, we will assume that at some future point, HTTPRoute has grown +a Filter to configure retries (`RetryFilter`), including a field called `retryOn` +that reflects the HTTP status codes that should be retried. The _value_ of this +field is a list of strings, being the HTTP codes that must be retried. The `retryOn` +field has no defaults in the field definitions (which is probably a bad design, +but we need to show this interaction somehow!) + +We also assume that an Inherited `RetryOnPolicy` exists that allows both +defaulting and overriding of the `retryOn` field in the `RetryFilter`. + +A full `RetryOnPolicy` to default the field to the codes `501`, `502`, and `503` +would look like this: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + defaults: + retryOn: + - "501" + - "502" + - "503" + targetRef: + kind: Gateway + group: gateway.networking.k8s.io + name: we-love-retries +``` + +This means that, for HTTPRoutes that use the `RetryFilter` and do _NOT_ explicitly set this field to something +else (in other words, they contain an empty list), the field will be set to +a list containing `501`, `502`, and `503`.
(Notably, because of Go zero values, this +would also occur if the user explicitly set the value to the empty list.) + +However, if an HTTPRoute owner sets any value other than the empty list in the filter, then that +value will remain, and the Policy will have _no effect_. These values are _not_ +merged. + +If the Policy used `overrides` instead: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + overrides: + retryOn: + - "501" + - "503" + targetRef: + kind: Gateway + group: gateway.networking.k8s.io + name: you-must-retry +``` + +Then no matter what the value is in the filter, it will be set to `501`, `503` +by the Policy override. + +### Interactions between defaults, overrides, and field values + +All HTTPRoutes that attach to the `you-must-retry` Gateway and use a `RetryFilter` +will have any value _overwritten_ by this policy. The empty list, or any number +of values, will all be replaced with `501` and `503`. + +Now, let's also assume that we use the Namespace -> Gateway hierarchy on top of +the Gateway -> HTTPRoute hierarchy, and allow attaching a `RetryOnPolicy` to a +_namespace_. The expectation here is that this will affect all Gateways in a namespace +and all HTTPRoutes that use the `RetryFilter` and attach to those Gateways. +(Note that the HTTPRoutes themselves may not necessarily be in the same namespace though.)
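The precedence rules for `retryOn` across the Namespace -> Gateway -> HTTPRoute hierarchy can be sketched in Go as follows. This is an illustrative sketch under stated assumptions: the `Level` and `Attached` types and the `effectiveRetryOn` function are hypothetical, at most one Policy is assumed per level (so the creation-timestamp and alphabetical tie-breakers are out of scope), and values are replaced whole, never merged.

```go
package main

import "fmt"

// Level is a position in the assumed Namespace -> Gateway -> HTTPRoute hierarchy.
type Level int

const (
	Namespace Level = iota // least specific
	Gateway
	HTTPRoute // most specific
)

// Attached is a hypothetical, flattened view of one RetryOnPolicy.
type Attached struct {
	Level     Level
	Defaults  []string // nil when the Policy has no defaults stanza
	Overrides []string // nil when the Policy has no overrides stanza
}

// effectiveRetryOn resolves the final retryOn list for one RetryFilter.
func effectiveRetryOn(filterValue []string, policies []Attached) []string {
	// Overrides beat everything; the least specific override wins,
	// so scan from Namespace down to HTTPRoute.
	for lvl := Namespace; lvl <= HTTPRoute; lvl++ {
		for _, p := range policies {
			if p.Level == lvl && p.Overrides != nil {
				return p.Overrides
			}
		}
	}
	// A value set in the filter itself is more specific than any default.
	if len(filterValue) > 0 {
		return filterValue
	}
	// The most specific default wins, so scan from HTTPRoute up to Namespace.
	for lvl := HTTPRoute; lvl >= Namespace; lvl-- {
		for _, p := range policies {
			if p.Level == lvl && p.Defaults != nil {
				return p.Defaults
			}
		}
	}
	return filterValue
}

func main() {
	policies := []Attached{
		{Level: Namespace, Defaults: []string{"501", "502", "503"}},
		{Level: Gateway, Overrides: []string{"501", "503"}},
	}
	fmt.Println(effectiveRetryOn([]string{"500"}, policies))     // Gateway override wins
	fmt.Println(effectiveRetryOn(nil, policies[:1]))             // Namespace default fills the empty list
	fmt.Println(effectiveRetryOn([]string{"500"}, policies[:1])) // filter value beats the default
}
```

Scanning overrides top-down and defaults bottom-up keeps the two directions of precedence visibly separate, which mirrors how the rules are stated in this document.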
+ +If we apply the default policy from earlier to the namespace: +```yaml +apiVersion: networking.example.io/v1alpha1 +kind: RetryOnPolicy +metadata: + name: retryon5xx + namespace: appns +spec: + defaults: + retryOn: + - "501" + - "502" + - "503" + targetRef: + kind: Namespace + group: "" + name: appns +``` + +Then this will have the same effect as applying that Policy to every Gateway in +the `appns` namespace - namely that every HTTPRoute that attaches to any of those +Gateways will have its `retryOn` field in the `RetryFilter` set to `501`, `502`, `503`, +_if_ there is no other setting in the `RetryFilter` itself. + +With two layers in the hierarchy, we have a more complicated set of interactions +possible. + +Let's look at some tables for a particular HTTPRoute, assuming that it does _not_ +configure the `retryOn` field, but _does_ configure a `RetryFilter`, for various +types of Policy at different levels. + +#### Overrides interacting with defaults for RetryOnPolicy, empty list in RetryFilter + +||None|Namespace override|Gateway override|HTTPRoute override| +|----|-----|-----|----|----| +|No default|Empty list|Namespace override| Gateway override| HTTPRoute override| +|Namespace default| Namespace default| Namespace override | Gateway override | HTTPRoute override | +|Gateway default| Gateway default | Namespace override | Gateway override | HTTPRoute override | +|HTTPRoute default| HTTPRoute default | Namespace override | Gateway override | HTTPRoute override| + +#### Overrides interacting with other overrides for RetryOnPolicy, empty list in RetryFilter +||No override|Namespace override A|Gateway override A|HTTPRoute override A| +|----|-----|-----|----|----| +|No override|Empty list|Namespace override A| Gateway override A| HTTPRoute override A| +|Namespace override B| Namespace override B| Namespace override<br />first created wins<br />otherwise first alphabetically | Namespace override B | Namespace override B| +|Gateway override B| Gateway override B | Namespace override A| Gateway override<br />first created wins<br />otherwise first alphabetically | Gateway override B| +|HTTPRoute override B| HTTPRoute override B | Namespace override A| Gateway override A| HTTPRoute override<br />first created wins<br />otherwise first alphabetically| + +#### Defaults interacting with other defaults for RetryOnPolicy, empty list in RetryFilter +||No default|Namespace default A|Gateway default A|HTTPRoute default A| +|----|-----|-----|----|----| +|No default|Empty list|Namespace default A| Gateway default A| HTTPRoute default A| +|Namespace default B| Namespace default B| Namespace default<br />first created wins<br />otherwise first alphabetically | Gateway default A | HTTPRoute default A| +|Gateway default B| Gateway default B| Gateway default B| Gateway default<br />first created wins<br />otherwise first alphabetically | HTTPRoute default A| +|HTTPRoute default B| HTTPRoute default B| HTTPRoute default B| HTTPRoute default B| HTTPRoute default<br />first created wins<br />otherwise first alphabetically| + + +Now, if the HTTPRoute _does_ specify a value in its `RetryFilter`, +it's a bit easier, because we can basically disregard all defaults: + +#### Overrides interacting with defaults for RetryOnPolicy, value in RetryFilter + +||None|Namespace override|Gateway override|HTTPRoute override| +|----|-----|-----|----|----| +|No default| Value in RetryFilter|Namespace override| Gateway override | HTTPRoute override| +|Namespace default| Value in RetryFilter| Namespace override | Gateway override | HTTPRoute override | +|Gateway default| Value in RetryFilter | Namespace override | Gateway override | HTTPRoute override | +|HTTPRoute default| Value in RetryFilter | Namespace override | Gateway override | HTTPRoute override| + +#### Overrides interacting with other overrides for RetryOnPolicy, value in RetryFilter +||No override|Namespace override A|Gateway override A|HTTPRoute override A| +|----|-----|-----|----|----| +|No override|Value in RetryFilter|Namespace override A| Gateway override A| HTTPRoute override A| +|Namespace override B| Namespace override B| Namespace override
first created wins
otherwise first alphabetically | Namespace override B| Namespace override B| +|Gateway override B| Gateway override B| Namespace override A| Gateway override
first created wins
otherwise first alphabetically | Gateway override B| +|HTTPRoute override B| HTTPRoute override B | Namespace override A| Gateway override A| HTTPRoute override
first created wins
otherwise first alphabetically| + +#### Defaults interacting with other defaults for RetryOnPolicy, value in RetryFilter +||No default|Namespace default A|Gateway default A|HTTPRoute default A| +|----|-----|-----|----|----| +|No default|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter| +|Namespace default B|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter| +|Gateway default B|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter| +|HTTPRoute default B|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter|Value in RetryFilter|
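The precedence rules that the tables above encode can be summarized in a few lines of code. The following is a minimal sketch and not part of the Gateway API: the `Policy` class, its field names, the integer `created` timestamp, and the `effective_retry_on` function are all hypothetical, and "first alphabetically" is assumed here to mean ordering by policy name. It shows overrides being resolved from the top of the hierarchy down, then the value in the `RetryFilter`, then defaults from the most specific level up.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hierarchy levels, ordered from least specific to most specific.
LEVELS = ["Namespace", "Gateway", "HTTPRoute"]


@dataclass
class Policy:
    """Hypothetical, simplified stand-in for an attached RetryOnPolicy."""
    name: str
    level: str           # one of LEVELS: the kind of object the policy targets
    created: int         # creation timestamp; a smaller value means older
    retry_on: List[str]
    is_override: bool = False  # True for `overrides`, False for `defaults`


def pick(policies: List[Policy], level: str) -> Optional[Policy]:
    """Return the winning policy at one level, or None if there is none.

    Conflicts at the same level: first created wins, otherwise first
    alphabetically (assumed to mean by name).
    """
    at_level = [p for p in policies if p.level == level]
    if not at_level:
        return None
    return min(at_level, key=lambda p: (p.created, p.name))


def effective_retry_on(filter_value: List[str], policies: List[Policy]) -> List[str]:
    overrides = [p for p in policies if p.is_override]
    defaults = [p for p in policies if not p.is_override]

    # 1. Overrides always win; the least specific level takes precedence.
    for level in LEVELS:
        winner = pick(overrides, level)
        if winner is not None:
            return winner.retry_on

    # 2. A non-empty value in the RetryFilter beats every default.
    #    The empty list is treated the same as "unset".
    if filter_value:
        return filter_value

    # 3. Defaults apply last; the most specific level takes precedence.
    for level in reversed(LEVELS):
        winner = pick(defaults, level)
        if winner is not None:
            return winner.retry_on

    # 4. Nothing applies: the field stays empty.
    return []
```

For example, calling `effective_retry_on` with an empty filter value, a Namespace default, and a Gateway default returns the Gateway default's list, matching the "Gateway default B" row of the defaults table. Adding any override to the same call makes that override's list the result regardless of the filter value.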