
[WIP] KEP-3903: Unknown Version Interoperability Proxy #3903

Closed · wants to merge 14 commits

Conversation

@lavalamp (Member) commented Mar 9, 2023

(not at all ready for review, just getting this started)

  • One-line PR description: Adds a KEP.
  • Issue link:
  • Other comments:

@k8s-ci-robot added the do-not-merge/work-in-progress and cncf-cla: yes labels (Mar 9, 2023)
@k8s-ci-robot requested a review from deads2k (March 9, 2023, 23:18)
@k8s-ci-robot added the kind/kep label (Mar 9, 2023)
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot requested a review from jpbetz (March 9, 2023, 23:18)
@k8s-ci-robot added the sig/api-machinery, approved, and size/XL labels (Mar 9, 2023)
@lavalamp changed the title from "[WIP] KEP-NNNN: Unknown Version Interoperability Proxy" to "[WIP] KEP-3903: Unknown Version Interoperability Proxy" (Mar 9, 2023)
## Proposal

API changes:
* To the apiservices API, add an "alternates" clause, a list of

Member:

I prefer a "serviceable" or "serviceableBy" clause as opposed to "alternates".
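
For illustration, a minimal sketch of what such a clause could look like as a Go API field. The field name and placement here are assumptions (the naming is still being debated in this thread), not the KEP's final API:

```go
// Hypothetical sketch only: the field name ("alternates" vs. "serviceableBy")
// and its exact shape are still under discussion above.
package v1

// APIServiceSpecFragment shows only the proposed addition, not the full spec.
type APIServiceSpecFragment struct {
	// ServiceableBy would list the identities of the apiservers that are able
	// to serve this API, so the aggregation layer knows where it can proxy
	// requests that the local apiserver cannot handle itself.
	ServiceableBy []string `json:"serviceableBy,omitempty"`
}
```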

## Summary

When a cluster has multiple apiservers at mixed versions (such as during an
upgrade or downgrate), not every apiserver can serve every resource at every

Member:

Suggested change:
- upgrade or downgrate), not every apiserver can serve every resource at every
+ upgrade or downgrade), not every apiserver can serve every resource at every


* Ensure discovery reports the same set of resources everywhere (not just group
versions, as it does today)
* Proxy client traffic to an apiserver that can service the request

Member:

Have you considered instead disabling resources which are not served by all API servers?

Member:

Then you end up with an intersection of the resources of both sets of apiservers, which is still going to lead to wonky behavior from the GC and NLC controllers.

By routing all discovery requests to the newest apiserver, we can ensure that namespace and gc
controllers do what they would be doing if the upgrade happened instantaneously.

Alternatively, we can use the storage version objects to reconstruct a merged discovery

lavalamp (PR author):

@alexzielenski @apelisse @Jefftree Thoughts on this?

apelisse (Member), Mar 27, 2023:

IIUC, each aggregator will have to know all the resources, even those that aren't handled by its own apiserver (so that it can proxy them somewhere else). So I think it needs to have its own synced representation of what is available, and it should be able to serve that representation in the form of a discovery document.

Member:

Just because we can, doesn't mean we should. Yes, we should be able to serve a synced representation of what resources are available, but what does supporting this intermediate state actually buy us?
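
As a rough illustration of the "merged discovery" idea being debated here (a sketch with made-up types, not the aggregator's actual implementation), the merge amounts to a union of the resource lists reported by each apiserver:

```go
package main

import "fmt"

// gvr is a minimal stand-in for a group/version/resource key; the real
// aggregator would use schema.GroupVersionResource and richer discovery types.
type gvr struct{ group, version, resource string }

// mergeDiscovery unions the resources reported by each apiserver, so the
// aggregation layer can advertise (and proxy) everything that at least one
// member of the control plane serves.
func mergeDiscovery(perServer ...[]gvr) []gvr {
	seen := map[gvr]bool{}
	var merged []gvr
	for _, resources := range perServer {
		for _, r := range resources {
			if !seen[r] {
				seen[r] = true
				merged = append(merged, r)
			}
		}
	}
	return merged
}

func main() {
	oldServer := []gvr{{"apps", "v1", "deployments"}, {"flowcontrol.apiserver.k8s.io", "v1beta2", "flowschemas"}}
	newServer := []gvr{{"apps", "v1", "deployments"}, {"flowcontrol.apiserver.k8s.io", "v1beta3", "flowschemas"}}
	fmt.Println(mergeDiscovery(oldServer, newServer))
}
```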


### Risks and Mitigations

jpbetz (Contributor), Mar 27, 2023:

In a cluster of N apiservers under balanced load, proxying requests to a single apiserver could increase the load by a factor of N (worst case)? Is it reasonable to expect 1 apiserver will be able to serve the load previously served by N apiservers? Could this make upgrades more dangerous?

If so, should "Risks and Mitigations" include a section about this?

lavalamp (PR author):

yes, it's coming.

The answer is that heavily trafficked resources will not be added or removed.

logicalhan (Member), Mar 28, 2023:

I mean, there are zonal clusters that handle loads which vary from very small to very big, so why would we expect a single apiserver instance not to be able to handle the load?

jpbetz (Contributor), Mar 28, 2023:

Yeah, I think this is probably safe. It's just good to write some reasoning down. My thinking was that you'll maybe have some new APIs in an upgrade but clients won't be using them yet during the upgrade. During downgrade, maybe some clients will be using the APIs, but the vast majority of traffic is for things that can be served by all API servers. Combined with the fact that you don't lose any etcd capacity and you're required to over-provision to handle N-1 apiservers, I think this is OK.

Member:

> so why would we expect a single apiserver instance not to be able to handle the load?

We definitely have clusters in the fleet where a single instance can't handle the load.

That said, I agree with Daniel here: the answer that we shouldn't be removing heavily-loaded resources is a reasonable one, and I think it's already true in general.
The rollback is an interesting case that should be mentioned explicitly, though (but it should generally also be true, because the same problem that appears during rollback can also appear during upgrade).

lavalamp (PR author):

Yeah, I think the logic is symmetrical -- if you do a rollback that removes a now-heavily-trafficked API, you're gonna have a very bad time; this is already true today IMO

Member:

> We definitely have clusters in the fleet where a single instance can't handle the load.

Those should be pretty exceptional cases. For most clusters, I would imagine most of the traffic coming from CM and scheduler, which are leader elected and don't have any anti-affinity rules by default, so a single apiserver should normally be able to handle that sort of traffic.


The garbage collector makes decisions about deleting objects when all
referencing objects are deleted. A discovery gap / apiserver mismatch, as
described above, could result in GC seeing a 404 and assuming an object has been

Member:

If we make the 404 a 503, then this point is moot.

lavalamp (PR author):

I updated the goals and proposal section to make it clearer why this doesn't work.

Member:

Yeah, but even UVIP as currently stated doesn't really solve the problem, because the case where the apiserver is able to act on resources which end up disappearing in the newer version is kinda moot without the transition hooks which allow us to clean up our resource mess once the upgrade completes. And, in that same situation, the actions that the apiserver takes on those resources may not even make sense once the upgrade completes, because those resources will essentially be orphaned.

Therefore my contention is that 503ing here is not any less correct than letting the apiserver act on old resources which will then disappear once the upgrade completes, unless we also have the ability to transition between resources being enabled/disabled.

lavalamp (PR author):

As stated I think this solves GC and namespace lifecycle controller, and doing less than this proposal would not (see all the paragraphs I added in the most recent commit).

It also enables a hypothetical clear-before-resource-removal controller, but that's not a story because it's not necessary to motivate any of the changes proposed here. GC and NLC already exercise the cases.

Comment on lines +215 to +219
* Note that merely serving 503s at the right times does not solve the problem,
for two reasons: controllers might get an incomplete discovery and therefore
not ask about all the correct resources; and when they get 503 responses,
although the controller can avoid doing something destructive, it also can't
make progress and is stuck for the duration of the upgrade.

Member:

If a controller is trying to make progress, it falls into one of two camps:

  1. it's trying to actuate on a resource which is removed in a newer version. In that case, whatever it does (besides deletions) is going to end up moot, because those objects will be orphaned once the upgrade completes.

  2. it's trying to actuate on a resource which was just introduced in the newer version. In that case, if we are routing to the newest apiserver, this just actually does the right thing.

So really, we're actually only talking about the first scenario and those resources are going to get orphaned, regardless of whether or not we end up merging the resources for discovery.

lavalamp (PR author):

I don't agree for case 1, the controller could be trying to delete the object in which case we want it to succeed.

lavalamp (PR author):

and anyway, we can't be sure the upgrade is going to complete.

Member:

Then the correct thing to do is to route discovery to the newest apiserver and for resources which cannot be served by the newest apiserver, we should return a 404 instead of a 503, in order to force deletions.

lavalamp (PR author):

How does "the upgrade might not complete" imply that?

Member:

I was responding mostly to this:

> I don't agree for case 1, the controller could be trying to delete the object in which case we want it to succeed.

But anyhow, it's cool, we chatted offline and I'm okay with merged discovery for beta. I just really wanted to avoid it for alpha.

Member:

> it's trying to actuate on a resource which is removed in a newer version

It's much more subtle than that. Suppose that the new version is just removing the v1beta1 version (and v1 still exists). If we will be routing to newer apiservers, then clearly we will fail this request. It will probably eventually succeed, but it's an unnecessary delay.

lavalamp (PR author):

Well, that depends on the client. And during an upgrade, depending on ordering, we may see both clients that only understand v1beta1, and clients that only understand v1.

@k8s-ci-robot (Contributor):

@lavalamp: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-enhancements-verify | 8aa71c9 | link | true | /test pull-enhancements-verify |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Comment on lines +318 to +322
For the mTLS between source and destination apiservers, we will do the following:

1. For server authentication by the client (source apiserver): the client needs to validate the server certs (presented by the destination apiserver), for which it needs to know the CA bundle of the authority that signed those certs. We should be able to reuse the bundle given to all pods to verify whatever kube-apiserver instance they talk to (currently passed to kube-controller-manager as --root-ca-file)

2. For client authentication by the server (destination apiserver): the destination apiserver will check the source apiserver certs to determine that the proxy request is from an authenticated client. The destination apiserver will use requestheader authentication (and NOT client cert authentication) for this, using the kube-aggregator proxy client cert/key and the --requestheader-client-ca-file passed to the apiserver upon bootstrap

jpbetz (Contributor), May 12, 2023:

etcd has configuration for both client-to-server and peer-to-peer traffic. Long term, I expect that we'll need to converge to similar configuration options, only we also have kubelet traffic, so there are actually three "directions" of traffic.

Can we model this after etcd (https://etcd.io/docs/v3.2/op-guide/security/)? I'm OK with the peer-to-peer traffic defaulting to the approach defined here (for the nice out-of-the-box experience), but I think we should also offer a dedicated set of optional flags to configure this new direction of traffic (e.g. --peer-advertise-address, --peer-bind-address, --peer-client-ca-file, ...) for cluster administrators that want or require the ability to configure this direction of traffic differently.

cc @deads2k
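
To make the two directions of authentication from the excerpt above concrete, here is a minimal client-side sketch in Go. The certificate paths are assumptions, and the real wiring lives in the aggregator/proxy machinery rather than in standalone code like this:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net/http"
	"os"
)

func main() {
	// 1. Server authentication: the source apiserver verifies the destination's
	//    serving certs against the cluster's root CA bundle (the same bundle
	//    kube-controller-manager receives via --root-ca-file).
	caPEM, err := os.ReadFile("/etc/kubernetes/pki/ca.crt") // path is an assumption
	if err != nil {
		log.Fatal(err)
	}
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caPEM) {
		log.Fatal("failed to parse root CA bundle")
	}

	// 2. Client authentication: the source apiserver presents the kube-aggregator
	//    proxy client cert/key. The destination validates it against the CA given
	//    by --requestheader-client-ca-file (requestheader authentication), rather
	//    than doing client-cert authentication of the end user.
	proxyCert, err := tls.LoadX509KeyPair(
		"/etc/kubernetes/pki/front-proxy-client.crt", // paths are assumptions
		"/etc/kubernetes/pki/front-proxy-client.key",
	)
	if err != nil {
		log.Fatal(err)
	}

	transport := &http.Transport{TLSClientConfig: &tls.Config{
		RootCAs:      roots,
		Certificates: []tls.Certificate{proxyCert},
	}}
	_ = &http.Client{Transport: transport} // the proxy would forward the user's request with this client
}
```

On the destination side, the apiserver would validate the presented client certificate against --requestheader-client-ca-file and take the end user's identity from the forwarded request headers, which is how requestheader authentication differs from ordinary client-cert authentication.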

@k8s-ci-robot added the cncf-cla: no label and removed the cncf-cla: yes label (May 17, 2023)
@richabanker (Contributor):

/close
in favor of #4015

@k8s-ci-robot (Contributor):

@richabanker: Closed this PR.

In response to this:

/close
in favor of #4015

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
