
KEP: Instance-specific ComponentConfig #1439

Closed
wants to merge 3 commits

Conversation

Contributor

@mtaufen mtaufen commented Jan 8, 2020

Adds a KEP for solving the instance-specific config field problem in
ComponentConfig APIs. This is a blocker to the overall ComponentConfig
effort.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 8, 2020
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Jan 8, 2020
Contributor Author

mtaufen commented Jan 8, 2020

/assign @liggitt @sttts
/cc @luxas @stealthybox @rosti @neolit123 @kubernetes/wg-component-standard

Member

@neolit123 neolit123 left a comment

thanks @mtaufen
i'm +1 on getting a KEP for this merged and making this the standard for components.

added some general comments and minor suggestions.

spec:
  initContainers:
  - name: instance-config
    image: something-with-bash
Member

@neolit123 neolit123 Jan 9, 2020

off topic:
with most / all of our images moving to distroless one might have to trust a third party image source to perform the instance specific amend. wondering if we should have a recommended image for this at the k8s GCR (at some point).

@obitech obitech Jan 9, 2020

I like the idea of having a recommended image. As mentioned below, the process of manually creating the instance-specific config is quite verbose and could be unintuitive for users ("Why do I need to create this file when I can just pass a flag?"). I could imagine that providing a specific image, usable as an init container, to which you just pass the values through environment variables, would help ease the transition and increase adoption.

initContainers:
- name: instance-config
  image: k8s.gcr.io/init-kube-proxy-instance-config:v1alpha1
  env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

Of course the downside would be that we will have to maintain these images as well.
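
To make the full flow concrete, here is a hypothetical wiring (the image name, paths, and the proposed --instance-config flag are illustrative assumptions): the init container renders the instance-specific file onto an emptyDir, and kube-proxy reads it alongside the shared config.

spec:
  initContainers:
  - name: instance-config
    # hypothetical image that templates NODE_NAME into a config file
    image: k8s.gcr.io/init-kube-proxy-instance-config:v1alpha1
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    volumeMounts:
    - name: instance-config
      mountPath: /var/lib/kube-proxy-instance
  containers:
  - name: kube-proxy
    image: k8s.gcr.io/kube-proxy:v1.17.0
    command:
    - kube-proxy
    - --config=/var/lib/kube-proxy/config.conf
    - --instance-config=/var/lib/kube-proxy-instance/config.conf
    volumeMounts:
    - name: instance-config
      mountPath: /var/lib/kube-proxy-instance
  volumes:
  - name: instance-config
    emptyDir: {}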

### Graduation Criteria

A beta-versioned ComponentConfig API is using the instance-specific object
approach.
Member

i personally do not think this KEP needs alpha->beta->stable graduation stages and this section can be left to NONE with a minor clarification?
the proposal defines the standard in k8s and the components are recommended to use it.

Contributor Author

I think it could make sense to implement it and get it to beta, as a sort of proof of the standard, before we claim it's "graduated."

Immutable downgrades follow the same approach: deploy the new resources with
the old version and old config.

In-place upgrades, which are not typically recommended, should not be
Member

i have not seen data on this, but i think the majority of k8s users are doing in-place either because their tools only work that way or because they don't have the infrastructure to do re-place.
possibly omit "which are not typically recommended"?

Contributor Author

Yeah, that's fair. I don't have a good idea of what users outside of GKE, which does the immutable node upgrade thing, are doing for upgrades. IIRC the tribal knowledge from back when I was attending SIG-Node was that in-place Kubelet upgrades weren't really supported, and AFAIK it's not what Cluster API does, but neither of those is necessarily a reflection of what people actually do.

Updated the KEP.

@neolit123
Member

@kubernetes/sig-cluster-lifecycle-pr-reviews


This creates an additional bucket for config parameters. We must be vigilant
that only parameters that are truly instance-specific are added to the
instance-specific Kind. We can provide warnings to developers via comments
Contributor

missing in the proposal above: the new kind defines which parameters are instance-specific and which are not.

Also: is there an overlap?

Contributor Author

It's below, in Design Details. I wasn't sure which section to put it in; Proposal seemed like it should be light on details.

There should be no overlap allowed between the two, IMO. To start, I'd like to limit the instance-specific Kind to only the fields that clearly cannot be the same between instances sharing the same config object. Maybe in the future we can also include stuff that feels "lower level," but that's more wishy-washy.

Contributor

Please add that very centrally in the summary and the proposed solution. IMO this is the key idea of the proposal.

The canonical flag for providing a ComponentConfig file to a component is
`--config`. This KEP proposes that the canonical flag for providing an
instance-specific config file be `--instance-config`, and that the
instance-specific object not initially be permitted as part of a yaml stream
Contributor

cannot follow this sentence

Contributor Author

What I was trying to express was that you can't provide one Kind to the other's flag, and you also can't provide a multidoc to either that includes the other. So none of these would be allowed:

--config=KubeletInstanceConfiguration
--instance-config=KubeletConfiguration
--config=KubeletConfiguration(---)KubeletInstanceConfiguration
--instance-config=KubeletConfiguration(---)KubeletInstanceConfiguration

I think the multiple --config invocations, below, is also an interesting idea, though it is more complex to implement.
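
For illustration, a minimal sketch of that per-flag Kind check (the helper name and the naive document splitting are assumptions, not the actual implementation):

package config

import (
	"bytes"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

// checkKinds rejects a config file if any YAML document in it carries a
// Kind other than the one expected for the flag it was passed to.
func checkKinds(data []byte, expectedKind string) error {
	for _, doc := range bytes.Split(data, []byte("\n---\n")) {
		var tm metav1.TypeMeta
		if err := yaml.Unmarshal(doc, &tm); err != nil {
			return err
		}
		if tm.Kind != expectedKind {
			return fmt.Errorf("flag accepts only %s, got %s", expectedKind, tm.Kind)
		}
	}
	return nil
}

--config would call checkKinds(data, "KubeletConfiguration") and --instance-config would call it with "KubeletInstanceConfiguration", so each of the four combinations above is rejected.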


As with sharable ComponentConfig parameters, command line flags for
instance-specific parameters should continue to function and take precedence
over values from the config file so that backwards compatibility is maintained.
Contributor

also here: does "precedence" mean overriding or merging?

Contributor Author

Yes. Flags override config.
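
For illustration, a minimal sketch of that precedence, assuming a pflag FlagSet and a hypothetical Config struct (this is not the actual kubelet code):

package config

import (
	"strconv"

	"github.com/spf13/pflag"
)

// Config is a hypothetical stand-in for a component's configuration.
type Config struct {
	BindAddress string
	Port        int
}

// applyFlagOverrides re-applies only the flags the user explicitly set,
// so they win over values already loaded from the config file.
func applyFlagOverrides(cfg *Config, fs *pflag.FlagSet) {
	// Visit walks only the flags that were actually set on the command line.
	fs.Visit(func(f *pflag.Flag) {
		switch f.Name {
		case "bind-address":
			cfg.BindAddress = f.Value.String()
		case "port":
			if p, err := strconv.Atoi(f.Value.String()); err == nil {
				cfg.Port = p
			}
		}
	})
}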

Contributor

please be explicit in the text about this. It's crucial.

Contributor

@rosti rosti left a comment

Thanks @mtaufen !
That's an excellent start!

component, usually containing a single top-level Kind for configuring that
component. The naming convention for the API group is `{component}.config.k8s.io`,
and the convention for the Kind is `{Component}Configuration`. This KEP proposes
that the naming convention for the new Kind be `{Component}InstanceConfiguration`.
Contributor

We can outline these as a set of best practices and not force them on components.
In the case of kubeadm, InitConfiguration and JoinConfiguration are instance specific kinds. They are different because of the specifics of the kubeadm workflow (InitConfiguration is used on kubeadm init which is done on a single machine - the first to be initialized to form a "cluster". JoinConfiguration is used on kubeadm join to join a machine to an existing cluster).

Most other components run as services. They don't have user-facing workflows specific to them the way kubeadm does, hence a single instance-specific kind is sufficient for them.
Not imposing, but recommending things here will make kubeadm compatible out-of-the-box with this KEP.

Another thing is the {Component} prefix. In my opinion it should be optional. For example, look at the current kube-proxy component config:

apiVersion: kube-proxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration

We have "Kube Proxy" in both the API group and the kind. I don't think that it sounds right to have KubeProxyInstanceConfiguration and KubeProxyConfiguration in the kube-proxy.config.k8s.io group. However, somebody could like it and want to stick to it.

In my opinion a viable option should be InstanceConfiguration and SharedConfiguration in kube-proxy.config.k8s.io. Some components may desire to use ClusterConfiguration in place of SharedConfiguration (as kubeadm does ATM).
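
For example, with that naming the two objects could live in one group like this (the kinds and the v1alpha2 version are hypothetical; the fields are borrowed from the current kube-proxy config):

apiVersion: kube-proxy.config.k8s.io/v1alpha2
kind: SharedConfiguration
clusterCIDR: 10.244.0.0/16
---
apiVersion: kube-proxy.config.k8s.io/v1alpha2
kind: InstanceConfiguration
bindAddress: 192.168.1.10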

instance-specific config file be `--instance-config`, and that the
instance-specific object not initially be permitted as part of a yaml stream
in the `--config` file (and vice-versa). This is for the sake of a simple
implementation and can be enabled in the future, if we decide it is useful.
Contributor

Allowing multiple --config flags (instead of a single --instance-config flag) is probably the most flexible way of dealing with multiple kinds. Some users may want to store the config in a couple of files (one instance-specific and generated by an init container, another shared and provided by a config map volume source).
Others may desire to store both kinds in a single file and feed that in the component. That is already used by kubeadm (storing together InitConfiguration and ClusterConfiguration). kubeadm can also provision kubelet and static pod backed components (API Server, Scheduler and Controller Manager) in that way.

Implementation wise, we can probably add a single piece of code somewhere in component-base to deal with that in a common way.

@mtaufen
Contributor Author

mtaufen commented Jan 14, 2020

An interesting point came up in today's meeting, which is that a bind address may be sharable (all interfaces) or not (specific interface) depending on what the user wants.

This class of option is an argument in favor of allowing multiple --config invocations to be a piecemeal override. An override also means we don't have to split the Kind.

I'm not sure if we want to go to that level of complexity yet, as config provenance gets harder to track with multiple sources (we're already in that boat since config might come from flags or files for the time being). Deep merges, especially, can be tricky to implement (though maybe we can use parts of kustomize as prior art). I'm still on the side of splitting the Kind for simplicity, and probably allowing multiple --config invocations instead of an --instance-config flag, but not doing a merge yet.

Maybe that means we need flags for bind address overrides to stick around for a bit. Or maybe it means bind addresses should also end up in the instance-specific file (we'll have to rev Kubelet config to v1beta2 to make that change, if so). The number of fields that can't in most cases share a value seems very low compared to those that can, so I'd like to choose the simplest option that moves ComponentConfig forward, and revisit the advanced (merge) solution when ComponentConfig is actually a thing most people are able to use.

@sttts
Contributor

sttts commented Jan 15, 2020

My take: two kinds allow us to explicitly split values into shared and instance specific ones. This removes the need to decide what to do for merging of values that are in both buckets.

If we add merging at some point, we have two ways forward:

  1. define a general semantics for merges. Server side apply's merging logic with map-types and list-types might be a great start.
  2. do not tackle general merge semantics at all, but let the ComponentConfig API define case-by-case what merging means.

Nobody dared to tackle (1). (2) has the danger of inconsistent ad-hoc semantics. And my fear is that this danger is real and will land us in a spot we will not like, and because it's an API, there is no way back.

Again, this KEP must be explicit about the path forward, or at least lay out these two options and explicitly leave it open for later.

@neolit123 neolit123 added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. wg/component-standard Categorizes an issue or PR as relevant to WG Component Standard. labels Jan 15, 2020
@liggitt
Member

liggitt commented Jan 20, 2020

I see three categories of fields:

  1. clearly instance-specific (e.g. --node-ip, --provider-id, --override-hostname)
  2. possibly instance-specific (e.g. --bind-address, --root-dir)
  3. highly unlikely to be instance-specific (e.g. cloud provider, node lease duration)

@sttts:
two kinds allow us to explicitly split values into shared and instance specific ones.

Defining a type to contain category 1 makes sense. Most deployers will have more than one node. Deployers with a single node can generate two config stanzas without much more trouble than one.

Keeping fields in category 3 in the shared config type also makes sense. If there are edge use-cases that for some reason require someone to provide instance-specific values, they can modify the config file themselves prior to launching the component. I don't think we need to build application of stacks of patches to config into components. Note that this is a point in favor of components just consuming simple configuration files and reacting to changes in those config files by exiting and allowing themselves to be restarted (like kube-proxy does), rather than pulling dynamic config directly from the API server internally within the component (like kubelet dynamic config does).

@sttts:
This removes the need to decide what to do for merging of values that are in both buckets.

It seems like we still have to decide what category every field is in, and the things in category 2 are tricky.

Would we be committed to putting all potentially instance-specific config values in an "instance-specific" type regardless of how weak the use case was?

  • If so, that means that many consumers that don't have those use cases will need to provide an "instance-specific" config with values that are identical for all their instances.
  • If not, what would we tell people with those use cases to do?

More importantly, I assume we will not categorize everything correctly, and will later discover use cases that would make us want a shared field to be able to be instance-specific. I'd like to settle on an approach now that will make this possible/intuitive in the future without causing churn for all consumers (like revving existing shared config versions). For example, would it make sense to keep the field in the shared config, add it to the instance-specific config, and document that a value present in the instance-specific config overrides the value in the shared config? That is not nearly as open-ended as trying to define semantics for "merge/override any arbitrary config field".
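
For example, a minimal sketch of that override rule, with hypothetical types (optional fields are pointers so "present" can be told apart from "empty"):

package config

// Hypothetical kinds, not an existing API.
type SharedConfiguration struct {
	BindAddress string
}

type InstanceConfiguration struct {
	// A pointer distinguishes "unset" from an explicit empty value.
	BindAddress *string
}

// effectiveBindAddress applies the rule: a value present in the
// instance-specific config overrides the value in the shared config.
func effectiveBindAddress(shared SharedConfiguration, inst InstanceConfiguration) string {
	if inst.BindAddress != nil {
		return *inst.BindAddress
	}
	return shared.BindAddress
}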

@mtaufen:
Maybe that means we need flags for bind address overrides to stick around for a bit. Or maybe it means bind addresses should also end up in the instance-specific file (we'll have to rev. Kubelet to v1beta2 to make that change, if so).

We have to rev if we remove the fields from the shared config type, but not if we add to instance-specific config and define the semantics of providing a value that way, right?

@mtaufen:
The number of fields that can't in most cases share a value seems very low compared to those that can share, so I'd like to choose the simplest option that moves ComponentConfig forward, and revisit the advanced (merge) solution when ComponentConfig is actually a thing most people are able to use.

It seems likely that we will have options that are instance-specific in some clusters and shared in other clusters. It also seems likely that we will not perfectly categorize fields on the first attempt. Describing reasonable ways to handle those scenarios here would be beneficial, so we don't end up with n components doing n different things.

@knabben knabben force-pushed the instance-specific branch 2 times, most recently from 1425fb6 to 3079bc2 on December 5, 2020 19:31
### Configuration merge example

The merge between both YAML configuration files is provided by the
[Strategic Merge Patch](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/strategic-merge-patch.md)
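
For reference, a rough sketch of what this merge could look like using apimachinery's strategicpatch package (the KubeletConfiguration wiring is an assumption for illustration, not taken verbatim from the KEP):

package config

import (
	"k8s.io/apimachinery/pkg/util/strategicpatch"
	kubeletconfig "k8s.io/kubelet/config/v1beta1"
	"sigs.k8s.io/yaml"
)

// mergeConfigs overlays the instance-specific YAML onto the shared YAML
// using a strategic merge patch guided by the config type's struct tags.
func mergeConfigs(sharedYAML, instanceYAML []byte) ([]byte, error) {
	sharedJSON, err := yaml.YAMLToJSON(sharedYAML)
	if err != nil {
		return nil, err
	}
	instanceJSON, err := yaml.YAMLToJSON(instanceYAML)
	if err != nil {
		return nil, err
	}
	// The struct argument tells SMP which fields merge and which replace.
	return strategicpatch.StrategicMergePatch(sharedJSON, instanceJSON, &kubeletconfig.KubeletConfiguration{})
}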
Contributor

@sttts sttts Dec 11, 2020

this library is dead and is going to be replaced by server-side apply. Has using the latter been evaluated?

Member

SMP looked like the best approach since it was possible to configure the merge behavior via field struct tags, and it was still being used internally in some parts.

Thanks for the heads-up about the deprecation, I will rewrite the KEP using SSA instead.

cacheAuthorizedTTL: 0s
cacheUnauthorizedTTL: 0s
clusterDNS:
- 10.96.0.11
Contributor

how do you know this should replace and not merge? I mean technically it is clear. But how do you know the user does not want to merge?

Member

The user will not have control over each field; at least at this point, the choice is still bound to the struct tags in the implementation.
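
For context, a sketch of how such struct tags steer the choice between replace and merge under SMP (the type and fields are hypothetical):

package config

// A list field without a patchStrategy tag is replaced atomically by the
// patch value (as with clusterDNS above); a field tagged "merge" is merged
// element-wise, keyed on patchMergeKey.
type ExampleConfiguration struct {
	// Replaced wholesale by the instance-specific value.
	ClusterDNS []string `json:"clusterDNS"`
	// Merged per element, keyed on "name".
	Plugins []Plugin `json:"plugins" patchStrategy:"merge" patchMergeKey:"name"`
}

type Plugin struct {
	Name string `json:"name"`
}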

The approach of performing a strategic merge patch of instance-specific over
shareable config before unmarshalling can decrease the complexity by:

- Using an already proven method from apimachinery.
Contributor

proven and dead

@knabben
Member

knabben commented Dec 21, 2020

@apelisse could you give us some input on the KEP and the best SSA direction for our use-cases? thanks.

@apelisse
Member

apelisse commented Jan 6, 2021

Maybe I'm wrong, but it sounds a lot like it's trying to reinvent kustomize. @monopole

AFAICT, using server-side apply should be fairly equivalent to using SMP. The main problem you're going to have with either SSA or SMP is deletion, which is lacking in the former and tedious in the latter. Having the ability to apply a json patch specifically could be useful. But I think @monopole has the most context on that exact problem set.

@knabben
Member

knabben commented Jan 19, 2021

@sttts After some conversation with @apelisse and @monopole we have decided to go with KYaml (from Kustomize) to replace the SMP merge functionality; its 2-way merge capability is similar to SMP's and works very well in this use-case. Another point in its favor is its adoption by sig-cli. If this sounds appealing to you, I can rewrite the examples in the KEP in this direction and continue from there.

@sttts
Contributor

sttts commented Jan 20, 2021

@knabben kyaml looks more maintained than SMP. So that's good news. I don't see much documentation in the repo https://github.com/kubernetes-sigs/kustomize/tree/master/kyaml. I would like to see a link to the planned semantics of the merging algorithm; I hope that exists already somewhere. We are defining an API here with component configs, so we should have a good picture of the algorithm and not define it by the implementation of a kustomize helper library.

@mtaufen
Contributor Author

mtaufen commented Jan 26, 2021

@sttts we're happy to help improve the docs. Can you expand on exactly what you'd hope to see documented? I want to make sure it's clear but also avoid going overboard.

@sttts
Contributor

sttts commented Feb 9, 2021

@mtaufen it's not about documentation. It's about specification. We should not build APIs by implementation.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 10, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 9, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fabriziopandini
Member

😞

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
sig/architecture Categorizes an issue or PR as relevant to SIG Architecture.
sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
wg/component-standard Categorizes an issue or PR as relevant to WG Component Standard.