-
Notifications
You must be signed in to change notification settings - Fork 224
pkg/asset: use service accounts for checkpointer and kube-proxy creds #767
pkg/asset: use service accounts for checkpointer and kube-proxy creds #767
Conversation
Can you add a description of why this is needed? |
@aaronlevy added a description. |
We don't have any hard rules around this, but I'd aim to put this in a v0.9.0 followup to v0.8.2 since there is a new render asset. |
Can we have kube-proxy and checkpointer each use their own named service account instead of a |
@dghubble updated with RBAC roles. |
pkg/asset/internal/templates.go
Outdated
`) | ||
|
||
var CheckpointerRole = []byte(`apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if the pod-checkpointer only watches kube-system (e.g. its own namespace) we can drop this to a Role. It doesn't currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ok to keep for now but add to the checkpointer README and the release note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly just questions
pkg/asset/internal/templates.go
Outdated
|
||
// ServiceAccountKubeConfig is used for pods that don't have access to the | ||
// KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables. | ||
// This is used for kube-proxy, since it configures these endpoints itself, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its also being used for the pod-checkpointer here. Whats the reason there?
pkg/asset/internal/templates.go
Outdated
namespace: kube-system | ||
`) | ||
|
||
// ServiceAccountKubeConfig is used for pods that don't have access to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify "used for"? This still feels odd and kinda opaque. So before we relied on the host's kubeconfig which contained both endpoint info and credentials. With the service account approach you use an alternate kubeconfig which specifies the endpoint info and the local service account in the pod.
What's the tie in to these environment variables?
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.3.0.1:443
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clients compiled with client-go require the KUBERNETES_SERVICE_PORT
and KUBERNETES_PORT
to use the in-cluster config.
https://github.com/kubernetes/client-go/blob/kubernetes-1.8.2/rest/config.go#L318-L321
These aren't exposed to kube-proxy (since kube proxy sets up the endpoint) or the checkpointer, since it runs as a static pod.
This work:
// ServiceAccountKubeConfig instructs clients to use their service account token, but unlike an in-cluster client
// doesn't rely on the `KUBERNETES_SERVICE_PORT` and `KUBERNETES_PORT` to determine the API
// servers address. This kubeconfig is used bootstrapping pods that might not have access to these env vars,
// such as kube-proxy, which sets up the endpoint (chicken and egg), and the checkpointer, which needs to run
// as a static pod even if the API server isn't available.
pkg/asset/asset.go
Outdated
@@ -37,6 +37,9 @@ const ( | |||
AssetPathManifests = "manifests" | |||
AssetPathKubelet = "manifests/kubelet.yaml" | |||
AssetPathProxy = "manifests/kube-proxy.yaml" | |||
AssetPathProxySA = "manifests/kube-proxy-sa.yaml" | |||
AssetPathProxyRoleBinding = "manifests/kube-proxy-role-binding.yaml" | |||
AssetPathServiceAccountKubeConfig = "manifests/service-account-kubeconfig.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: everywhere else, we say the name followed by the type. KubeconfigServiceAccount
and kubeconfig-sa.yaml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below, got confused. Its not a service account.
pkg/asset/internal/templates.go
Outdated
var ServiceAccountKubeConfigTemplate = []byte(`apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
name: service-account-kubeconfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: kubeconfig-service-account or just kubeconfig
pkg/asset/internal/templates.go
Outdated
metadata: | ||
name: service-account-kubeconfig | ||
namespace: kube-system | ||
stringData: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I missed this getting into upstream.
pkg/asset/internal/templates.go
Outdated
- name: etc-kubernetes | ||
hostPath: | ||
path: /etc/kubernetes | ||
name: ssl-certs-host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accidental duplicate from a few lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thanks for catching.
@dghubble thanks for the review. Comments addressed. |
pkg/asset/asset.go
Outdated
@@ -37,6 +37,9 @@ const ( | |||
AssetPathManifests = "manifests" | |||
AssetPathKubelet = "manifests/kubelet.yaml" | |||
AssetPathProxy = "manifests/kube-proxy.yaml" | |||
AssetPathProxySA = "manifests/kube-proxy-sa.yaml" | |||
AssetPathProxyRoleBinding = "manifests/kube-proxy-role-binding.yaml" | |||
AssetPathServiceAccountKubeConfig = "manifests/kubeconfig-sa.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, my nit doesn't help much. Its still confusing to read since this is actually a secret. eh, better idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in-cluster-kubeconfig? which happens to be a secret or configmap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does manifest/kubeconfig-in-cluster.yaml
work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
e2e passed but it looks like there's something wrong with the Jenkins GitHub plugin https://gist.github.com/ericchiang/1b80ef7c23ed4adf27527e728e04e2d3 |
Also, |
/assign @diegs For final merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Nice work! Just a few small comments but I'm good to merge.
pkg/asset/internal/templates.go
Outdated
`) | ||
|
||
var CheckpointerRole = []byte(`apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ok to keep for now but add to the checkpointer README and the release note.
pkg/asset/k8s.go
Outdated
@@ -40,6 +41,9 @@ func newStaticAssets(imageVersions ImageVersions) Assets { | |||
MustCreateAssetFromTemplate(AssetPathControllerManagerDisruption, internal.ControllerManagerDisruptionTemplate, conf), | |||
MustCreateAssetFromTemplate(AssetPathKubeDNSDeployment, internal.DNSDeploymentTemplate, conf), | |||
MustCreateAssetFromTemplate(AssetPathCheckpointer, internal.CheckpointerTemplate, conf), | |||
Asset{AssetPathCheckpointerSA, internal.CheckpointerServiceAccount}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be more uniform (and less surprising in the future) to pass these through MustCreateAssetFromTemplate
even if they currently do not do any templating.
coreosbot run e2e |
Last couple nits addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
`) | ||
|
||
// TODO: Drop checkpointer RBAC resources to a Role and RoleBinding if | ||
// the checkpoint switches to only watching kube-system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just make this change? It only needs to watch kube-system (also can be follow-up PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can open a separate PR. We'd have to wait the checkpointer change to merged before we drop this to a role, so we can't do it in one PR anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the checkpointer have to be changed? It just wouldn't have access to non-kube-system right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkpointer attempts to watch all namespaces. If we restrict it now that request will fail. We have to submit a change to the checkpointer.
https://github.com/kubernetes-incubator/bootkube/blob/v0.8.2/pkg/checkpoint/apiserver.go#L18
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened the checkpointer PR #774
* Create separate service accounts for kube-proxy and pod-checkpointer * Switch kube-proxy and pod-checkpointer to use a kubeconfig that references the local service account, rather than the host kubeconfig * kubernetes-retired/bootkube#767
// (chicken and egg), and the checkpointer, which needs to run as a static pod | ||
// even if the API server isn't available. | ||
var KubeConfigInClusterTemplate = []byte(`apiVersion: v1 | ||
kind: Secret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be a secret? There is no sensitive data in this resource, would you consider merging PR which changes it to ConfigMap?
- name: local | ||
cluster: | ||
server: {{ .Server }} | ||
certificate-authority-data: {{ .CACert }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another options is to use certificate-authority: /run/secrets/kubernetes.io/serviceaccount/ca.crt
which makes it somewhat easier to rotate keys in the future (assuming that resources installed by bootkube are forming a basis for core cluster resources, which are then managed separately)
Today, both the checkpointer and kube-proxy re-use the kubelet's credentials to talk to the API server. This means we can't target them with specific RBAC profiles or identify them in audit logs. In addition, when we move to TLS bootstrapping (#663), the kubelet's kubeconfig will no longer hold real credentials.
Switch the checkpointer and kube-proxy to use service account credentials instead of the kubelet's kubeconfig.
Extracted from #663 to make that PR smaller.
cc @dghubble @rphillips