Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically renew machine certificates #6529

Closed
sbueringer opened this issue May 19, 2022 · 6 comments · Fixed by #6983
Closed

Automatically renew machine certificates #6529

sbueringer opened this issue May 19, 2022 · 6 comments · Fixed by #6983
Assignees
Labels
area/control-plane Issues or PRs related to control-plane lifecycle management area/security Issues or PRs related to security kind/feature Categorizes issue or PR as related to a new feature. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@sbueringer
Copy link
Member

sbueringer commented May 19, 2022

User Story

As a user I would like to be able to configure ClusterAPI to automatically renew (machine) certificates.

Detailed Description

Today during machine bootstrap we create certificates via kubeadm with a 1 year expiry (e.g. apiserver serving certificates).
The goal of this issue is to make this expiry visible and to provide a way to automatically recreate machines before the certificates expire.

Anything else you would like to add:

Notes:

  • I'll follow-up soon with more details and ideas for implementation

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label May 19, 2022
@sbueringer sbueringer changed the title Automatically renew certificates Automatically renew machine certificates May 19, 2022
@sbueringer sbueringer added area/control-plane Issues or PRs related to control-plane lifecycle management area/security Issues or PRs related to security labels May 19, 2022
@neolit123
Copy link
Member

Following the original CAPI machine immutability principle machines can roll out with new certs. ..Yet there have been many talks about allowing in place upgrades on bare metal machines and similar.

Certificate rotation can be also considered as mutable operation for bare metal machines that cannot easily roll out. I don't fully know the scope of the CAPI operator but the in place rotation might be a FR for it.

@fabriziopandini
Copy link
Member

AFAIK the scope of this iteration is renewing machine certs trough machine rotation (no in-place mutations)

@enxebre
Copy link
Member

enxebre commented May 24, 2022

@sbueringer Does this issue refers to kubelet certs as well? If so it relates to #6317. In which case I'd expect us to design a holistic solution that would consider serving/client certs and creation/renewal.

@sbueringer
Copy link
Member Author

sbueringer commented Jul 13, 2022

@sbueringer Does this issue refers to kubelet certs as well? If so it relates to #6317. In which case I'd expect us to design a holistic solution that would consider serving/client certs and creation/renewal.

@enxebre Sorry I didn't have the time to add more details to the issue when I created it.

Goal is specifically to renew the certificates/kubeconfigs on control plane nodes (i.e. the ones that kubeadm can renew as well: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#check-certificate-expiration)

kubelet client certificates are already automatically rotated. kubelet serving certificates should be solved via the Kubelet Authentication CAEP / #6317. So I think we can and should treat those as separate topics.

Additional info: the kubelet serving certificates are not and cannot be validated today by the apiserver as they are self-signed, so we don't have to renew them.

(see below the full issue description / proposed solution)

@sbueringer
Copy link
Member Author

sbueringer commented Jul 13, 2022

Background information

Goal of this issue is to rotate certificates and kubeconfigs of the control plane nodes.

Proposed solution

  1. KubeadmConfig reconciler surfaces expiry date (now+1 year) on first reconcile of a KubeadmConfig via machine.cluster.x-k8s.io/certificates-expiry-date annotation on control plane Machines
  2. Based on the annotation the Machine reconciler sets the expiry date in Machine.status.certificatesExpiryDate
  3. MachineHealthCheck detects that certificates run out based on Machine.status.certificatesExpiryDate and MachineHealthCheck.spec.certificatesMinExpiryDuration and triggers a Machine rollout

Notes

  • Let's add a CERTIFICATES EXPIRY DATE column to Machine.
  • We should add the new MachineHealthCheck field to MachineHealthCheckClass as well (for ClusterClass support).
  • We will only support this feature for new Machines, i.e. for pre-existing clusters a rollout of the control plane is needed to "activate" the feature.
  • Let's document the feature in the book.

Follow-up

@chrischdi
Copy link
Member

chrischdi commented Jul 13, 2022

Regarding

If Machine.status.certificatesExpiryDate is of type string and in format time.RFC3339 (which it will be if it is a metav1.Time) then kube-state-metrics (next release already, but also after my PR) will be able to expose a metric for it using a configuration file 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/control-plane Issues or PRs related to control-plane lifecycle management area/security Issues or PRs related to security kind/feature Categorizes issue or PR as related to a new feature. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
7 participants