add service account issuer migration doc #16541

Merged: 2 commits, May 16, 2024
1 change: 1 addition & 0 deletions docs/README.md
@@ -64,6 +64,7 @@ For a better viewing experience please check out our live documentation site at
* [`kops` updating](operations/updates_and_upgrades.md#updating-kops)
* [Label management](labels.md)
  * for cluster nodes
* [Service Account Issuer (SAI) migration](operations/service_account_issuer_migration.md)
* [Service Account Token Volume Projection](operations/service_account_token_volumes.md)
* [Moving from a Single Master to Multiple HA Masters](single-to-multi-master.md)
* [Upgrading Kubernetes](tutorial/upgrading-kubernetes.md)
2 changes: 2 additions & 0 deletions docs/cluster_spec.md
@@ -1478,6 +1478,8 @@ spec:

**Warning**: Enabling the following configuration on an existing cluster can be disruptive due to the control plane provisioning tokens with different issuers. The symptom is that Pods are unable to authenticate to the Kubernetes API. To resolve this, delete the Service Account token secrets that exist in the cluster and delete all Pods that are unable to authenticate.
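
A minimal remediation sketch, assuming `kubectl` access to the cluster; the Secret, Pod, and namespace names are placeholders:

```bash
# List legacy ServiceAccount token Secrets (type kubernetes.io/service-account-token).
kubectl get secrets --all-namespaces --field-selector type=kubernetes.io/service-account-token

# Delete the affected token Secrets (per the warning above).
kubectl delete secret <secret-name> --namespace <namespace>

# Delete Pods that can no longer authenticate so their controllers re-create
# them with freshly issued tokens.
kubectl delete pod <pod-name> --namespace <namespace>
```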

**Note**: You can follow a variation of the procedure documented [here](/operations/service_account_issuer_migration/) to enable IRSA on an existing cluster without disruption.

kOps can publish the Kubernetes service account token issuer and configure AWS to trust it
to authenticate Kubernetes service accounts:

49 changes: 49 additions & 0 deletions docs/operations/service_account_issuer_migration.md
@@ -0,0 +1,49 @@
# Service Account Issuer (SAI) migration

In the past, changing the Service Account Issuer was a disruptive process. However, since Kubernetes v1.22 you can specify multiple Service Account Issuers in the Kubernetes API Server ([Docs here](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#serviceaccount-token-volume-projection)).

As noted in the Kubernetes Docs, when the `--service-account-issuer` flag is specified multiple times, the first value is used to generate tokens and all values are used to determine which issuers are accepted.

With this feature, we can migrate to a new Service Account Issuer without disrupting cluster operations.
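
To illustrate the ordering, here is a minimal sketch of the kube-apiserver arguments with two issuers; the `example.com` URLs are placeholders and not part of the kOps docs:

```yaml
# Illustrative kube-apiserver arguments with placeholder issuer URLs.
# The first --service-account-issuer signs newly issued tokens; tokens from
# any listed issuer are accepted during validation.
- --service-account-issuer=https://old-issuer.example.com
- --service-account-issuer=https://new-issuer.example.com
```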

**Note**: Official kOps support for this is forthcoming in [kubernetes/kops#16497](https://github.com/kubernetes/kops/pull/16497).

## Migrate using Instancegroup Hooks (kOps v1.28+, prior to official support)

**Warning**: This procedure is manual and involves some tricky modification of manifest files. We recommend testing this on a staging cluster before proceeding on a production cluster.

In this example, we are switching from `master.[cluster-name].[domain]` to `api.internal.[cluster-name].[domain]`.

1. Add the `modify-kube-api-manifest` hook (keeping the existing SAI as primary) to the control-plane instancegroups. The hook waits for the kube-apiserver manifest and inserts the existing issuer above the new one, so the existing issuer keeps signing new tokens while both are accepted:
```yaml
hooks:
- name: modify-kube-api-manifest
  before:
  - kubelet.service
  manifest: |
    User=root
    Type=oneshot
    ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/i\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"
```
2. Apply the changes to the cluster (see the command sketch after this list)
3. Roll the control-plane nodes
4. Update the `modify-kube-api-manifest` hook on the control-plane instancegroups to switch the primary and secondary SAI. The sed command now appends the existing issuer below the new one, so the new issuer signs tokens while the old issuer remains accepted:
```yaml
hooks:
- name: modify-kube-api-manifest
  before:
  - kubelet.service
  manifest: |
    User=root
    Type=oneshot
    ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/a\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"
```
5. Apply the changes to the cluster
6. Roll the control-plane nodes
7. Roll all other nodes in the cluster
8. Wait 24 hours until the dynamic SA tokens have refreshed
9. Remove the `modify-kube-api-manifest` hook from the control-plane instancegroups
10. Apply the changes to the cluster
11. Roll the control-plane nodes
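
The "apply" and "roll" steps above map onto the usual kOps workflow. A hedged sketch follows; `${CLUSTER_NAME}` is a placeholder, and exact flag names such as `--instance-group-roles` and the `control-plane` role value can vary with your kOps version:

```bash
# Assumptions: kops and kubectl are configured for the target cluster, and
# CLUSTER_NAME / KOPS_STATE_STORE are set for your environment.

# Apply the instancegroup changes (steps 2, 5 and 10).
kops update cluster --name "${CLUSTER_NAME}" --yes

# Roll only the control-plane nodes (steps 3, 6 and 11); add --force if no
# changes are detected for the instancegroups.
kops rolling-update cluster --name "${CLUSTER_NAME}" --instance-group-roles control-plane --yes

# Roll the remaining nodes (step 7).
kops rolling-update cluster --name "${CLUSTER_NAME}" --yes

# Optional check: the service account issuer discovery document shows the
# issuer currently used to sign new tokens.
kubectl get --raw /.well-known/openid-configuration
```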

This procedure was originally posted in a GitHub issue [here](https://github.com/kubernetes/kops/issues/16488#issuecomment-2084325891) with inspiration from [this comment](https://github.com/kubernetes/kops/issues/14201#issuecomment-1732035655).