
Investigate how service-account-issuer work #1766

Closed
AverageMarcus opened this issue Dec 9, 2022 · 11 comments
Assignees: AverageMarcus
Labels: area/kaas (Mission: Cloud Native Platform - Self-driving Kubernetes as a Service), kind/spike, provider/cluster-api-aws (Cluster API based running on AWS), topic/capi



AverageMarcus commented Dec 9, 2022

Need to investigate exactly how the service-account-issuer flag works in the kube-apiserver when using a custom URL.

Upgrades to CAPA clusters are very unstable due to pods becoming unauthorized while the control plane nodes are rolled. We need to understand why the existing service account tokens become invalid during this process even though the URL used for the issuer doesn't change.
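
For context, these are the kube-apiserver flags involved in issuing and validating ServiceAccount tokens - a minimal sketch with illustrative values (the issuer URL and file paths are assumptions based on kubeadm defaults, not copied from our clusters):

# Flags on kube-apiserver that control ServiceAccount token issuing and validation
kube-apiserver \
  --service-account-issuer=https://service-accounts.example.com \
  --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \
  --service-account-key-file=/etc/kubernetes/pki/sa.pub

The issuer value is written into each token's iss claim and checked again at validation time, and the sa.key / sa.pub pair must stay stable for existing tokens to keep verifying. Worth noting: --service-account-issuer can be passed more than once - the first value is used for newly issued tokens while all values are accepted during validation, which is the documented way to change the issuer without invalidating existing tokens.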

AverageMarcus self-assigned this Dec 9, 2022
alex-dabija added the area/kaas, topic/capi, provider/cluster-api-aws, and kind/spike labels Dec 12, 2022

AverageMarcus commented Dec 12, 2022

Attempts to replicate the upgrade issue using a Workload Cluster have failed.

Using values as similar as possible, I've tried the same upgrade from 0.18.0 to 0.20.2 that caused the issues with the MC last week and noticed a couple of differences:

  1. No "bearer token" errors in the apiserver logs, everything continues working
  2. The worker nodes also roll. When I did the upgrades to the MCs last week only the control plane nodes rolled.

This makes me think the issue may be related to some TTL expiring or similar, so I have created several test WCs to leave for a few days before attempting the same.


EDIT: The worker nodes rolled because I mistakenly upgraded to 0.20.2 (which includes the change to the Ubuntu version), whereas last week I performed an upgrade from 0.18.0 -> 0.20.1. Performing the same upgrade on a WC does roll just the control plane nodes, as expected.

AverageMarcus commented

Some implementation details discovered while debugging:

  • The Token Controller, part of kube-controller-manager, acts on ServiceAccounts and secrets - watching for deletions of each and cleaning up the references in the other. It does not manage generating / refreshing of tokens.
  • The ServiceAccount admission controller is only responsible for ensuring a pod has the appropriate ServiceAccount-related fields populated on creation (e.g. .spec.serviceAccountName and handling the token volume mount)
  • The tokens are generated by calling the TokenRequest API automatically on pod launch thanks to the serviceAccountToken projected volume set up by the kubelet. This API creates tokens that expire either when the pod is deleted or after a defined lifespan that defaults to 1 hour.
  • Tokens can be manually requested using the kubectl create token command (see the sketch after this list).
  • Kubelet is responsible for the projected token and will perform the following actions:
    • request and store the token on behalf of the Pod
    • make the token available to the Pod at a configurable file path
    • refresh the token as it approaches expiration
  • Kubelet will proactively rotate the token once more than 80% of its TTL has elapsed
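
To see these claims first-hand, a rough sketch (assumes kubectl >= 1.24 and jq; the "default" ServiceAccount is just an example):

# Ask the TokenRequest API for a 1h token bound to the "default" ServiceAccount
TOKEN=$(kubectl create token default --duration=1h)
# Decode the JWT payload (second dot-separated segment) to inspect issuer and expiry;
# append '=' padding to the segment if the base64url decode comes out truncated
echo "$TOKEN" | cut -d '.' -f 2 | base64 --decode 2>/dev/null | jq '{iss, exp, sub}'

The iss claim should match the apiserver's --service-account-issuer value, and exp reflects the requested lifespan.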

Based on the above, I believe the issue lies in the kubelet. Unfortunately we don't have kubelet logs persisted, so we can't investigate further until we can replicate the issue.
When we next perform an MC upgrade we should tail the kubelet logs on each of the (old and new) control plane nodes and look out for relevant error messages.
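
Something along these lines on each node should do (a sketch; the grep pattern is only a starting point):

# Tail the kubelet logs on a control plane node during the upgrade, watching for token/auth errors
journalctl -u kubelet -f | grep -Ei 'token|unauthorized|authenticat'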

alex-dabija changed the title from "Investigate how service-account-issuer work." to "Investigate how service-account-issuer work" Dec 13, 2022
AverageMarcus commented

The kubelet logs currently available on both the tornado MC (where the issue presented last week, though it now contains new control plane nodes) and the keep1 WC I created yesterday don't show anything out of the ordinary.

AverageMarcus commented

Test case keep1:

Cluster age: 22h
Action performed: Upgrade App from 0.18.0 to 0.20.1
Upgraded cleanly: Yes

Notes:
The apiserver logs contained a small number of authentication errors (below), but they were temporary and don't seem to have affected the health of the cluster at all.

kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:38.634005       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:41.378151       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:43.636165       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:46.380304       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"

While this error is similar to what was seen during the previous MC upgrades, it wasn't printed anywhere near as often or as many times, and it includes an additional note of service account token has been invalidated.
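
(For reference, one way to pull and filter these across all control plane nodes - a sketch assuming the kubeadm-style component=kube-apiserver label:)

# Grab recent apiserver logs from every control plane node and filter for auth failures
kubectl -n kube-system logs -l component=kube-apiserver --prefix --tail=500 \
  | grep 'Unable to authenticate the request'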

AverageMarcus commented

Test case keep2:

Cluster age: 2d14h
Action performed: Upgrade App from 0.18.0 to 0.20.1
Upgraded cleanly: Yes

Notes:

Some errors in the logs similar to keep1, but nothing persistent.

kube-apiserver-ip-10-0-80-61.eu-west-2.compute.internal kube-apiserver E1215 08:03:37.736541       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-80-61.eu-west-2.compute.internal kube-apiserver E1215 08:03:42.738910       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-78-60.eu-west-2.compute.internal kube-apiserver E1215 08:03:21.578948       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, context deadline exceeded]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:39.771686       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:44.774180       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.800197       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.800999       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.801661       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"

AverageMarcus commented

Test case keep2 - take 2:

Cluster age: 2d14h
Action performed:

  • Manually delete the keep2-sa secret from the MC
  • Upgrade App from 0.20.1 to 0.20.2

Upgraded cleanly: No

Notes:

This upgrade causes all nodes to cycle, including workers.

Some new logs in this situation:

kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:43.535948       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:43.535934       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:47.776249       1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"

These logs are repeated a lot.

This causes issues similar to what we've seen during upgrades, but in a very different way and with different logs.

I think it's safe to say the [cluster]-sa secret didn't change during the faulty upgrades.
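
One way to double-check that on the MC is to fingerprint the secret before and after an upgrade - a sketch, assuming the CAPI convention of the keypair being stored under tls.crt / tls.key in the <cluster>-sa secret (run in the cluster's namespace):

# The hash and resourceVersion should be identical before and after the upgrade
kubectl get secret keep2-sa -o jsonpath='{.data.tls\.crt}' | sha256sum
kubectl get secret keep2-sa -o jsonpath='{.metadata.resourceVersion}'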

AverageMarcus commented

We hit this issue again today with the upgrade of goat.

Root cause still unknown but I managed to rule out some possible reasons:

  • The service-account-token secrets don't change during the upgrade, so the service accounts still have the correct token assigned to them
  • The goat-ca and goat-sa secrets didn't change, so the keys used to generate tokens remain the same

Some differences seen compared to tests performed on workload clusters:

  • The kubelet logs on the first control plane node being removed show etcd crash looping because the node has been removed from the etcd members. I would assume this also happens with WCs, but I believe those nodes are deleted more quickly so they don't get to this point.
Dec 16 09:25:01 ip-10-224-128-148 kubelet[1401]: I1216 09:25:01.699484    1401 scope.go:110] "RemoveContainer" containerID="84522cbabffe063f63f45729ccd5fd1e841bc29f640678ecad0ebe3ff448ae3a"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: I1216 09:25:04.707774    1401 scope.go:110] "RemoveContainer" containerID="84522cbabffe063f63f45729ccd5fd1e841bc29f640678ecad0ebe3ff448ae3a"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: I1216 09:25:04.708189    1401 scope.go:110] "RemoveContainer" containerID="1e5118c36253eb247e1a42db466f5c10854c90410a47ada01c9ee5a7821c3459"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: E1216 09:25:04.708633    1401 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 10s restarting failed container=etcd pod=etcd-ip-10-224-128-148.eu-north-1.compute.internal_kube-system(8f8e2f81e748ac114db703b7124de587)\"" pod="kube-system/etcd-ip-10-224-128-148.eu-north-1.compute.internal" podUID=8f8e2f81e748ac114db703b7124de587
Dec 16 09:25:05 ip-10-224-128-148 kubelet[1401]: I1216 09:25:05.711469    1401 scope.go:110] "RemoveContainer" containerID="1e5118c36253eb247e1a42db466f5c10854c90410a47ada01c9ee5a7821c3459"
Dec 16 09:25:05 ip-10-224-128-148 kubelet[1401]: E1216 09:25:05.711814    1401 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 10s restarting failed container=etcd pod=etcd-ip-10-224-128-148.eu-north-1.compute.internal_kube-system(8f8e2f81e748ac114db703b7124de587)\"" pod="kube-system/etcd-ip-10-224-128-148.eu-north-1.compute.internal" podUID=8f8e2f81e748ac114db703b7124de587
  • The syslog from the first new control plane node includes the following:
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:15] [certs] Generating "apiserver-kubelet-client" certificate and key
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Starting the kubelet
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

which is shortly followed by:

Dec 16 09:24:47 ip-10-224-148-177 kubelet[1484]: I1216 09:24:47.034920    1484 kubelet.go:1724] "Trying to delete pod" pod="kube-system/kube-apiserver-ip-10-224-148-177.eu-north-1.compute.internal" podUID=958cc32c-5119-48db-82a8-a03f90f6a89e
Dec 16 09:24:47 ip-10-224-148-177 kubelet[1484]: I1216 09:24:47.060101    1484 kubelet.go:1729] "Deleted mirror pod because it is outdated" pod="kube-system/kube-apiserver-ip-10-224-148-177.eu-north-1.compute.internal"

It looks like kubelet is set up before the apiserver has had the placeholder replaced.
I tried systemctl restart kubelet but that didn't seem to help.

  • There are many errors in the logs of the new control plane node related to it not being able to reach the API server ELB:
Dec 16 09:25:23 ip-10-224-148-177 kubelet[1484]: E1216 09:25:23.257848    1484 controller.go:187] failed to update lease, error: Put "https://internal-goat-apiserver-18847525199.eu-north-1.elb.amazonaws.com:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-224-148-177.eu-north-1.compute.internal?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)

This is possibly caused by the ELB including non-ready instances (such as the currently initialising node) which can't respond to API requests.

AverageMarcus commented

Test case keep3:

Cluster age: 3d18h
Action performed: Upgrade App from 0.18.0 to 0.20.1
Upgraded cleanly: Yes

Notes:

Checking the last of the possible default TTLs (72 hours).

No issues.

AverageMarcus commented

Test case keep4:

Cluster age: 3d20h
Action performed: Upgrade App from 0.18.0 to 0.20.1
Upgraded cleanly: Yes

Notes:

Installed Kyverno, kyverno-policies, keep4-kyverno-policies-connectivity & keep4-kyverno-policies-dx

Kyverno didn't cause any issues with the upgrade.


AverageMarcus commented Dec 16, 2022

I have been unable to replicate the issue using workload clusters. Whatever is causing it seems to only be present on management clusters.

It is also entirely possible that whatever caused the problem has already been fixed in the latest versions of cluster-aws and default-apps-aws. We won't know for sure until we next upgrade an MC.


Plan for testing MC upgrades:

  • Pick a CAPA MC that can be tested against / is not currently being actively used by others (maybe grizzly). If no current MCs fit this then we may want to create a new ephemeral MC for testing this issue.
  • Create a new PR to cluster-aws that introduces a change to the control plane without changing any functionality. The easiest way I can think of to ensure this is to add a new empty file to the files property of the KubeadmControlPlane. This will cause only the control plane nodes to be replaced, with functionally identical config.

    See Introduce an empty file to force upgrade of control plane nodes cluster-aws#188

  • Disable Flux on the MC
  • Manually update the cluster App of the MC to the version built for the PR (see the sketch after this list).
  • Wait and see if the problem occurs.
  • Repeat as needed by rolling cluster-aws back to the GitOps-tracked version until a root cause is found.
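
A rough sketch of the manual steps above (the Flux Kustomization, namespace, App name and version are placeholders, not the real resource names):

# Stop GitOps from reverting the manual change
flux suspend kustomization <mc-cluster-kustomization> -n flux-system
# Point the MC's cluster App (Giant Swarm App CR) at the version built for the test PR;
# spec.catalog may also need pointing at the test catalog, depending on where the PR build is published
kubectl -n <org-namespace> patch apps.application.giantswarm.io <mc>-cluster --type merge \
  -p '{"spec":{"version":"<pr-build-version>"}}'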


tuladhar commented Jan 9, 2023

We (@tuladhar and @bdehri) upgraded grizzly and golem and didn't encounter this issue. Closing it for now; if we encounter the issue in the future we'll reopen and investigate further.

tuladhar closed this as completed Jan 9, 2023