Failed to set up pod network because token expired due to bound service account token #852

Closed
snowmansora opened this issue May 25, 2022 · 14 comments

@snowmansora

What happened:
The pod is stuck at ContainerCreating and the following Unauthorized error is shown when describing it:

  Warning  FailedCreatePodSandBox  9m38s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "779db7fa6529acbdf3db228bc81a1f11e070e31c1d15354645e79f04631cb663" network for pod "ops-centos-8ff66846f-hff26": networkPlugin cni failed to set up pod "ops-centos-8ff66846f-hff26_default" network: Multus: [default/ops-centos-8ff66846f-hff26]: error getting pod: Unauthorized, failed to clean up sandbox container "779db7fa6529acbdf3db228bc81a1f11e070e31c1d15354645e79f04631cb663" network for pod "ops-centos-8ff66846f-hff26": networkPlugin cni failed to teardown pod "ops-centos-8ff66846f-hff26_default" network: Multus: [default/ops-centos-8ff66846f-hff26]: error getting pod: Unauthorized]

What you expected to happen:
Multus sets up the pod network successfully and the pod can run.

Anything else we need to know?:
Bound service account token is turned on by default in K8s 1.21: https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/1205-bound-service-account-tokens/README.md
In previous versions of K8s, the service account token does not have an expiration.
In K8s 1.21, the token expires after 1 year, or after 1 hour if service-account-extend-token-expiration is false.
Looking at the Multus source code, I think the token is never updated after the Multus pod finishes its initial setup in entrypoint.sh.
Restarting the Multus pod fixes the problem, because a new token is then used.
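To illustrate the pattern (a simplified sketch, assuming the usual projected token path; this is not the exact entrypoint.sh contents, and the host path and variable names are illustrative):

    # Read the projected service account token once, at Multus pod startup.
    SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount
    SERVICE_ACCOUNT_TOKEN=$(cat ${SA_DIR}/token)

    # Bake the token into a kubeconfig on the host for later CNI invocations.
    cat > /host/etc/cni/net.d/multus.d/multus.kubeconfig <<EOF
    apiVersion: v1
    kind: Config
    clusters:
    - name: local
      cluster:
        server: https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}
        certificate-authority-data: $(base64 -w0 ${SA_DIR}/ca.crt)
    users:
    - name: multus
      user:
        token: "${SERVICE_ACCOUNT_TOKEN}"
    contexts:
    - name: multus-context
      context:
        cluster: local
        user: multus
    current-context: multus-context
    EOF
    # The embedded token string is static: once the bound token expires
    # (about 1 hour when service-account-extend-token-expiration=false),
    # every CNI call that reads this kubeconfig gets Unauthorized until
    # the file is regenerated.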

How to reproduce it (as minimally and precisely as possible):
On a K8s cluster with the API server argument service-account-extend-token-expiration set to false:
After the Multus pod has been running for an hour, create a new pod; it will be stuck at ContainerCreating.
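To confirm on an affected node that it is the embedded token that has gone stale (the kubeconfig path and daemonset name below are the common defaults and may differ in your deployment):

    # Exercise the generated kubeconfig directly; once the bound token has
    # expired this fails with 401 Unauthorized, matching the CNI error above.
    kubectl --kubeconfig /etc/cni/net.d/multus.d/multus.kubeconfig get pods -n default

    # Workaround from the report: restart the Multus pods so the kubeconfig is
    # regenerated with a fresh token (daemonset name/namespace may differ).
    kubectl -n kube-system rollout restart daemonset kube-multus-ds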

Environment:

  • Multus version: ghcr.io/k8snetworkplumbingwg/multus-cni:v3.8
  • Kubernetes version (use kubectl version): 1.21.5
  • Primary CNI for Kubernetes cluster: Flannel
  • OS (e.g. from /etc/os-release): CentOS 7
@snowmansora
Author

Also, I noticed there was an attempt to refresh the token, https://github.com/k8snetworkplumbingwg/multus-cni/pull/686/files, but it was never completed or merged to master.

Is there any plan to fix this issue? Multus now needs to refresh the token, or it can potentially fail after an hour of running when service-account-extend-token-expiration is set to false.

Thanks.

@Alan01252

aws/amazon-vpc-cni-k8s#1868 (comment)

We've also experienced this and it caused much confusion for a while.

@dougbtv
Member

dougbtv commented Jun 23, 2022

Yeah, this is definitely an issue that can occur!

Let me try to resurrect #686 and see if I can get that in there.

In the feature/multus-4.0 branch we have the "thick plugin" architecture, which will account for this with in-pod kube auth, but we should also have it work in the current version when certs are rotated.
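For context, the difference is roughly this (a conceptual sketch, not actual Multus code): a long-running in-pod daemon can re-read the kubelet-refreshed projected token on every API call instead of caching it in a kubeconfig at startup.

    # Read the projected token at request time; the kubelet keeps this file
    # up to date for bound tokens, so the request never uses an expired token.
    SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount
    curl -sS --cacert "${SA_DIR}/ca.crt" \
         -H "Authorization: Bearer $(cat ${SA_DIR}/token)" \
         "https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/namespaces/default/pods"

(client-go's in-cluster config reloads this token file as well, so Go daemons using it pick up rotated tokens automatically.)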

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@JornShen

JornShen commented Oct 12, 2022

I see that PR #686 has been closed.
Do you plan to merge it to master? At this time, Multus 4.0 does not have a released version.
@dougbtv

@infinitydon

@dougbtv - Any update on which version has this fix? This is also affecting whereabouts.

@xagent003

Bump... any ideas why this PR was closed? #686

@dougbtv dougbtv reopened this Dec 21, 2022
@dougbtv
Member

dougbtv commented Dec 21, 2022

Yeah, it should be open, it just went auto stale.

@dougbtv
Member

dougbtv commented Dec 21, 2022

Also, for what it's worth, I think with the thick plugin architecture, it shouldn't be such a big deal, that is, it should be using the service account token in the pod, and not a generated kubeconfig that resides on disk, so... It should just use the updated token.

@caribbeantiger

Also, for what it's worth, I think with the thick plugin architecture, it shouldn't be such a big deal, that is, it should be using the service account token in the pod, and not a generated kubeconfig that resides on disk, so... It should just use the updated token.

Has anybody tried the thick plugin architecture in EKS already?

@github-actions github-actions bot removed the Stale label Dec 22, 2022
@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@MarkSpencerTan

Bumping this again since it seems like it went stale. Any updates?

@infinitydon

For EKS, the thick plugin is now available; so far the token issue seems to be resolved by using it.

https://github.com/aws/amazon-vpc-cni-k8s/blob/master/config/multus/v4.0.2-eksbuild.1/multus-daemonset-thick.yml
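For anyone trying it, applying the linked manifest is a plain kubectl apply (the raw URL below is assumed from the blob path above; check it against your Multus/EKS versions first):

    # Fetch and apply the EKS thick-plugin daemonset referenced above.
    curl -LO https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/multus/v4.0.2-eksbuild.1/multus-daemonset-thick.yml
    kubectl apply -f multus-daemonset-thick.yml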

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 10, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 18, 2024