TLS Handshake error with vault agent injector #275
Comments
Did you get this sorted?
Hi, any update on this? We can confirm it's happening with 0.16.0 deployed on AKS 1.21. Thanks!
Having the same issue, Vault 1.10, k8s 1.22.6 (RKE2).
The auto TLS certificate regenerates every 24 hours, which sounds like it is probably related to the problem. I'm having trouble reproducing this, even on the helm chart version 0.16.0. Are there any other steps you use to see this issue? What I am doing:

```shell
helm install vault vault --repo https://helm.releases.hashicorp.com \
  --version=0.16.0 \
  --set server.enabled=false \
  --set injector.enabled=true \
  --set "injector.externalVaultAddr=http://192.168.65.2:8200"
```

But I never see the failure mentioned after the certificate is refreshed.
We are seeing the same on the clusters we've upgraded; however, it seems less frequent on the two clusters where we have a 1-minute cronjob continuously deploying and deleting a pod that injects a secret. Running on GKE 1.21. Deployment is effectively:
I'm going to perform some tests on a k3d cluster to see if there's a pattern.
Sorry @swenson for not being precise. When I said 0.16.0 I was talking about the vault-k8s version. The configuration that failed was Helm chart vault-0.18.0 with Vault 1.9.6 and injector 0.16.0. This combination was failing in two different AKS clusters running K8s 1.21. After we downgraded the injector to 0.15.0, the error seems to be gone in both clusters (at least during the last week). Thanks!
We have the same issue and have had to revert back to 0.15.0.
Managed to re-create this using a k3d cluster locally. I created a cluster and deployed Vault + Vault Agent Injector, then set up a cronjob pulling Vault secrets every minute for over 24 hours with no issue. I stopped the cronjob and noted the time the certificate was last updated (15:58 UTC on the 22nd), waited until ~16:57 UTC on the 23rd (about 5 minutes ago), and ran a job from my cronjob.
Once the `time.NewTimer()` expires, calls to `timer.Stop()` will return `false`, but the channel will have nothing in it, causing `<-timer.C` to hang forever. This is hinted at by the docs, even though they suggest `timer.Stop()` should return true in that case. We change to a non-blocking drain so that we won't block forever. This manifests in never updating the certificate after it expires, causing TLS handshake errors. Fixes #275
I believe we have found the underlying cause for this and fixed it in the last few PRs. I think we'll cut a new release of |
Thank you!
Can confirm I also had the problems described above and that downgrading to 0.15 worked.
Still experiencing this problem: the Vault agent injector throws the error 'tls: bad certificate' every 24 hours. @swenson, in which version did you fix this bug?
I am running the Vault agent injector with auto TLS enabled and configured an external Vault server that is running on my host.
Everything was working fine; suddenly, after 24 hours, I started getting this bad certificate issue.
I have even tried using the `vault.hashicorp.com/tls-skip-verify` annotation, but the result is the same. These are the agent injector logs.
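For reference, the `vault.hashicorp.com/tls-skip-verify` annotation mentioned above is set on the workload's pod template; a minimal sketch (the deployment name, image, and secret path are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app   # hypothetical name
spec:
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        # Skips TLS verification between the injected agent and Vault.
        # As the comment above suggests, this does not help with this
        # issue, because the failing handshake is between the Kubernetes
        # API server and the injector webhook, not agent-to-Vault.
        vault.hashicorp.com/tls-skip-verify: "true"
        vault.hashicorp.com/agent-inject-secret-config: "secret/data/demo"  # hypothetical path
    spec:
      containers:
        - name: app
          image: nginx:alpine
```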