Cluster CA certificate is not trusted #85

Closed
rdlaitila opened this issue Feb 22, 2022 · 7 comments
Labels
kind/bug Something isn't working size/S

Comments

@rdlaitila

module version: v3.1.0
k3s version: v1.23.3+k3s1

When cluster_domain is set, subsequent server nodes fail to join the cluster with this error:

Feb 22 11:56:41 n02 k3s[3988]: time="2022-02-22T11:56:41-05:00" level=info msg="Starting k3s v1.23.3+k3s1 (5fb370e5)"
Feb 22 11:56:41 n02 k3s[3988]: time="2022-02-22T11:56:41-05:00" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Feb 22 11:56:41 n02 k3s[3988]: time="2022-02-22T11:56:41-05:00" level=info msg="Managed etcd cluster not yet initialized"
Feb 22 11:56:41 n02 k3s[3988]: time="2022-02-22T11:56:41-05:00" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Feb 22 11:56:41 n02 k3s[3988]: time="2022-02-22T11:56:41-05:00" level=fatal msg="starting kubernetes: preparing server: failed to validate server configuration: critical configuration value mismatch"
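
The warning in the log above refers to the full token from the server's node-token file. As a rough sketch for checking which kind of token a joining node was given, the snippet below assumes the documented k3s full-token layout `K10<CA hash>::<user>:<password>` and the default token path on the first server. The helper name `has_ca_hash` and the `TOKEN_FILE` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the token appears to carry a CA hash.
# Assumption: a full token looks like K10<ca-hash>::<user>:<password>.
has_ca_hash() {
  case "$1" in
    K10?*::?*:?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed default location of the full token on the first server node.
TOKEN_FILE="${TOKEN_FILE:-/var/lib/rancher/k3s/server/node-token}"

# Only inspect the file when it exists and is readable (usually requires root).
if [ -r "$TOKEN_FILE" ]; then
  if has_ca_hash "$(cat "$TOKEN_FILE")"; then
    echo "full token: CA hash present"
  else
    echo "short token: no CA hash"
  fi
fi
```

A short token only explains the two warnings. The fatal "critical configuration value mismatch" line is a separate check, so this sketch does not by itself diagnose the join failure.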

I tried adding tls-san entries through additional global flags, but it does not help:

"--tls-san ${var.my_cluster_domain}",
"--tls-san cluster.local",

If I remove cluster_domain from the module altogether, the cluster bootstraps successfully with the default cluster domain, cluster.local.

Is there something I'm missing? Thanks!

@xunleii
Owner

xunleii commented Feb 22, 2022

Hi @rdlaitila, thanks for your question.

I haven't used it with the latest k3s release. If possible, could you test it with v1.21.x? Something has probably changed since the last time I used it.

EDIT: I didn't have time to test it this week, but I will try to look into this issue this weekend.

@rdlaitila
Author

@xunleii Thanks for the reply

I did try with v1.20.14+k3s2 and v1.21.9+k3s and experienced the same issue. I've been poking at the cert generation code and don't see any glaring issues. I will continue to poke as best I can. Thanks for taking a peek!

@Meallia

Meallia commented Feb 26, 2022

I'd suggest trying with "'\\--tls-san ${var.my_cluster_domain}'"

I had a similar problem where the first global flag was ignored for some obscure reason (the first hyphen was not read by the k3s-agent command). Adding the quotes and escaping the first hyphen fixed it.

@jsbrain

jsbrain commented Mar 25, 2022

This is a real issue. For me, specifying:

  global_flags = [
    "--tls-san ${var.my_cluster_ip}"
  ]

results in the first hyphen being removed. Checking journalctl -u k3s-agent.service -n 100 shows:

...
Incorrect Usage: flag provided but not defined: -tls-san
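
One way to see whether the hyphen is already missing in the generated unit file, rather than being dropped at runtime, is to scan the unit text for long options written with a single hyphen. This is only a sketch: `find_single_hyphen_flags` is a hypothetical helper, and it assumes the flags appear as whitespace-separated words, optionally wrapped in quotes.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print every word that looks
# like a long option written with a single leading hyphen (e.g. -tls-san).
find_single_hyphen_flags() {
  awk -v q="'" '{
    for (i = 1; i <= NF; i++) {
      tok = $i
      gsub("^[" q "\"]+", "", tok)   # drop leading quotes
      gsub("[" q "\"]+$", "", tok)   # drop trailing quotes
      # one hyphen, a letter, then at least one more character
      if (tok ~ /^-[a-z][a-z0-9-]+/) print tok
    }
  }'
}

# Usage on the k3s host (unit name taken from the journalctl call above):
#   systemctl cat k3s-agent.service | find_single_hyphen_flags
```

If this prints nothing for the unit file but the journal still reports `-tls-san`, the hyphen is being lost after the unit is written, which would point away from the module.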

This is super weird, I checked the code in the repo, even "logged" out some commands but couldn't find the reason why it would just remove one - just like that.

Super weird. If it weren't for this issue, I probably would've gone crazy.

Thank you @Meallia, your workaround works great, but this should really get fixed.

@Meallia

Meallia commented Mar 26, 2022

> This is super weird, I checked the code in the repo, even "logged" out some commands but couldn't find the reason why it would just remove one - just like that.

I also checked the code and I'm rather confident the issue is not related to this repo but comes from some strange interaction between k3s and systemd.

I think this is the issue that made me try this workaround: k3s-io/k3s#1125

@xunleii
Owner

xunleii commented Apr 24, 2022

Sorry for this long absence, I didn't have much time these last months.

Thanks @Meallia for your workaround. I will create a PR to add it.

EDIT: after reading k3s-io/k3s#1125 (comment), adding \ before the flag seems to cause the flag to be ignored. Also, I tried on my end with Ubuntu 20.04 but could not reproduce this issue. What OS are you using to host k3s?
Another weird thing is that only the first entry in global_flags is ignored, even though it is not the first flag in the systemd service (--node-ip should be the first one). I will do some testing to see whether it could be related to Terraform and how this module uses HCL. If you encounter this problem again, can you provide the OS of the k3s host, the module version, and your Terraform version? Thanks for your help.
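
For anyone hitting this again, the requested details can be gathered roughly as follows. This is a sketch: `collect_env` is a hypothetical helper, and it assumes /etc/os-release exists on the k3s host and that the `k3s` and `terraform` binaries are on the PATH of the machine where it runs. The module version is not detected here. It is the `version` value in your Terraform module block.

```shell
#!/bin/sh
# Hypothetical helper: print the environment details requested above.
collect_env() {
  # OS of the k3s host; sourced in a subshell so no variables leak out.
  if [ -r /etc/os-release ]; then
    ( . /etc/os-release; echo "os: ${PRETTY_NAME:-unknown}" )
  fi
  # k3s version, if k3s is installed on this machine.
  if command -v k3s >/dev/null 2>&1; then
    k3s --version | head -n 1
  fi
  # Terraform version, if terraform is installed on this machine.
  if command -v terraform >/dev/null 2>&1; then
    terraform version | head -n 1
  fi
  return 0
}

collect_env
```

The k3s host and the Terraform workstation are usually different machines, so this would typically be run once on each.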

@xunleii xunleii added size/S kind/bug Something isn't working labels Apr 24, 2022
@xunleii
Owner

xunleii commented Oct 22, 2023

Further to my last reply, I am closing this issue.
Please do not hesitate to reopen it if you encounter this problem again.

@xunleii xunleii closed this as completed Oct 22, 2023