Created based on incorrect behavior noticed while trying to reproduce rancher/rke2#5949. This was also reported in rancher/rke2#2101, but I failed to properly investigate the behavior at the time.
This cannot be reproduced on K3s; it only affects RKE2 where the apiserver is not colocated with the supervisor.
If the server successfully returns agent config, but fails to generate agent certificates, `config.get` will be retried, which results in `proxy.SetAPIServerPort` being called multiple times (k3s/pkg/agent/config/config.go, lines 375 to 380 at 79ba10f):
```go
		return nil, errors.Wrapf(err, "failed to setup access to API Server port %d on at %s", controlConfig.HTTPSPort, proxy.SupervisorURL())
	}
}
```
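For illustration, here is a minimal, runnable sketch of why the retry re-invokes the port setup; `fakeProxy` and the simplified `get` are hypothetical stand-ins for the real supervisor proxy and `config.get`, not the actual k3s retry loop:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// fakeProxy stands in for the agent's supervisor proxy; it only counts calls.
type fakeProxy struct{ calls int }

func (p *fakeProxy) SetAPIServerPort(port int) error {
	p.calls++
	fmt.Printf("SetAPIServerPort(%d) called %d time(s)\n", port, p.calls)
	return nil
}

// get mimics config.get: the proxy setup succeeds, but the certificate
// request that follows fails, so the caller retries the whole function.
func get(p *fakeProxy) error {
	if err := p.SetAPIServerPort(6443); err != nil {
		return err
	}
	return errors.New("get serving-kubelet.crt: 503 Service Unavailable")
}

func main() {
	p := &fakeProxy{}
	for i := 0; i < 3; i++ {
		if err := get(p); err != nil {
			fmt.Println("Waiting to retrieve agent configuration; server is not ready:", err)
			time.Sleep(10 * time.Millisecond) // the real agent waits several seconds between attempts
		}
	}
	// SetAPIServerPort runs once per attempt, even though the proxy was
	// already reconfigured successfully on the first call.
}
```

Every retry therefore repeats the proxy reconfiguration, which would be harmless if `SetAPIServerPort` were idempotent, but it is not, as shown below.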
When called multiple times, the agent will actually bypass the loadbalancer and instead use the server directly. This is caused by a bug in `SetAPIServerPort` (k3s/pkg/agent/proxy/apiproxy.go, lines 134 to 156 at 79ba10f).
During the first call, `p.apiServerURL` is temporarily set to the default server URL, but then, because `p.lbEnabled && p.apiServerLB == nil` is true, `p.apiServerURL` is set to the loadbalancer address. On subsequent calls `p.apiServerLB` is not nil, so the temporary assignment is left in place, which causes the kubeconfig for various components to be generated pointing directly at the server URL instead of the loadbalancer URL.
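To make that sequence concrete, here is a condensed, runnable model of the flow described above; the `proxy` struct, `loadBalancer` type, and `newAPIServerLB` helper are illustrative stand-ins rather than the real apiproxy.go API:

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// loadBalancer stands in for the agent loadbalancer; it only remembers its local URL.
type loadBalancer struct{ localURL string }

func (lb *loadBalancer) LocalURL() string { return lb.localURL }

func newAPIServerLB(serverURL string) (*loadBalancer, error) {
	// The real loadbalancer listens locally (127.0.0.1:6443 in the logs below)
	// and forwards connections to serverURL.
	return &loadBalancer{localURL: "https://127.0.0.1:6443"}, nil
}

type proxy struct {
	supervisorURL string
	apiServerURL  string
	lbEnabled     bool
	apiServerLB   *loadBalancer
}

func (p *proxy) SetAPIServerPort(port int) error {
	u, err := url.Parse(p.supervisorURL)
	if err != nil {
		return err
	}
	u.Host = net.JoinHostPort(u.Hostname(), strconv.Itoa(port))

	// The "temporary" assignment: apiServerURL now points directly at the server.
	p.apiServerURL = u.String()

	// Only the first call takes this branch and swaps in the loadbalancer URL;
	// on later calls apiServerLB is already non-nil, so the direct URL sticks.
	if p.lbEnabled && p.apiServerLB == nil {
		lb, err := newAPIServerLB(p.apiServerURL)
		if err != nil {
			return err
		}
		p.apiServerLB = lb
		p.apiServerURL = lb.LocalURL()
	}
	return nil
}

func main() {
	p := &proxy{supervisorURL: "https://172.17.0.8:9345", lbEnabled: true}

	p.SetAPIServerPort(6443)
	fmt.Println(p.apiServerURL) // https://127.0.0.1:6443 (the loadbalancer)

	p.SetAPIServerPort(6443)
	fmt.Println(p.apiServerURL) // https://172.17.0.8:6443 (the server, bypassing the loadbalancer)
}
```

Anything that reads the proxy's apiserver URL after the second call gets the direct server address, which is how the component kubeconfigs end up pointing at the server rather than the loadbalancer.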
This can be seen with some additional debug logging:
```
INFO[0000] Starting rke2 agent v1.30.1+dev.d40e03c0 (d40e03c0b9a2ad9bd56d147272567c280278cf06)
INFO[0000] Adding server to load balancer rke2-agent-load-balancer: 172.17.0.8:9345
INFO[0000] Running load balancer rke2-agent-load-balancer 127.0.0.1:6444 -> [172.17.0.8:9345] [default: 172.17.0.8:9345]
DEBU[0000] Supervisor proxy started with supervisor=https://127.0.0.1:6444 apiserver=https://127.0.0.1:6444 lb=true
WARN[0000] Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation.
INFO[0000] Adding server to load balancer rke2-api-server-agent-load-balancer: 172.17.0.8:6443
INFO[0000] Running load balancer rke2-api-server-agent-load-balancer 127.0.0.1:6443 -> [172.17.0.8:6443] [default: 172.17.0.8:6443]
DEBU[0000] Supervisor proxy apiserver port changed; apiserver=https://127.0.0.1:6443 lb=true
INFO[0000] Waiting to retrieve agent configuration; server is not ready: get /var/lib/rancher/rke2/agent/serving-kubelet.crt: https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt: 503 Service Unavailable
DEBU[0007] Supervisor proxy apiserver port changed; apiserver=https://172.17.0.8:6443 lb=true
INFO[0007] Waiting to retrieve agent configuration; server is not ready: get /var/lib/rancher/rke2/agent/serving-kubelet.crt: https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt: 503 Service Unavailable
DEBU[0015] Supervisor proxy apiserver port changed; apiserver=https://172.17.0.8:6443 lb=true
INFO[0015] Waiting to retrieve agent configuration; server is not ready: get /var/lib/rancher/rke2/agent/serving-kubelet.crt: https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt: 503 Service Unavailable
DEBU[0021] Supervisor proxy apiserver port changed; apiserver=https://172.17.0.8:6443 lb=true
INFO[0022] Waiting to retrieve agent configuration; server is not ready: get /var/lib/rancher/rke2/agent/serving-kubelet.crt: https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt: 503 Service Unavailable
DEBU[0027] Supervisor proxy apiserver port changed; apiserver=https://172.17.0.8:6443 lb=true
INFO[0029] Using private registry config file at /etc/rancher/rke2/registries.yaml
```