kubeadm join controlplane not pulling images and fails #1341

Closed
den-is opened this issue Jan 4, 2019 · 5 comments · Fixed by kubernetes/kubernetes#72870
Labels: area/HA, help wanted, kind/feature, lifecycle/active
Milestone: v1.14

Comments


den-is commented Jan 4, 2019

What keywords did you search in kubeadm issues before filing this one?

join preflight image pull abracadabra

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

Versions

kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:36:44Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version): 1.13.1
  • Cloud provider or hardware configuration: bare-metal VM
  • OS: CentOS 7.6 (1810)
  • Kernel (e.g. uname -a): Linux k8sm1.dev 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

What would you like to be added:
kubeadm join --experimental-control-plane does not pre-pull the control-plane images.
Long story short, it would be nice to have an image-pull step during the preflight phase of the control-plane join process.

Why is this needed:
kubeadm join 1.2.3.4:6443 --token xxx --discovery-token-ca-cert-hash yyy --experimental-control-plane fails if the node has no pre-pulled images, and in general it takes quite a long time to pull all of the large images the control plane requires. Even so, the process continues in the "background" and eventually finishes successfully. During that time the API is not accessible, and a lot of noise shows up in the kubelet and kube-apiserver logs (not included here; available on request).
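A quick way to see which of the required images are missing on the joining node before the failing run below is sketched here; it assumes Docker as the container runtime (matching the dockershim socket in the logs), and the grep pattern is only illustrative:

# list the images kubeadm expects for this release
kubeadm config images list

# see which of them are already cached locally (Docker assumed as the CRI)
docker images | grep -E 'kube-apiserver|kube-controller-manager|kube-scheduler|etcd|coredns|pause'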

root@k8sw1 ~# kubeadm join 10.1.1.42:6443 --token d890my.qocwruoh5o2jv99b --discovery-token-ca-cert-hash sha256:2a45ac8d75e639cecd06e84ab4174ed89185d3873782d11e80cca89ff6a85832 --experimental-control-plane
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "10.1.1.42:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.1.1.42:6443"
[discovery] Requesting info from "https://10.1.1.42:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.1.1.42:6443"
[discovery] Successfully established connection with API Server "10.1.1.42:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[join] Running pre-flight checks before initializing the new control plane instance
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8sw1.dev localhost] and IPs [10.1.1.41 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8sw1.dev localhost] and IPs [10.1.1.41 127.0.0.1 ::1]
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8sw1.dev kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.1.1.41 10.1.1.42]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Checking Etcd cluster health
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8sw1.dev" as an annotation
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet-check] Initial timeout of 40s passed.
error uploading configuration: Get https://10.1.1.42:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config: unexpected EOF

But everything works great when images are pre-pulled with kubeadm config images pull
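For completeness, the pre-pull step run on the joining node before retrying the join amounts to the single command below; the --kubernetes-version pin is optional and is shown here only as an assumption to match the cluster version:

# on the joining node, before kubeadm join
kubeadm config images pull --kubernetes-version v1.13.1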

root@k8sw1 ~# kubeadm join 10.1.1.42:6443 --token 81iitg.yvf55mqejaguje0s --discovery-token-ca-cert-hash sha256:2a45ac8d75e639cecd06e84ab4174ed89185d3873782d11e80cca89ff6a85832 --experimental-control-plane
 [preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "10.1.1.42:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.1.1.42:6443"
[discovery] Requesting info from "https://10.1.1.42:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.1.1.42:6443"
[discovery] Successfully established connection with API Server "10.1.1.42:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[join] Running pre-flight checks before initializing the new control plane instance
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8sw1.dev kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.1.1.41 10.1.1.42]
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8sw1.dev localhost] and IPs [10.1.1.41 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8sw1.dev localhost] and IPs [10.1.1.41 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Checking Etcd cluster health
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8sw1.dev" as an annotation
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8sw1.dev as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8sw1.dev as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Master label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
@neolit123 added the help wanted, area/HA, and kind/bug labels on Jan 4, 2019
@fabriziopandini (Member)

@MalloZup are you available to work on this issue?
Pre-pulling should be executed immediately after

[join] Running pre-flight checks before initializing the new control plane instance

Prior art: pre-pulling images in the kubeadm init workflow.


MalloZup commented Jan 7, 2019

@fabriziopandini thx, yep, I can have a look! ✈️ 🛩️ 🌞

@fabriziopandini (Member)

/lifecycle active

@k8s-ci-robot added the lifecycle/active label on Jan 7, 2019
@timothysc added the kind/feature label and removed the kind/bug label on Jan 7, 2019
@timothysc added this to the v1.14 milestone on Jan 7, 2019

MalloZup commented Jan 11, 2019

Do we have a scripted way to set up the two-node HA cluster up to the join point? I think I have a fix, but I need to test it against this kind of deployment. TIA.
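For reference, a minimal (not yet scripted) sequence that brings a stacked control-plane cluster up to the join point on v1.13 could look like the sketch below. The endpoint and hostname are placeholders borrowed from this issue, and the list of shared certificates to copy follows the kubeadm HA docs for 1.13; treat it as an assumption, not a tested script:

# on the first control-plane node
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.1
controlPlaneEndpoint: "10.1.1.42:6443"
EOF
kubeadm init --config kubeadm-config.yaml

# copy the shared certificates to the node that will join as a control plane
ssh root@k8sw1 mkdir -p /etc/kubernetes/pki/etcd
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} root@k8sw1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} root@k8sw1:/etc/kubernetes/pki/etcd/

# on k8sw1: pre-pull the images, then join as an additional control-plane node
kubeadm config images pull
kubeadm join 10.1.1.42:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --experimental-control-plane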

@MalloZup

@fabriziopandini CC kubernetes/kubernetes#72870.
This already works, but I pushed the PR as WIP to discuss some points. TIA for all info and help 🚀 🌻
