
Auto detect CRI #1117

Closed
rosti opened this issue Sep 14, 2018 · 13 comments
Labels: area/ecosystem, area/UX, kind/feature, lifecycle/active, priority/important-longterm

rosti commented Sep 14, 2018

With CRIs other than Docker gaining popularity, we need to improve kubeadm's user experience by not requiring --cri-socket on every command when a runtime other than Docker is used.

Most of the other CRIs have well-known default paths for their Unix domain sockets, so detecting default installations of CRIs on the local machine should not be too difficult.

Currently kubeadm fails if --cri-socket is not specified and Docker is not installed on the system. But the bigger problem is that, if --cri-socket is forgotten and Docker is installed, the wrong CRI could be used and the operation could still succeed (as with kubeadm init).

/area UX
/area ecosystem
/kind feature

@k8s-ci-robot added the area/UX, area/ecosystem, and kind/feature labels Sep 14, 2018

rosti commented Sep 14, 2018

/assign

@neolit123 neolit123 added this to the v1.13 milestone Sep 14, 2018

fabriziopandini commented Sep 15, 2018

+1
Work was done over the past two cycles to lay the foundations for what you are suggesting (the CRI annotation on nodes). We can start from that and fix the other commands.

bart0sh commented Sep 17, 2018

@rosti

the bigger problem here is that, if --cri-socket is forgotten and Docker is installed, the wrong CRI could be used

How will automatic detection of the CRI runtime help solve this? If both Docker and another CRI runtime are running and kubeadm detects the CRI runtime, it still has to decide which one to use.

Currently --cri-socket is an explicit way for the user to point out which CRI runtime they want to use. If we remove it, we'll have to set clear priorities and let the user know about them somehow. Otherwise it would create more confusion than the current approach.


rosti commented Sep 17, 2018

@bart0sh I'm thinking of a simpler approach, something along the lines of:

  1. Detect all possible CRIs we can use.
  2. If precisely one is found, use it; otherwise bail out, asking for --cri-socket to be supplied.

The idea is not to cover all cases, but only the major case, where exactly one CRI is installed and it is not Docker.
Covering the case where more than one CRI is installed and Docker is one of them would be a good thing to do too.
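The two steps above could be sketched roughly like this. This is a hypothetical helper, not kubeadm's actual implementation, and the socket paths are illustrative defaults that vary by runtime version and distribution:

```go
package main

import (
	"fmt"
	"os"
)

// Well-known default CRI socket paths (illustrative only).
var knownSockets = []string{
	"/var/run/dockershim.sock",
	"/run/containerd/containerd.sock",
	"/var/run/crio/crio.sock",
}

// detectCRISocket returns the single detected socket, or an error if
// zero or more than one are found, in which case the caller should
// ask for an explicit --cri-socket flag. The exists func is injected
// so the check is easy to test.
func detectCRISocket(paths []string, exists func(string) bool) (string, error) {
	var found []string
	for _, p := range paths {
		if exists(p) {
			found = append(found, p)
		}
	}
	switch len(found) {
	case 1:
		return found[0], nil
	case 0:
		return "", fmt.Errorf("no CRI socket found; please supply --cri-socket")
	default:
		return "", fmt.Errorf("found multiple CRI sockets %v; please supply --cri-socket", found)
	}
}

func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	sock, err := detectCRISocket(knownSockets, fileExists)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("using CRI socket:", sock)
}
```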

bart0sh commented Sep 17, 2018

@rosti thanks for the explanation. That makes sense; I agree.

In case more than one runtime is detected, kubeadm can either bail out or simply ask the user which one to use.

bart0sh commented Sep 20, 2018

@rosti Another suggestion is to avoid using crictl when detecting the CRI. Using CRI API calls would be better in my opinion; it keeps kubeadm independent of crictl when using the default (Docker) runtime.

bart0sh commented Sep 20, 2018

@rosti I'd be happy to implement this. Just assign me to this issue if you want me to do it.


rosti commented Sep 20, 2018

@bart0sh I have done some work on it and will soon be ready to open a PR (it just needs tests written). It follows the points above and simply checks for the existence of several well-known Unix domain sockets; it doesn't check for the docker or crictl executables.
I'll ping you for a review when it's ready.

bart0sh commented Sep 20, 2018

@rosti Sure. One note here: you should probably consider checking whether the CRI API is actually accessible through the socket. Checking file existence is not enough, from my point of view.

@timothysc timothysc added the priority/important-longterm label Oct 30, 2018
@timothysc timothysc modified the milestones: v1.13, v1.14 Nov 14, 2018

rosti commented Jan 28, 2019

As the main PR is now merged, I think we should document this somewhere. Most importantly, we should document the sockets and CRIs we are able to detect as of 1.14.

@neolit123 et al, I think these are the possible places for that purpose:

Possibly somewhere else?

neolit123 commented

this seems like a good place to mention the auto detection:
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#initializing-your-master

@timothysc timothysc added the lifecycle/active label Feb 7, 2019

chuckha commented Feb 11, 2019

Big +1 from me, since cluster-api-provider-aws defaults to containerd now and it's not the nicest experience anymore.

One associated problem is that kubeadm prints a stack trace in addition to a message that is already sufficient for the user to debug. In this case kubeadm should print only the error message, not the trace.

E0211 14:59:36.568191    3305 reset.go:169] [reset] failed to remove containers: exec: "docker": executable file not found in $PATH
docker is required for container runtime
k8s.io/kubernetes/cmd/kubeadm/app/util/runtime.NewContainerRuntime
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/util/runtime/runtime.go:72
k8s.io/kubernetes/cmd/kubeadm/app/cmd.removeContainers
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/reset.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd.(*Reset).Run
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/reset.go:168
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdReset.func1
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/reset.go:71
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794
k8s.io/kubernetes/cmd/kubeadm/app.Run
	/workspace/anago-v1.13.2-beta.0.75+cff46ab41ff0bb/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48
main.main
	_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29
runtime.main
	/usr/local/go/src/runtime/proc.go:201
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1333

My hope is for this issue to also fix the inappropriate stack trace. I'll open a separate issue if that's preferred.
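The difference between the two outputs comes down to how the error is formatted when printed. This stdlib sketch is only illustrative of that distinction (kubeadm's actual code at the time used github.com/pkg/errors, whose %+v verb is what renders the stack trace; the function name below is hypothetical):

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// removeContainers is a hypothetical stand-in that fails the same way
// the reset.go code path does when docker is missing from $PATH.
func removeContainers() error {
	base := errors.New(`exec: "docker": executable file not found in $PATH`)
	return fmt.Errorf("failed to remove containers: %w", base)
}

func main() {
	if err := removeContainers(); err != nil {
		// Printing with %v yields only the message chain, which is
		// what the user needs to debug; dumping the call stack as
		// well adds noise without helping them.
		fmt.Fprintf(os.Stderr, "[reset] %v\n", err)
	}
}
```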


rosti commented Feb 12, 2019

This is now implemented and documented, so I'm closing it.

@rosti rosti closed this as completed Feb 12, 2019