Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minikube "start" fails when using the "--cache-images" option #3250

Closed
mlgibbons opened this issue Oct 13, 2018 · 5 comments
Closed

Minikube "start" fails when using the "--cache-images" option #3250

mlgibbons opened this issue Oct 13, 2018 · 5 comments
Labels
co/virtualbox help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. os/windows priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/obsolete Bugs that no longer occur in the latest stable release

Comments

@mlgibbons
Copy link
Contributor

BUG REPORT
Environment Windows 10
Minikube version 0.30.0
OS NA
VM Driver VirtualBox
ISO version 0.30.0

What happened: Running "minikube start" with the "--cache-images" option hangs as it appears that image caching fails to complete.

Waiting for image caching to complete...
Moving files into cluster...
.......... hangs at this point

How to reproduce it (as minimally and precisely as possible):
"minikube start --cache-images"

Output of minikube logs (if applicable):
Adding the "-v 10" option to the start command results in the following error being displayed:

Waiting for image caching to complete...
E1013 14:36:00.778091 18324 start.go:247] Error caching images: Caching images for kubeadm: caching images: caching image C:\Users\Mark.minikube\cache\images\k8s.gcr.io\pause-amd64_3.1: getting destination path: parsing docker archive dst ref: replace a Win drive letter to a volume name: CreateFile \?\Volume{806aa746-d64c-11e5-acef-806e6f6e6963}\Users\Mark.minikube\cache\images\k8s.gcr.io: The system cannot find the path specified.
Moving files into cluster...

Anything else do we need to know:

@mlgibbons
Copy link
Contributor Author

mlgibbons commented Oct 13, 2018

This seems to be due to the code not mapping the Windows drive letter correctly and is somehow related to the case of it; specifically when a lowercase drive letter is used in one of the env variables that is when the problem occurs. I recall coming across this problem a little while ago while testing the caching changes.

Looking at the code it seems like HOME, USERPROFILE and MINIKUBE_HOME as used to determine the path for the cache but I'm unclear on the exact logic.

Having tried some experiments with different setting for those env variables I don't have a totally clear answer on the cause but the results were:

  • Default No change = error
    ?\Volume{806aa746-d64c-11e5-acef-806e6f6e6963}\Users\Mark.minikube\cache\images\k8s.gcr.io

  • If HOMEDRIVE is set using the lowercase drive letter the result is shown below. While not correct (the images are now cached under the .minikube.cache dir as expected and we see "c_" instead of "c:" which does not bode well) it does now cache the images althought they are now under "C:\Users<user>\c_\Users<user>.minikube\cache\images.

  • unset HOME and HOMEDRIVE and explicitly set MINIKUBE_HOME with uppercase drive letter
    Works better and puts files in the right location but fails on one of the files

I1013 16:10:06.190145 16096 ssh_runner.go:57] Run: sudo rm -f /etc/docker/ca.pem
I1013 16:10:06.199945 16096 ssh_runner.go:57] Run: sudo mkdir -p /etc/docker
I1013 16:10:06.218493 16096 ssh_runner.go:57] Run: sudo rm -f /etc/docker/server.pem
I1013 16:10:06.228256 16096 ssh_runner.go:57] Run: sudo mkdir -p /etc/docker
I1013 16:10:06.247748 16096 ssh_runner.go:57] Run: sudo rm -f /etc/docker/server-key.pem
I1013 16:10:06.260440 16096 ssh_runner.go:57] Run: sudo mkdir -p /etc/docker
Getting VM IP address...
Waiting for image caching to complete...
E1013 16:10:09.675692 16096 start.go:247] Error caching images: Caching images for kubeadm: caching images: caching image C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-proxy-amd64_v1.11.0: getting destination path: parsing docker archive dst ref: replace a Win drive letter to a volume name: CreateFile \?\Volume{806aa746-d64c-11e5-acef-806e6f6e6963}\User\Mark.minikube\cache\images\k8s.gcr.io: The system cannot find the path specified.
Moving files into cluster...
I1013 16:10:09.677646 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\gcr.io\k8s-minikube\storage-provisioner_v1.8.1
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\etcd-amd64_3.1.12
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\k8s-dns-kube-dns-amd64_1.14.8
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-scheduler-amd64_v1.11.0
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-controller-manager-amd64_v1.11.0 I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-apiserver-amd64_v1.11.0
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\pause-amd64_3.1
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-proxy-amd64_v1.11.0
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\k8s-dns-dnsmasq-nanny-amd64_1.14.8
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\k8s-dns-sidecar-amd64_1.14.8
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kubernetes-dashboard-amd64_v1.10.0
I1013 16:10:09.678620 16096 cache_images.go:197] Loading image from cache at C:\User\Mark.minikube\cache\images\k8s.gcr.io\kube-addon-manager_v8.6

So, in summary, I'm not quite sure what is going on yet but it seems to be due to the drive to volume mapping and I suspect strongly that it is influenced by one of the env variables having a lowercase drive letter.

I will endeavour to figure it out but it is a painful process as the mk VM takes ages to boot when using VirtualBox and I've no idea why that is.

@mlgibbons
Copy link
Contributor Author

mlgibbons commented Oct 14, 2018

It appears that the problem related to lowercase drive letters is caused by this piece of code in cache_images.go as it does not recognise drive letters in lower case. This should be a simple change of adding the set of lowercase lettersto the current set of uppercase ones.

func hasWindowsDriveLetter(s string) bool {
	if len(s) < 3 {
		return false
	}

	drive := s[:3]
	for _, b := range "CDEFGHIJKLMNOPQRSTUVWXYZAB" {
		if d := string(b) + ":"; drive == d+`\` || drive == d+`/` {
			return true
		}
	}

	return false
}

** If the drive letters in the env are changed to uppercase then all of the images - bar one - download **

The image which then fails in downloading is the the storage provisioner. This is because this image is in a dir with two levels while other are just in a single level dir i.e. ...\cache\images\gcr.io\k8s-minikube vs ...\cache\images\k8s.gcr.io.

This means that the code in replaceWinDriveLetterToVolumeName() which stats the dir fails - it works fine for images which are not in nested dirs.

func replaceWinDriveLetterToVolumeName(s string) (string, error) {
	vname, err := getWindowsVolumeName(s[:1])
	if err != nil {
		return "", err
	}
	path := vname + s[3:]
	if _, err := os.Stat(filepath.Dir(path)); err != nil {
		return "", err
	}

	return path, nil
}

** If the dir is manually created then all the images are downloaded correctly and the cluster starts up fine **.

Since the full image cache dir structure is created in CacheImage after the path is calculated the fix would seem to be to simply remove the check in replaceWinDriveLetterToVolumeName.

From CacheImage:

....
	if err := os.MkdirAll(filepath.Dir(dstPath), 0777); err != nil {
		return errors.Wrapf(err, "making cache image directory: %s", dst)
	}
....

I shall make the relevant changes and test once I've got a build env setup sometime later this week.

@mlgibbons
Copy link
Contributor Author

I've now made the changes and raised a PR for them.

@tstromberg tstromberg added os/windows co/virtualbox kind/bug Categorizes issue or PR as related to a bug. labels Oct 30, 2018
@tstromberg
Copy link
Contributor

Can anyone confirm whether or not --cache-images now works on Windows?

@tstromberg tstromberg added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jan 23, 2019
@tstromberg tstromberg added the triage/obsolete Bugs that no longer occur in the latest stable release label Feb 19, 2019
@ghd258
Copy link

ghd258 commented Oct 21, 2019

PS C:\Users\Administrator> minikube.exe start --registry-mirror=https://h35mk3lp.mirror.aliyuncs.com --vm-driver="hyperv" --memory=4096 --hyperv-virtual-switch="MinikubeSwitch"

  • minikube v1.4.0 on Microsoft Windows 10 Enterprise 10.0.18362 Build 18362
  • Creating hyperv VM (CPUs=2, Memory=4096MB, Disk=20000MB) ...
    E1021 17:31:33.381591 14408 cache_images.go:79] CacheImage k8s.gcr.io/kube-scheduler:v1.16.0 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\kube-scheduler_v1.16.0 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.390593 14408 cache_images.go:79] CacheImage k8s.gcr.io/kube-apiserver:v1.16.0 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\kube-apiserver_v1.16.0 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.399594 14408 cache_images.go:79] CacheImage gcr.io/k8s-minikube/storage-provisioner:v1.8.1 -> C:\Users\Administrator.minikube\cache\images\gcr.io\k8s-minikube\storage-provisioner_v1.8.1 failed: fetching image: Get https://gcr.io/v2/: dial tcp 108.177.97.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.496596 14408 cache_images.go:79] CacheImage k8s.gcr.io/pause:3.1 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\pause_3.1 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.527595 14408 cache_images.go:79] CacheImage k8s.gcr.io/kube-controller-manager:v1.16.0 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\kube-controller-manager_v1.16.0 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.533592 14408 cache_images.go:79] CacheImage k8s.gcr.io/etcd:3.3.15-0 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\etcd_3.3.15-0 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.587590 14408 cache_images.go:79] CacheImage k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.13 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\k8s-dns-sidecar-amd64_1.14.13 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.595591 14408 cache_images.go:79] CacheImage k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.13 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\k8s-dns-kube-dns-amd64_1.14.13 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.672594 14408 cache_images.go:79] CacheImage k8s.gcr.io/coredns:1.6.2 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\coredns_1.6.2 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.714591 14408 cache_images.go:79] CacheImage k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\k8s-dns-dnsmasq-nanny-amd64_1.14.13 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    E1021 17:31:33.731594 14408 cache_images.go:79] CacheImage k8s.gcr.io/kube-addon-manager:v9.0.2 -> C:\Users\Administrator.minikube\cache\images\k8s.gcr.io\kube-addon-manager_v9.0.2 failed: fetching image: Get https://k8s.gcr.io/v2/: dial tcp 64.233.189.82:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
co/virtualbox help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. os/windows priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/obsolete Bugs that no longer occur in the latest stable release
Projects
None yet
Development

No branches or pull requests

3 participants