Skaffold dev stopped loading built images into local kind cluster since 1.17.1 #5159
Comments
Hi @taisph — could you please rerun with |
And could you please include your ~/.skaffold/config too? |
Using your $ skaffold dev -vdebug
[...]
Tags used in deployment:
- project-service-app -> project-service-app:16bb174b9a147d3f574fb5fe967b7f5c873a0150182dbb0f72d1fb2fffd69a12
DEBU[0000] Local images can't be referenced by digest.
They are tagged and referenced by a unique, local only, tag instead.
See https://skaffold.dev/docs/pipeline-stages/taggers/#how-tagging-works
DEBU[0000] getting client config for kubeContext: `kind-project`
DEBU[0000] could not parse date ""
Loading images into kind cluster nodes...
- project-service-app:16bb174b9a147d3f574fb5fe967b7f5c873a0150182dbb0f72d1fb2fffd69a12 -> DEBU[0000] Running command: [kubectl --context kind-project get nodes -ojsonpath='{@.items[*].status.images[*].names[*]}']
DEBU[0000] Command output: ['k8s.gcr.io/etcd:3.4.13-0 k8s.gcr.io/kube-proxy:v1.19.1 docker.io/kindest/kindnetd:v20200725-4d6bea59 k8s.gcr.io/kube-apiserver:v1.19.1 k8s.gcr.io/kube-controller-manager:v1.19.1 k8s.gcr.io/kube-scheduler:v1.19.1 k8s.gcr.io/build-image/debian-base:v2.1.0 k8s.gcr.io/coredns:1.7.0 docker.io/rancher/local-path-provisioner:v0.0.14 docker.io/library/project-service-app:16bb174b9a147d3f574fb5fe967b7f5c873a0150182dbb0f72d1fb2fffd69a12 k8s.gcr.io/pause:3.3']
Found
Images loaded in 79.888654ms
Starting deploy...
[...] The other thing to check: what version of $ kind version
kind v0.9.0 go1.15.2 darwin/amd64 |
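The debug log above shows the check that precedes loading: the node image list is queried through kubectl, and the freshly built tag is looked up in it, where it appears under its `docker.io/library/` name. A minimal Go sketch of that lookup follows. `parseNodeImages` and `imageLoaded` are hypothetical helpers written for illustration, not Skaffold functions. The `docker.io/` prefix handling is an assumption based on the log output, and the tags are shortened.

```go
package main

import (
	"fmt"
	"strings"
)

// parseNodeImages splits the space-separated image names printed by
//   kubectl get nodes -ojsonpath='{@.items[*].status.images[*].names[*]}'
// and drops the single quotes that surround the output.
// Hypothetical helper, not part of Skaffold.
func parseNodeImages(output string) []string {
	trimmed := strings.Trim(strings.TrimSpace(output), "'")
	return strings.Fields(trimmed)
}

// imageLoaded reports whether tag is present on the node, either verbatim or
// under the Docker Hub prefixes seen in the log (assumption for this sketch).
func imageLoaded(nodeImages []string, tag string) bool {
	candidates := map[string]bool{
		tag:                        true,
		"docker.io/" + tag:         true,
		"docker.io/library/" + tag: true,
	}
	for _, img := range nodeImages {
		if candidates[img] {
			return true
		}
	}
	return false
}

func main() {
	out := "'k8s.gcr.io/pause:3.3 docker.io/library/project-service-app:16bb174b'"
	images := parseNodeImages(out)

	fmt.Println(imageLoaded(images, "project-service-app:16bb174b")) // true
	fmt.Println(imageLoaded(images, "project-service-app:cab7ba6b")) // false
}
```

A "Found" result like the one in the log corresponds to the first call; the second call is the case where the image would still need to be loaded.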
$ skaffold dev -vdebug
[...]
Tags used in deployment:
- project-service-app -> project-service-app:cab7ba6bef78752bd4a887d1930a1da1b2811d5e50eb8c7aa7397bfd07ffba5a
DEBU[0060] Local images can't be referenced by digest.
They are tagged and referenced by a unique, local only, tag instead.
See https://skaffold.dev/docs/pipeline-stages/taggers/#how-tagging-works
DEBU[0060] getting client config for kubeContext: `kind-project`
Starting deploy...
[...]
$ cat ~/.skaffold/config
global:
  survey:
    last-prompted: "2020-10-29T14:50:03+01:00"
kubeContexts: []
$ kind version
kind v0.9.0 go1.14.7 linux/amd64
$ skaffold version
v1.17.2 |
Seems like the kubectl current context must match the context specified in deploy.kubeContext in the skaffold config since 1.17.1. If kubectl context is different, images are not loaded. |
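To make the observation above concrete, here is a small Go sketch of the behaviour one would expect: the decision to load images into kind is based on the context Skaffold is configured to deploy to, not on whatever kubectl currently points at. Both functions are hypothetical and only illustrate the reported symptom; they are not taken from the Skaffold source. The precedence order (CLI flag, then `deploy.kubeContext`, then the kubectl current context) and the `kind-` context prefix are stated as assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// effectiveKubeContext picks the context a run should target.
// Assumed precedence: CLI flag, then deploy.kubeContext from skaffold.yaml,
// then the kubectl current context. Hypothetical helper.
func effectiveKubeContext(cliFlag, deployKubeContext, kubectlCurrent string) string {
	if cliFlag != "" {
		return cliFlag
	}
	if deployKubeContext != "" {
		return deployKubeContext
	}
	return kubectlCurrent
}

// kindClusterName extracts the cluster name from a context named
// "kind-<cluster>", the naming kind uses for the contexts it creates.
func kindClusterName(kubeContext string) (string, bool) {
	const prefix = "kind-"
	if !strings.HasPrefix(kubeContext, prefix) {
		return "", false
	}
	name := strings.TrimPrefix(kubeContext, prefix)
	return name, name != ""
}

func main() {
	// kubectl points at another cluster, skaffold.yaml targets the kind one.
	kubectlCurrent := "some-remote-cluster"
	effective := effectiveKubeContext("", "kind-project", kubectlCurrent)

	name, isKind := kindClusterName(effective)
	fmt.Println(effective, name, isKind) // kind-project project true

	// Deciding from the kubectl current context instead would skip the load.
	_, isKindByCurrent := kindClusterName(kubectlCurrent)
	fmt.Println(isKindByCurrent) // false
}
```

If the load decision used the second check rather than the first, the load step would be skipped exactly when the two contexts differ, which matches what is described here.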
That sounds like you're hitting #4428. |
Could you include your full debug logs please @taisph? Does it make a difference if you remove the |
There's a lot of potentially sensitive information in those debug logs so I'll have to figure out how to anonymize them first before adding it here in full. Removing the Anyway, I just discovered that if I keep running I've added log excerpts below that were created by a cycle of adding a space to a string in the code to ensure a fresh build (except for the cached one) and then running The line below seems to appear in all the runs that do not load the image.
Run with fresh build that didn't load into cluster:
Run with fresh build that did load:
Run with cached build that did load:
|
Still an issue with skaffold v1.19.0. |
Reproduction notes
skaffold config:
|
@taisph i am not able to reproduce this for v1.20.0. |
ok, i was finally able to reproduce this bug.
This happens randomly on any dev iteration. Not sure why. I had previously run |
@tejal29 I don't even see the "Loading images into kind cluster nodes..." message here and for me it just skips directly to deployment most of the time when kubectl is pointing at a different context than skaffold is configured to use. As soon as I switch kubectl context to match skaffold, it performs the loading images step seemingly every time. This goes for skaffold v1.20.0 as well. I've looked into the code and as far as I can tell, the issue for me is that the |
Still doesn't load images into the kind cluster and now port forwarding seems to have issues as well.
Switching the kubectl context back to the local cluster makes skaffold work as expected. Explicitly using |
It seems I hit the same exact issue, see here, for example: https://stackoverflow.com/questions/68414470/skaffold-miss-configuration-or-how-to-set-up-a-simple-helm-example No matter what I do:
things do not work. |
I was able to reproduce the issue locally and work out a fix. I'd be happy for feedback on whether I've chosen the right place to fix it.
When determining which images to sideload (for kind and k3d), we compare `localImages` and `deployerImages`. The former from `skaffold.yaml`, and the latter from Kubernetes manifest files.
The image names from Kubernetes manifests are sanitized in `pkg/skaffold/kubernetes/manifests/images.go#L51` (and `L113`) in the call to docker.ParseReference. The same doesn't happen to image names from `skaffold.yaml`.
This change sanitizes these image names just for determining whether to sideload the images. In other parts of the code we look up image pipelines from `skaffold.yaml` using the image name, so I was hesitant to change how `localImages` is used (with 'raw' image names).
The hypothesis from the previous commit is disproven, so I'm adding back the `sha256` tag policy in the custom builder example. To make the test case easier to identify from the build logs, I renamed the pod in the custom builder example.
New hypothesis: Could this be related to the issues some users are reporting with images not being sideloaded when using Helm? E.g., GoogleContainerTools#5159
…ple (#6286)
* Restore import path with uppercase characters
I wasn't able to reproduce the issue described in #5492 and #5505, so this change restores the import path with the uppercase characters and the `ko://` prefix in the custom build example. I had to tweak some values to get the integration tests to run, because I don't have access to the `k8s-skaffold` project. Let's see if the build passes.
Additional minor changes:
- Bump the ko version in the custom build example.
- Add the full path to the ko binary in the custom build script, in case `$GOPATH/bin` is not on the `PATH`.
Once we move to Go 1.16 for our builds, we can use the `go install` command to install ko in the custom build shell script.
* Look for ko binary in GOPATH/bin
It's difficult to know what's on the `PATH` in different environments. This change to the custom builder example looks for the ko binary in the `GOPATH/bin` directory.
* Remove tagPolicy from custom builder example
Hypothesis: `tagPolicy: sha256` doesn't behave correctly with images sideloaded into kind and k3d.
Also fix conditional in custom build example shell script to prevent recompiling ko each time.
* Sanitize image names before deciding what to load
I was able to reproduce the issue locally and work out a fix. I'd be happy for feedback on whether I've chosen the right place to fix it.
When determining which images to sideload (for kind and k3d), we compare `localImages` and `deployerImages`. The former from `skaffold.yaml`, and the latter from Kubernetes manifest files.
The image names from Kubernetes manifests are sanitized in `pkg/skaffold/kubernetes/manifests/images.go#L51` (and `L113`) in the call to docker.ParseReference. The same doesn't happen to image names from `skaffold.yaml`.
This change sanitizes these image names just for determining whether to sideload the images. In other parts of the code we look up image pipelines from `skaffold.yaml` using the image name, so I was hesitant to change how `localImages` is used (with 'raw' image names).
The hypothesis from the previous commit is disproven, so I'm adding back the `sha256` tag policy in the custom builder example. To make the test case easier to identify from the build logs, I renamed the pod in the custom builder example.
New hypothesis: Could this be related to the issues some users are reporting with images not being sideloaded when using Helm? E.g., #5159
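The idea in the commit message, comparing `localImages` and `deployerImages` only after both have gone through the same sanitization while keeping the raw names for everything else, can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: `sanitizeImageName` is a hypothetical stand-in for the normalization done via docker.ParseReference, and the concrete rules used here (dropping a `ko://` prefix and lowercasing) are guesses chosen to match the uppercase import path case mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeImageName is a hypothetical stand-in for the normalization applied
// to image names from manifests. The rules below are assumptions made for
// this sketch only.
func sanitizeImageName(name string) string {
	name = strings.TrimPrefix(name, "ko://")
	return strings.ToLower(name)
}

// imagesToSideload returns the local images that the deployer references.
// Names are sanitized only for the comparison; the returned values are the
// raw names from skaffold.yaml so other lookups keep working.
func imagesToSideload(localImages, deployerImages []string) []string {
	deployed := make(map[string]bool, len(deployerImages))
	for _, img := range deployerImages {
		deployed[sanitizeImageName(img)] = true
	}

	var toLoad []string
	for _, img := range localImages {
		if deployed[sanitizeImageName(img)] {
			toLoad = append(toLoad, img)
		}
	}
	return toLoad
}

func main() {
	local := []string{"ko://github.com/Example/App", "unrelated"}
	deployer := []string{"github.com/example/app"}

	fmt.Println(imagesToSideload(local, deployer)) // [ko://github.com/Example/App]
}
```

Without the sanitization step on the `skaffold.yaml` side, the raw and sanitized names in this example would never be equal, and nothing would be selected for sideloading.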
Same with k3d cluster, images are not being imported into cluster
And afterwards the image cannot be loaded within the cluster (162ns also looks too optimistic). Loading the relevant tag by hand via |
Still seeing the same on skaffold v2, kind 0.16.0, k8s 1.25.2. Tested in both Windows 11 and inside WSL2/Ubuntu 22.04.
skaffold dev
# ...
Loading images into kind cluster nodes...
Images loaded in 74ns
But running
$ docker exec kind-control-plane crictl images
IMAGE TAG IMAGE ID SIZE
docker.io/kindest/kindnetd v20220726-ed811e41 d921cee849482 25.8MB
docker.io/kindest/local-path-helper v20220607-9a4d8d2a d2f902e939cc3 2.86MB
docker.io/kindest/local-path-provisioner v0.0.22-kind.0 4c1e997385b8f 17.4MB
registry.k8s.io/coredns/coredns v1.9.3 5185b96f0becf 14.8MB
registry.k8s.io/etcd 3.5.4-0 a8a176a5d5d69 102MB
registry.k8s.io/kube-apiserver v1.25.2 9eebd178240fb 76.5MB
registry.k8s.io/kube-controller-manager v1.25.2 d846cf6e13f87 64.5MB
registry.k8s.io/kube-proxy v1.25.2 817f51628b39c 63.3MB
registry.k8s.io/kube-scheduler v1.25.2 bbff39abe40b4 51.9MB
registry.k8s.io/pause 3.7 221177c6082a8 311kB |
This completely stopped working with skaffold v2 as it seems. |
I've just had a look at the source code, trying to add some Printf debugging with my non-existent Go skills. I believe the issue may be in https://github.com/GoogleContainerTools/skaffold/blob/v2.0.0/pkg/skaffold/deploy/helm/helm.go#L240. It always passes I admittedly don't entirely understand the difference between |
@chgl can confirm that images are not loaded into a k3d cluster when using Helm to deploy. |
I've also taken a look at the code path used with kustomize/kubectl, and the following are my findings.
First, the description in https://github.com/GoogleContainerTools/skaffold/blob/v2.0.1/pkg/skaffold/deploy/kubectl/kubectl.go#L68-L69 seems to be wrong. I guess it should be the other way around? Still, I don't know what "originalImages" is for. If the images are supposed to be parsed out of the manifests and handed to the LoadImages function, I can't see where that happens. It's never set, and originalImages is always an empty slice. Therefore the
Going further down the road and fixing that code (read: removing the checks for "deployerImages", which is currently dead/unusable code in this case), we're thrown out because we're passing a
Fixing those re-enables image deployment when using kustomize and kind. I'm not sure how this whole image selection was meant to work in the beginning, so unfortunately I cannot work on a pull request to fix it. |
Just learned that at least part of this issue is the same as #7992 . There's also a MR linked that looks like it's solving the issue. |
Should be fixed now by #8007, so I'm going to close this issue. Please feel free to re-open it if you see the same issue again after v2.0.2. |
Expected behavior
skaffold dev loads built images into local kind cluster before deploying.
Using skaffold 1.17.0:
Actual behavior
skaffold dev skips the load.
Using skaffold 1.17.2:
Information
Tested with empty skaffold config and kind-disable-load explicitly set to false.