Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImagePullPolicy should be configurable for every container #14698

Closed
l0rd opened this issue Sep 27, 2019 · 18 comments
Closed

ImagePullPolicy should be configurable for every container #14698

l0rd opened this issue Sep 27, 2019 · 18 comments
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/enhancement A feature request - must adhere to the feature request template. severity/P1 Has a major impact to usage or development of the system.
Milestone

Comments

@l0rd
Copy link
Contributor

l0rd commented Sep 27, 2019

Is your enhancement related to a problem? Please describe.

When deploying Che some images pull policies can be set easily (i.e. for che-server) but others are not and it's imagePullPolicy: Always no matter what.

That makes it impossible to start Che or even a workspace when offline. Even if all the images are all available on the local disk.

Describe the solution you'd like

JWT proxy image pull policy should be configurable (in che.properties, helm chart and operator)

With the Helm chart it should be possible (but it's currently not) to:

With the Operator it should be possible (but it's currently not) to:

@l0rd l0rd added the kind/enhancement A feature request - must adhere to the feature request template. label Sep 27, 2019
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Sep 27, 2019
@tolusha
Copy link
Contributor

tolusha commented Sep 28, 2019

@l0rd
How important offline installation is?

@tolusha
Copy link
Contributor

tolusha commented Sep 28, 2019

Since it is related to installation mechanism I mark this issue as P1

@tolusha tolusha added severity/P1 Has a major impact to usage or development of the system. team/platform area/install Issues related to installation, including offline/air gap and initial setup and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. area/install Issues related to installation, including offline/air gap and initial setup labels Sep 28, 2019
@davidfestal
Copy link
Contributor

Let me provide information about the status of this on the Operator side:

With the Operator it should be possible (but it's currently not) to:

* Configure operator pull policy

This one is a bit special, since its overriding depends on the method of the Che operator installation:

  • from OperatorHub: In the OLM CSVs, pull policy is set to IfNotPresent for stable channel (releases), and to Always for the nightly channel.
  • from chectl: By default the Always pull policy is used since by default chectl installs the nightly operator. But I assume we could create an issue to allow overriding it in the chectl CLI
  • from any deploy.sh script: again, the pull policy could easily be overridden when applying the operator deployment
* Configure che-server pull policy
* Configure plugin registry pull policy
* Configure devfile registry pull policy
* Configure postgres pull policy
* Configure keycloak pull policy

Already available as fields in the custom resource.

* Configure plugin broker pull policy
* Configure sidecars pull policy

This is already possible by setting the appropriate Che server environment variable:

@l0rd
Copy link
Contributor Author

l0rd commented Sep 30, 2019

Thanks @davidfestal I have updated the description

@l0rd l0rd added this to the Backlog - Platform milestone Oct 10, 2019
@svkr2k
Copy link

svkr2k commented Nov 28, 2019

It is mentioned that:

Already available as fields in the custom resource.
* Configure plugin broker pull policy
* Configure sidecars pull policy

I would like to set image pull policy for cheEditor plugin (theia ide) and another chePlugin type of plugin. Please show sample code/configuration.

@sleshchenko
Copy link
Member

@svkr2k I did not found where it's declared in CR model, but you should be able to configure these properties via customCheProperties. So, edit your Che Cluster CR

spec:
  server:
    customCheProperties:
      CHE_WORKSPACE_SIDECAR_IMAGE__PULL__POLICY: IfNotPresent
      CHE_WORKSPACE_PLUGIN__BROKER_PULL__POLICY: IfNotPresent

I would like to set image pull policy for cheEditor plugin (theia ide) and another chePlugin type of plugin

⚠️ But note that it overrides pull policy for all plugins and editors.

@davidwindell
Copy link
Contributor

How do we define IfNotPresent for Helm deployments for the jwt-proxy and plugin brokers? It would really speed up our workspace start times:

Screenshot from 2020-01-08 10-01-22

@davidwindell
Copy link
Contributor

davidwindell commented Feb 6, 2020

I think the images below are the main ones slowing WS startup for us now, they are set to pull always and there is no option to configure:

che-plugin-metadata-broker
jwt-proxy
che-machine-exec
che-theia-endpoint-runtime-binary

If these were able to be set to IfNotPresent by default, I think our workspaces would start in under 10 seconds :)

@tolusha tolusha added area/install Issues related to installation, including offline/air gap and initial setup team/deploy and removed team/platform labels Feb 28, 2020
@amisevsk
Copy link
Contributor

amisevsk commented Mar 4, 2020

@davidwindell Sorry to pop in after a month, I agree that we should be able to set PullPolicy for every container, but wanted to share some background on how this affects workspace startup times.

The messages you are seeing do not necessarily mean that the entire image is being pulled every time, and it's due to some confusing semantics around pull policies.

A pull policy of Always means "always check that the image is available in one of the registries I can read". This means pulling the image from the registry when it is not present, of course, but for images already downloaded, it behaves like docker pull does on a local machine -- it gets the layer digests and compares them against what it has locally and if they match, downloads nothing. One reason this is sometimes done is to avoid users accessing images they do not have public access to, as without the registry check, you could ostensibly start a container using an image from my private registry if you know its tag and it is present on the node.

This means that IfNotPresent doesn't have as much of an effect as we might like. If the image is already downloaded, it only removes the need to check the layers against the registry, which is a relatively fast operation. If the image is not already downloaded, then the two policies behave the same.

I did a fairly deep-dive into this last year, but I can't find the issue where I shared some graphs on how pulling affects workspace start time. The long and short of is is that if images are already cached on the node where the workspace is running, we save at most 3-5 seconds per workspace start by setting IfNotPresent. This is also why we're able to get ~35 second workspace start times on che.openshift.io at the moment, despite pull policy being forced to Always for the security reason described above.

We have a utility you might be interested in taking a look at, that we originally wrote to support pre-pulling workspace images on che.openshift.io: https://github.com/che-incubator/kubernetes-image-puller. It basically creates a daemonset on the cluster that keeps the main large images in a workspace running so that they are always present on every node.

@metlos
Copy link
Contributor

metlos commented Mar 4, 2020

While I agree with everything @amisevsk said, there is a property che.workspace.sidecar.image_pull_policy, settable as an env var CHE_WORKSPACE_SIDECAR_IMAGE__PULL__POLICY that you can use to change the default image pull policy of the workspace sidecars (this is currently ignored by jwtproxy but there is a PR open to fix that already).

Heh, I should read ALL the previous comments before I post something :) My point was made others already :)

@skabashnyuk
Copy link
Contributor

@l0rd I've just discussed this topic again with @tolusha .
We were agreed that the platform team can ensure that all ImagePullPolicy of all components of the workspace can be configured by sidecars pull policy or plugin broker pull policy other parts of this issue would be covered by deploy team.
I have in mind to check: init containers, k8s components of devfile.

@tolusha tolusha mentioned this issue Mar 5, 2020
45 tasks
@davidwindell
Copy link
Contributor

@skabashnyuk don't forget the jwt-proxy in that list :)

@tolusha tolusha added this to the Backlog - Deploy milestone Mar 20, 2020
@skabashnyuk
Copy link
Contributor

Default confiruation

2020-03-23 16:03:14,747[nio-8080-exec-8]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 479]   - Starting workspace 'skabashn/wksp-n3ej' with id 'workspacerd4p6ieyk7s1yatl' by user 'skabashn'
2020-03-23 16:03:14,990[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=mkdir-workspacerd4p6ieyk7s1yatl policy=IfNotPresent
2020-03-23 16:03:29,192[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=che-plugin-metadata-broker-v3-1-1 policy=Always
2020-03-23 16:03:40,652[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=mkdir-workspacerd4p6ieyk7s1yatl policy=IfNotPresent
2020-03-23 16:03:50,545[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=che-jwtproxyi1e8s4q2 policy=Always
2020-03-23 16:03:50,545[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=maven policy=null
2020-03-23 16:03:50,545[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-xmlkfx policy=Always
2020-03-23 16:03:50,546[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-apache-camelst2 policy=Always
2020-03-23 16:03:50,546[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-javaqcn policy=Always
2020-03-23 16:03:50,546[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=che-machine-exec9ks policy=Always
2020-03-23 16:03:50,547[aceSharedPool-2]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=theia-idets9 policy=Always
2020-03-23 16:04:56,254[aceSharedPool-2]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 925]   - Workspace 'skabashn:wksp-n3ej' with id 'workspacerd4p6ieyk7s1yatl' started by user 'skabashn'

Configuration with:

      CHE_WORKSPACE_PLUGIN__BROKER_PULL__POLICY: Never
      CHE_WORKSPACE_SIDECAR_IMAGE__PULL__POLICY: Never
2020-03-23 16:08:48,164[nio-8080-exec-6]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 479]   - Starting workspace 'skabashn/wksp-n3ej' with id 'workspacerd4p6ieyk7s1yatl' by user 'skabashn'
2020-03-23 16:08:48,883[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=mkdir-workspacerd4p6ieyk7s1yatl policy=IfNotPresent
2020-03-23 16:08:58,662[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=che-plugin-metadata-broker-v3-1-1 policy=Never
2020-03-23 16:09:10,537[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 251]  - >>>Container=mkdir-workspacerd4p6ieyk7s1yatl policy=IfNotPresent
2020-03-23 16:09:20,434[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=che-jwtproxy636kyhqj policy=Never
2020-03-23 16:09:20,439[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=maven policy=null
2020-03-23 16:09:20,439[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-xmln8v policy=Never
2020-03-23 16:09:20,441[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-apache-camelivp policy=Never
2020-03-23 16:09:20,442[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=vscode-javafke policy=Never
2020-03-23 16:09:20,443[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=che-machine-execj01 policy=Never
2020-03-23 16:09:20,445[aceSharedPool-1]  [INFO ] [.w.i.k.n.KubernetesDeployments 208]  - >><<<<<<Container=theia-ide9pw policy=Never
2020-03-23 16:09:55,939[aceSharedPool-1]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 925]   - Workspace 'skabashn:wksp-n3ej' with id 'workspacerd4p6ieyk7s1yatl' started by user 'skabashn'

I see two interesting parts.

  • mkdir- containers have IfNotPresent policy always.
  • some containers have null aka default policy

@tolusha
Copy link
Contributor

tolusha commented Mar 24, 2020

Is there any way to fix for mkdir and stack containers?

@skabashnyuk
Copy link
Contributor

Is there any way to fix for mkdir and stack containers?

I'm thinking about that

@davidwindell
Copy link
Contributor

@tolusha so now you've closed this, can you confirm the ImagePullPolicy can be specified for the jwt proxy, etc?

che-plugin-metadata-broker
jwt-proxy
che-machine-exec
che-theia-endpoint-runtime-binary

@tolusha
Copy link
Contributor

tolusha commented Apr 21, 2020

@davidwindell
I personally haven't tested yet. But from what have been discussed with @skabashnyuk , so yes.

spec:
  server:
    customCheProperties:
      CHE_WORKSPACE_PLUGIN__BROKER_PULL__POLICY:
      CHE_WORKSPACE_SIDECAR_IMAGE__PULL__POLICY:
      CHE_INFRA_KUBERNETES_PVC_JOBS_IMAGE_PULL__POLICY:

@davidwindell
Copy link
Contributor

Awesome, thanks @tolusha - this works and workspace startup time is so much faster with IfNotPresent now, ~10 seconds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/enhancement A feature request - must adhere to the feature request template. severity/P1 Has a major impact to usage or development of the system.
Projects
None yet
Development

No branches or pull requests

10 participants