ECK: Init container - install-plugins CrashLoopBackoff #55443

Closed
rgarcia89 opened this issue Apr 20, 2020 · 10 comments
Labels
:Core/Infra/Plugins Plugin API and infrastructure Team:Core/Infra Meta label for core/infra team

Comments

@rgarcia89

Elasticsearch version (bin/elasticsearch --version): 7.6.2

Plugins installed: [repository-azure, repository-s3]

JVM version (java -version):

OS version (uname -a if on a Unix-like system): CentOS 7.7 - Linux 5-21-282-887-1-2338c741 5.6.2-1.el7.elrepo.x86_64 #1 SMP Thu Apr 2 10:55:54 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
On our DEV cluster we have an issue with the "install-plugins" init container. It appears when resuming the cluster: because the DEV cluster runs in the cloud, we suspend the nodes between 9pm and 6am. Since we added the init container to the ECK deployment, the Elasticsearch pod gets stuck in Init:CrashLoopBackOff after the nodes resume. The pod description shows the following:

Name:         elastic-es-master-2
Namespace:    elastic-system
Priority:     0
Node:         5-21-282-888-1-2338c778/10.17.34.181
Start Time:   Fri, 17 Apr 2020 12:18:13 +0200
Labels:       common.k8s.elastic.co/type=elasticsearch
              controller-revision-hash=elastic-es-master-6756b9949
              elasticsearch.k8s.elastic.co/cluster-name=elastic
              elasticsearch.k8s.elastic.co/config-hash=1748022618
              elasticsearch.k8s.elastic.co/http-scheme=https
              elasticsearch.k8s.elastic.co/node-data=false
              elasticsearch.k8s.elastic.co/node-ingest=false
              elasticsearch.k8s.elastic.co/node-master=true
              elasticsearch.k8s.elastic.co/node-ml=true
              elasticsearch.k8s.elastic.co/statefulset-name=elastic-es-master
              elasticsearch.k8s.elastic.co/version=7.6.2
              statefulset.kubernetes.io/pod-name=elastic-es-master-2
Annotations:  cni.projectcalico.org/podIP: 10.101.2.143/32
              cni.projectcalico.org/podIPs: 10.101.2.143/32
              update.k8s.elastic.co/timestamp: 2020-04-20T05:27:08.432135417Z
Status:       Running
IP:           10.101.2.143
IPs:
  IP:           10.101.2.143
Controlled By:  StatefulSet/elastic-es-master
Init Containers:
  elastic-internal-init-filesystem:
    Container ID:  docker://e872c0eb341c6fa76f8a8b89d548a856e924f9dcc1891fb90dda3bd6ac9c0e24
    Image:         docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    Image ID:      docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:59342c577e2b7082b819654d119f42514ddf47f0699c8b54dc1f0150250ce7aa
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      /mnt/elastic-internal/scripts/prepare-fs.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 20 Apr 2020 06:45:19 +0200
      Finished:     Mon, 20 Apr 2020 06:45:19 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_IP:     (v1:status.podIP)
      POD_NAME:  elastic-es-master-2 (v1:metadata.name)
      POD_IP:     (v1:status.podIP)
      POD_NAME:  elastic-es-master-2 (v1:metadata.name)
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-bin-local from elastic-internal-elasticsearch-bin-local (rw)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/elasticsearch-config-local from elastic-internal-elasticsearch-config-local (rw)
      /mnt/elastic-internal/elasticsearch-plugins-local from elastic-internal-elasticsearch-plugins-local (rw)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/transport-certificates from elastic-internal-transport-certificates (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
  install-plugins:
    Container ID:  docker://72b92f2b64ff422d2bced32af4f1e52b1b8300328b09e7a63697004361661630
    Image:         docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    Image ID:      docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:59342c577e2b7082b819654d119f42514ddf47f0699c8b54dc1f0150250ce7aa
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      bin/elasticsearch-plugin install --batch repository-azure
      bin/elasticsearch-plugin install --batch repository-s3
      
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 20 Apr 2020 07:55:37 +0200
      Finished:     Mon, 20 Apr 2020 07:55:42 +0200
    Ready:          False
    Restart Count:  18
    Environment:
      POD_IP:     (v1:status.podIP)
      POD_NAME:  elastic-es-master-2 (v1:metadata.name)
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/bin from elastic-internal-elasticsearch-bin-local (rw)
      /usr/share/elasticsearch/config from elastic-internal-elasticsearch-config-local (rw)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/config/transport-certs from elastic-internal-transport-certificates (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
      /usr/share/elasticsearch/plugins from elastic-internal-elasticsearch-plugins-local (rw)
Containers:
  elasticsearch:
    Container ID:   docker://cfc2b1dc9770f859384997259f3bfefd5e681569966df81a93766595fbfbc98b
    Image:          docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    Image ID:       docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:59342c577e2b7082b819654d119f42514ddf47f0699c8b54dc1f0150250ce7aa
    Ports:          9200/TCP, 9300/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Fri, 17 Apr 2020 12:18:39 +0200
      Finished:     Fri, 17 Apr 2020 21:02:54 +0200
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  2Gi
    Requests:
      memory:   2Gi
    Readiness:  exec [bash -c /mnt/elastic-internal/scripts/readiness-probe-script.sh] delay=10s timeout=5s period=5s #success=1 #failure=3
    Environment:
      HEADLESS_SERVICE_NAME:     elastic-es-master
      NSS_SDB_USE_CACHE:         no
      POD_IP:                     (v1:status.podIP)
      POD_NAME:                  elastic-es-master-2 (v1:metadata.name)
      PROBE_PASSWORD_PATH:       /mnt/elastic-internal/probe-user/elastic-internal-probe
      PROBE_USERNAME:            elastic-internal-probe
      READINESS_PROBE_PROTOCOL:  https
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/bin from elastic-internal-elasticsearch-bin-local (rw)
      /usr/share/elasticsearch/config from elastic-internal-elasticsearch-config-local (rw)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/config/transport-certs from elastic-internal-transport-certificates (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
      /usr/share/elasticsearch/plugins from elastic-internal-elasticsearch-plugins-local (rw)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  elasticsearch-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-data-elastic-es-master-2
    ReadOnly:   false
  downward-api:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
  elastic-internal-elasticsearch-bin-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-elasticsearch-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-es-master-es-config
    Optional:    false
  elastic-internal-elasticsearch-config-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-elasticsearch-plugins-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-http-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-es-http-certs-internal
    Optional:    false
  elastic-internal-probe-user:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-es-internal-users
    Optional:    false
  elastic-internal-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      elastic-es-scripts
    Optional:  false
  elastic-internal-transport-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-es-transport-certificates
    Optional:    false
  elastic-internal-unicast-hosts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      elastic-es-unicast-hosts
    Optional:  false
  elastic-internal-xpack-file-realm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-es-xpack-file-realm
    Optional:    false
  elasticsearch-logs:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                  From                              Message
  ----     ------   ----                 ----                              -------
  Normal   Pulled   10m (x17 over 70m)   kubelet, 5-21-282-888-1-2338c778  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.6.2" already present on machine
  Warning  BackOff  46s (x314 over 69m)  kubelet, 5-21-282-888-1-2338c778  Back-off restarting failed container

Logs of the init container - install-plugins:

[raulgs@raulgs-xm1 extras]$ klogs -f pod/elastic-es-master-2 install-plugins
-> Installing repository-azure
-> Downloading repository-azure from elastic
-> Failed installing repository-azure
-> Rolling back repository-azure
-> Rolled back repository-azure
ERROR: plugin directory [/usr/share/elasticsearch/plugins/repository-azure] already exists; if you need to update the plugin, uninstall it first using command 'remove repository-azure'
-> Installing repository-s3
-> Downloading repository-s3 from elastic
-> Failed installing repository-s3
-> Rolling back repository-s3
-> Rolled back repository-s3
ERROR: plugin directory [/usr/share/elasticsearch/plugins/repository-s3] already exists; if you need to update the plugin, uninstall it first using command 'remove repository-s3'

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.

  1. Run the install-plugins init container on a pod that already has the plugins installed.
  2. The setup follows this guide: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-init-containers-plugin-downloads.html

What I would expect:
It might be better if the install command checked whether the same version of the plugin is already installed and, if so, ended with an INFO message saying so instead of an error. Otherwise, we need to build a condition that checks whether the plugin is already installed before running the install command.
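
A minimal sketch of such a condition, assuming the plugin directory named in the error message is a reliable marker for "already installed". The helper name `ensure_plugin` and the `PLUGINS_DIR` and `PLUGIN_CLI` variables are hypothetical; their defaults are the paths from the pod description above. Note that this only checks for presence and does not compare plugin versions.

```shell
# Assumed locations, taken from the mounts in the pod description.
PLUGINS_DIR="${PLUGINS_DIR:-/usr/share/elasticsearch/plugins}"
PLUGIN_CLI="${PLUGIN_CLI:-/usr/share/elasticsearch/bin/elasticsearch-plugin}"

# ensure_plugin NAME (hypothetical helper)
# Skips the install when the plugin directory already exists,
# otherwise runs the normal batch install and returns its exit status.
ensure_plugin() {
  name="$1"
  if [ -d "$PLUGINS_DIR/$name" ]; then
    echo "INFO: plugin $name already present, skipping install"
    return 0
  fi
  "$PLUGIN_CLI" install --batch "$name"
}
```

In the init container command this would be called once per plugin, for example `ensure_plugin repository-azure && ensure_plugin repository-s3`, so a genuine install failure still fails the container.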

@sebgl sebgl transferred this issue from elastic/elasticsearch Apr 20, 2020
@sebgl

sebgl commented Apr 20, 2020

Transferred the issue to https://github.com/elastic/cloud-on-k8s.

Edit: transferred it back to the Elasticsearch repo. Sorry for the confusion. I initially misread the first post.
The ask here is to improve the elasticsearch-plugin tool. ECK is just mentioning its usage in this doc.

@sebgl sebgl changed the title from "ECK: Init container - install-plugins CrashLoopBackup" to "ECK: Init container - install-plugins CrashLoopBackoff" Apr 20, 2020
@sebgl sebgl transferred this issue from elastic/cloud-on-k8s Apr 20, 2020
@markharwood markharwood added the :Core/Infra/Plugins Plugin API and infrastructure label Apr 20, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Plugins)

@markharwood
Contributor

markharwood commented Apr 20, 2020

Maybe we should change the title - the TL;DR on this is to change "install plugins" so it doesn't error if the same plugin version is already installed.

@rgarcia89
Author

rgarcia89 commented Apr 20, 2020

I have created the following workaround for now:

        initContainers:
        - name: install-plugins
          command:
          - sh
          - -c
          - |
            /usr/share/elasticsearch/bin/elasticsearch-plugin list | grep repository-azure
            [[ $? -ne 0 ]] && /usr/share/elasticsearch/bin/elasticsearch-plugin install --batch repository-azure || true
            /usr/share/elasticsearch/bin/elasticsearch-plugin list | grep repository-s3
            [[ $? -ne 0 ]] && /usr/share/elasticsearch/bin/elasticsearch-plugin install --batch repository-s3 || true
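
A hedged variant of the same idea, for comparison. The workaround above relies on `[[ ... ]]`, which is a bash extension and not guaranteed under a plain POSIX `sh`, and its trailing `|| true` also hides a genuinely failed install. The sketch below keeps the list-and-grep check but lets install errors propagate. The helper name `install_if_missing` and the `PLUGIN_CLI` variable are hypothetical, and it assumes that `elasticsearch-plugin list` prints one plugin name per line.

```shell
# Assumed CLI location, same path as in the workaround above.
PLUGIN_CLI="${PLUGIN_CLI:-/usr/share/elasticsearch/bin/elasticsearch-plugin}"

# install_if_missing NAME (hypothetical helper)
# grep flags: -q quiet, -x match the whole line, -F treat NAME literally.
install_if_missing() {
  plugin="$1"
  if "$PLUGIN_CLI" list | grep -qxF "$plugin"; then
    echo "-> $plugin already installed, skipping"
  else
    # No "|| true" here: a failed install returns a non-zero status.
    "$PLUGIN_CLI" install --batch "$plugin"
  fi
}
```

Chaining the calls with `&&` (`install_if_missing repository-azure && install_if_missing repository-s3`) makes the init container fail only when an install really fails, not when the plugin is already there.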

@rgarcia89
Author

What is also strange is the following: when I delete a pod of the Elastic stack, the install-plugins init container behaves differently and the installation succeeds:

[raulgs@raulgs-xm1 elasticcloud]$ klogs pod/elastic-es-data-1 install-plugins
-> Installing repository-azure
-> Downloading repository-azure from elastic
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.net.SocketPermission * connect,resolve
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.
-> Installed repository-azure
-> Installing repository-s3
-> Downloading repository-s3 from elastic
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission getClassLoader
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.net.SocketPermission * connect,resolve
* java.util.PropertyPermission es.allow_insecure_settings read,write
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.
-> Installed repository-s3

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@williamrandolph
Contributor

williamrandolph commented Jan 6, 2021

This issue looks like a duplicate of #61385 (originally posted as #55443). [EDIT: pasted the wrong link earlier]

The plan for that issue is to add a flag like --ignore-existing to the elasticsearch-plugin tool so that it could gracefully skip the installation of plugins that already exist. Then, instead of needing to grep the output of elasticsearch-plugin to find out if a plugin needs to be installed, you could just run elasticsearch-plugin install --batch --ignore-existing repository-azure repository-s3, or something to that effect.

I'm going to close this issue in favor of the other one, but please let me know if this is not in fact a duplicate.

@williamrandolph williamrandolph removed the needs:triage Requires assignment of a team area label label Jan 6, 2021
@an-tex

an-tex commented Jan 7, 2021

It's actually a duplicate of #61385

@rgarcia89
Author

> It's actually a duplicate of #61385

thanks for that, I was about to ask

@williamrandolph
Contributor

@an-tex Thanks for the correction! That was the link I meant to paste, rather than saying that this issue is a duplicate of itself.

@needleshaped

While this is solved in general, the fix is not yet available for the operator (ECK); see elastic/cloud-on-k8s#5145
