This repository has been archived by the owner on May 16, 2023. It is now read-only.

[elasticsearch] Readiness probe is failing with 8.0.0-SNAPSHOT and default config #1375

Closed
jmlrt opened this issue Sep 22, 2021 · 5 comments
Labels
bug Something isn't working elasticsearch

Comments

@jmlrt
Member

jmlrt commented Sep 22, 2021

Chart version: 8.0.0-SNAPSHOT

Kubernetes version: all

Kubernetes provider: all

Helm Version: all

helm get release output:

Output of helm get release
NAME: helm-es-default
LAST DEPLOYED: Wed Sep 22 15:49:36 2021
NAMESPACE: default
STATUS: failed
REVISION: 1
USER-SUPPLIED VALUES:
null

COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: ""
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
healthNameOverride: ""
hostAliases: []
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 8.0.0-SNAPSHOT
ingress:
  annotations: {}
  enabled: false
  hosts:
  - host: chart-example.local
    paths:
    - path: /
  tls: []
initResources: {}
keystore: []
labels: {}
lifecycle: {}
masterService: ""
maxUnavailable: 1
minimumMasterNodes: 2
nameOverride: ""
networkHost: 0.0.0.0
networkPolicy:
  http:
    enabled: false
  transport:
    enabled: false
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
    - emptyDir
priorityClassName: ""
protocol: http
rbac:
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 3
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 1000m
    memory: 2Gi
roles:
- master
- data
- data_content
- data_hot
- data_warm
- data_cold
- ingest
- ml
- remote_cluster_client
- transform
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  enabled: true
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  transportPortName: transport
  type: ClusterIP
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tests:
  enabled: true
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

HOOKS:
---
# Source: elasticsearch/templates/test/test-elasticsearch-health.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "helm-es-default-ncnzd-test"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
  containers:
  - name: "helm-es-default-rxlbg-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
    imagePullPolicy: "IfNotPresent"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
  restartPolicy: Never
MANIFEST:
---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "helm-es-default"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}
spec:
  type: ClusterIP
  selector:
    release: "helm-es-default"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Helm"
    release: "helm-es-default"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "helm-es-default"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "8"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 3
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        release: "helm-es-default"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:
        
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      enableServiceLinks: true
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}

      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.0.0-SNAPSHOT"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                # Disable nss cache to avoid filling dentry cache when calling curl
                # This is required with Elasticsearch Docker using nss < 3.52
                export NSS_SDB_USE_CACHE=no

                http () {
                  local path="${1}"
                  local args="${2}"
                  set -- -XGET -s

                  if [ "$args" != "" ]; then
                    set -- "$@" $args
                  fi

                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  fi

                  curl --output /dev/null -k "$@" "http://127.0.0.1:9200${path}"
                }

                if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy'
                  HTTP_CODE=$(http "/" "-w %{http_code}")
                  RC=$?
                  if [[ ${RC} -ne 0 ]]; then
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with RC ${RC}"
                    exit ${RC}
                  fi
                  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                  if [[ ${HTTP_CODE} == "200" ]]; then
                    exit 0
                  elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
                    exit 0
                  else
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                    exit 1
                  fi

                else
                  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
                    touch ${START_FILE}
                    exit 0
                  else
                    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                    exit 1
                  fi
                fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 1000m
            memory: 2Gi
        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: "elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,"
          - name: node.roles
            value: "master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,"
          - name: discovery.seed_hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
        volumeMounts:
          - name: "elasticsearch-master"
            mountPath: /usr/share/elasticsearch/data

NOTES:
1. Watch all cluster members come up.
  $ kubectl get pods --namespace=default -l app=elasticsearch-master -w
2. Test cluster health using Helm test.
  $ helm --namespace=default test helm-es-default

Describe the bug:

When using 8.0.0-SNAPSHOT and default values (default Elasticsearch config, security not enforced), the Elasticsearch chart fails to deploy, and pods never reach the ready state because the readiness probe keeps failing:

$ kubectl describe pod elasticsearch-master-0
...
  Normal   Started                 23m                    kubelet                  Started container elasticsearch
  Warning  Unhealthy               4m21s (x114 over 23m)  kubelet                  Readiness probe failed: Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )

This seems to be related to the new behavior where Elasticsearch enables security and generates a password if security is not configured. Without getting too deep into the weeds, I think that's because the generated credentials are not passed to the readiness probe script:

- sh
- -c
- |
  #!/usr/bin/env bash -e
  # If the node is starting up wait for the cluster to be ready (request params: "{{ .Values.clusterHealthCheckParams }}" )
  # Once it has started only check that the node itself is responding
  START_FILE=/tmp/.es_start_file

  # Disable nss cache to avoid filling dentry cache when calling curl
  # This is required with Elasticsearch Docker using nss < 3.52
  export NSS_SDB_USE_CACHE=no

  http () {
    local path="${1}"
    local args="${2}"
    set -- -XGET -s

    if [ "$args" != "" ]; then
      set -- "$@" $args
    fi

    if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
      set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
    fi

    curl --output /dev/null -k "$@" "{{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}${path}"
  }

  if [ -f "${START_FILE}" ]; then
    echo 'Elasticsearch is already running, lets check the node is healthy'
    HTTP_CODE=$(http "/" "-w %{http_code}")
    RC=$?
    if [[ ${RC} -ne 0 ]]; then
      echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with RC ${RC}"
      exit ${RC}
    fi
    # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
    if [[ ${HTTP_CODE} == "200" ]]; then
      exit 0
    elif [[ ${HTTP_CODE} == "503" && "{{ include "elasticsearch.esMajorVersion" . }}" == "6" ]]; then
      exit 0
    else
      echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with HTTP code ${HTTP_CODE}"
      exit 1
    fi

  else
    echo 'Waiting for elasticsearch cluster to become ready (request params: "{{ .Values.clusterHealthCheckParams }}" )'
    if http "/_cluster/health?{{ .Values.clusterHealthCheckParams }}" "--fail" ; then
      touch ${START_FILE}
      exit 0
    else
      echo 'Cluster is not yet ready (request params: "{{ .Values.clusterHealthCheckParams }}" )'
      exit 1
    fi
  fi
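
To make the failure mode concrete, here is a minimal sketch of how the probe's http () function assembles its curl arguments. build_probe_args is a hypothetical helper written for this illustration, not part of the chart; it prints the arguments instead of calling curl, and the credential values are placeholders. With default values neither ELASTIC_USERNAME nor ELASTIC_PASSWORD is set, so no -u flag is produced and the health request goes out unauthenticated.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the argument handling of the chart's http () function.
# It prints the curl arguments, one per line, rather than running curl.
build_probe_args () {
  local args="${1}"
  set -- -XGET -s

  if [ "$args" != "" ]; then
    set -- "$@" $args
  fi

  # Credentials are only appended when both variables are non-empty.
  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
    set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
  fi

  printf '%s\n' "$@"
}

# Default values: neither variable is set, so no -u flag is produced.
unset ELASTIC_USERNAME ELASTIC_PASSWORD
no_auth="$(build_probe_args "--fail")"

# With credentials supplied (placeholder values), -u is appended.
ELASTIC_USERNAME=elastic
ELASTIC_PASSWORD=changeme
with_auth="$(build_probe_args "--fail")"
```

Since the probe passes --fail, curl exits non-zero on an HTTP error status, which is what keeps the pod in the not-ready state when the request is rejected.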

Steps to reproduce:

  1. Try to deploy the Elasticsearch chart with default values from the master branch:
     $ cd helm-charts
     $ git checkout master
     $ helm install es ./elasticsearch
  2. That's all.

Expected behavior: The readiness probe should succeed.

Provide logs and/or server output (if relevant):

Elasticsearch logs
{"@timestamp":"2021-09-22T13:51:34.244Z", "log.level": "INFO", "message":"added {{elasticsearch-master-2}{grA2us18Sfa19w_WJ1EYow}{WFDAR5wXRHa0pWecXpZfQA}{10.4.8.4}{10.4.8.4:9300}{cdhilmrstw}}, term: 1, version: 26, reason: ApplyCommitRequest{term=1, version=26, sourceNode={elasticsearch-master-1}{ZiR5zfJIQsKDVtGmUmC4pA}{0VDEJ0PNQ8mG1ErLczvJJg}{10.4.4.3}{10.4.4.3:9300}{cdhilmrstw}{ml.machine_memory=2147483648, ml.max_open_jobs=512, xpack.installed=true, ml.max_jvm_size=1073741824}}", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.service.ClusterApplierService","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}

-----------------------------------------------------------------

Password for the elastic user is: _y_G+hZWyYds8F*TjXv8

Password for the kibana_system user is: CYkFjJ61BcUlSL3_jhMH

Please note these down as they will not be shown again.


You can use 'bin/elasticsearch-reset-elastic-password' at any time
in order to reset the password for the elastic user.

You can use 'bin/elasticsearch-reset-kibana-system-password' at any time
in order to reset the password for the kibana_system user.

-----------------------------------------------------------------

{"@timestamp":"2021-09-22T13:51:37.896Z", "log.level": "INFO", "message":"updating geoip databases", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:37.896Z", "log.level": "INFO", "message":"fetching geoip databases overview from [https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:38.034Z", "log.level": "INFO", "message":"license [e0a56fa8-a289-4ea2-8083-584fef2043bd] mode [basic] - valid", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.license.LicenseService","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:38.035Z", "log.level": "INFO", "message":"license mode is [basic], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.xpack.security.authc.Realms","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:38.899Z", "log.level": "INFO", "message":"updating geoip database [GeoLite2-ASN.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:41.907Z", "log.level": "INFO", "message":"downloading geoip database [GeoLite2-ASN.mmdb] to [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-ASN.mmdb.tmp.gz]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:42.003Z", "log.level": "INFO", "message":"updated geoip database [GeoLite2-ASN.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:42.023Z", "log.level": "INFO", "message":"updating geoip database [GeoLite2-City.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:42.461Z", "log.level": "INFO", "message":"successfully reloaded changed geoip database file [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-ASN.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#1]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:46.888Z", "log.level": "INFO", "message":"downloading geoip database [GeoLite2-City.mmdb] to [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-City.mmdb.tmp.gz]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:46.910Z", "log.level": "INFO", "message":"updated geoip database [GeoLite2-City.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:46.913Z", "log.level": "INFO", "message":"updating geoip database [GeoLite2-Country.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:48.315Z", "log.level": "INFO", "message":"downloading geoip database [GeoLite2-Country.mmdb] to [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-Country.mmdb.tmp.gz]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:48.496Z", "log.level": "INFO", "message":"updated geoip database [GeoLite2-Country.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#2]","log.logger":"org.elasticsearch.ingest.geoip.GeoIpDownloader","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:48.601Z", "log.level": "INFO", "message":"successfully reloaded changed geoip database file [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-Country.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#4]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}
{"@timestamp":"2021-09-22T13:51:49.309Z", "log.level": "INFO", "message":"successfully reloaded changed geoip database file [/tmp/elasticsearch-11663057740847801878/geoip-databases/TRJIoKSARKqhW9aHwg-1Eg/GeoLite2-City.mmdb]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[elasticsearch-master-0][generic][T#1]","log.logger":"org.elasticsearch.ingest.geoip.DatabaseRegistry","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"Yopr_NEFQr6mBv8Da2yIOA","elasticsearch.node.id":"TRJIoKSARKqhW9aHwg-1Eg","elasticsearch.node.name":"elasticsearch-master-0","elasticsearch.cluster.name":"elasticsearch"}

Any additional context:

@jmlrt jmlrt added bug Something isn't working elasticsearch labels Sep 22, 2021
@jmlrt
Member Author

jmlrt commented Sep 22, 2021

cc @elastic/es-delivery

@mark-vieira

Hmm, what can we do here? I suspect we should be explicit, that is, make users choose to either enable or disable security and, if they enable it, provide the necessary credentials.

cc @jkakavas

@jkakavas
Member

Thanks for bringing this to our attention!

Hmm, what can we do here? I suspect we should be explicit, that is, make users choose to either enable or disable security and, if they enable it, provide the necessary credentials.

Yes, I agree. Auto-configuration is helpful, but it is not meant to cover use cases where some other form of orchestration is involved. My Helm knowledge is somewhat limited, but I think the way forward here would be to include a section like

extraEnvs:
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: elastic-credentials
        key: password

in all of our examples (as we do for https://github.com/elastic/helm-charts/blob/master/elasticsearch/examples/security/values.yaml), and that will be used for the readiness probe too. Setting this would also stop Elasticsearch from auto-generating a new password for the elastic user.
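
A rough sketch of how this could be wired up from the command line, assuming a Secret named elastic-credentials as in the snippet above. The write_values helper, the temporary file, and the username key in the Secret are assumptions made for this illustration. ELASTIC_USERNAME is included because the probe script quoted earlier in this thread only adds -u when both variables are set. The kubectl and helm commands are left as comments since they need a live cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a values file that injects credentials from a Secret.
# The Secret name and the "password" key come from the snippet above; the "username" key is assumed.
write_values () {
  local file="${1}"
  cat > "${file}" <<'EOF'
extraEnvs:
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: elastic-credentials
        key: password
  - name: ELASTIC_USERNAME
    valueFrom:
      secretKeyRef:
        name: elastic-credentials
        key: username
EOF
}

values_file="$(mktemp)"
write_values "${values_file}"

# With a cluster available, the Secret and the release could then be created with:
#   kubectl create secret generic elastic-credentials \
#     --from-literal=username=elastic --from-literal=password='<your password>'
#   helm install es ./elasticsearch --values "${values_file}"
```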

@jmlrt
Member Author

jmlrt commented Sep 23, 2021

Yes, I can confirm that setting the ELASTIC_USERNAME and ELASTIC_PASSWORD variables via extraEnvs values works fine and is similar to what we are already doing in the security example.

The goal of testing the default example in Jenkins is to ensure that deploying the Elasticsearch chart with the default config using helm install elasticsearch elastic/elasticsearch works without having to configure anything else. With security auto-configuration in 8.0.0-SNAPSHOT, this is no longer possible, as the credentials generated by Elasticsearch cannot be passed to Helm to be reused in the readiness probe script.

If we want to make auto-configuration compatible with the Elasticsearch chart, we should either be able to query the http://127.0.0.1:9200/_cluster/health endpoint without credentials, so that the readiness probe doesn't need to know them, or be provided with a similar endpoint to use as an unauthenticated health check. Do you think this could make sense?

Otherwise, we can just accept that the Elasticsearch chart no longer works with only the default config, document that security auto-configuration is not compatible with this chart, and state that setting the ELASTIC_USERNAME and ELASTIC_PASSWORD variables via extraEnvs values is mandatory with 8.0.0.

WDYT @jkakavas?

@jkakavas
Member

Otherwise, we can just accept that the Elasticsearch chart no longer works with only the default config, document that security auto-configuration is not compatible with this chart, and state that setting the ELASTIC_USERNAME and ELASTIC_PASSWORD variables via extraEnvs values is mandatory with 8.0.0.

I think we should document this as our preferred approach to using the Helm chart. Mind you, you don't need to set ELASTIC_USERNAME, as we can hardcode this to be elastic, and a user would only need to set ELASTIC_PASSWORD. Security auto-configuration is (or can be) compatible with the charts, but not with the readiness probe, I guess. Still, passing the ELASTIC_PASSWORD env var is established practice for Docker, so this might be less surprising to the folks who use the Helm charts.

If we want to make auto-configuration compatible with the Elasticsearch chart, we should either be able to query the http://127.0.0.1:9200/_cluster/health endpoint without credentials, so that the readiness probe doesn't need to know them, or be provided with a similar endpoint to use as an unauthenticated health check. Do you think this could make sense?

There is no unauthenticated health check endpoint for ES. If we query http://127.0.0.1:9200/_cluster/health, we'll just get a 401. One way around this is for the chart to enable anonymous access, define a file-based role that only has access to the health check action, and assign this role to the anonymous user. Then the readiness probe can succeed. The deployment will "work", but the user would have to know to check the Elasticsearch output to get their password (I assume that there is no tty attached for Elasticsearch in this use case, and we're actively working on not generating credentials in that case) or use our tooling to set/reset it to a new one.
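
The anonymous-access workaround could look roughly like the following. This is an untested sketch, not something the chart ships: it assumes the chart's esConfig value is used to provide both elasticsearch.yml and roles.yml, and the role name probe_health, the user name anonymous_probe, and the write_anonymous_values helper are made up for this example.

```shell
#!/usr/bin/env bash
# Untested sketch: writes a values file that enables anonymous access limited to cluster health.
# "probe_health" and "anonymous_probe" are hypothetical names chosen for this example.
write_anonymous_values () {
  local file="${1}"
  cat > "${file}" <<'EOF'
esConfig:
  elasticsearch.yml: |
    xpack.security.authc.anonymous.username: anonymous_probe
    xpack.security.authc.anonymous.roles: probe_health
  roles.yml: |
    probe_health:
      cluster:
        - 'cluster:monitor/health'
EOF
}

anon_values="$(mktemp)"
write_anonymous_values "${anon_values}"

# With a cluster available:
#   helm install es ./elasticsearch --values "${anon_values}"
```

One caveat: once the start file exists, the probe script shown earlier requests / rather than /_cluster/health, so a role restricted to the health action would probably not be enough on its own for that second check.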

Given the opportunity, I also want to bring elastic/elasticsearch#77231 to your attention. In that PR, we are enabling auto-configuration of TLS, so we wouldn't need the make secrets part of the security chart anymore.

jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates the Elasticsearch chart to use security by default.

- Adds a new Secret template for Elasticsearch credentials with a
  randomized password if password value isn't defined.

- Adds instructions to retrieve credentials in Elasticsearch chart
  deployment notes.

The other charts will be updated in follow-up PRs to use the proper
credentials

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates apm-server values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates filebeat values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates kibana values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates logstash values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 7, 2021
This commit updates metricbeat values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit that referenced this issue Oct 12, 2021
This commit updates the Elasticsearch chart to use security by default.

- Adds a new Secret template for Elasticsearch password with a
  randomized password if `secret.password` isn't defined.

- Adds instructions to retrieve the password in Elasticsearch chart
  deployment notes.

- Also, remove usage of `ELASTIC_USERNAME` variable because it
  doesn't seem to be supported anymore by Elasticsearch

The other charts will be updated in follow-up PRs to use the proper
credentials

Relates to #1375
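
For readers following along, retrieving a password stored in such a Secret could be sketched as below. The Secret name elasticsearch-master-credentials is an assumption (the thread does not name it), and decode_secret_value is a hypothetical helper. The demonstration uses a placeholder value so it runs without a cluster.

```shell
#!/usr/bin/env bash
# Secret data is base64-encoded; decode_secret_value is a hypothetical helper that decodes one field.
decode_secret_value () {
  printf '%s' "${1}" | base64 -d
}

# Self-contained demonstration with a placeholder value instead of a real Secret.
encoded="$(printf '%s' 'changeme' | base64)"
decoded="$(decode_secret_value "${encoded}")"

# Against a live cluster (the Secret name is an assumption; check `kubectl get secrets`):
#   kubectl get secret elasticsearch-master-credentials -o jsonpath='{.data.password}' | base64 -d
```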
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates the Elasticsearch chart to use security by default.

- Adds a new Secret template for Elasticsearch password with a
  randomized password if `secret.password` isn't defined.

- Adds instructions to retrieve the password in Elasticsearch chart
  deployment notes.

- Also, remove usage of `ELASTIC_USERNAME` variable because it
  doesn't seem to be supported anymore by Elasticsearch

The other charts will be updated in follow-up PRs to use the proper
credentials

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates apm-server values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates filebeat values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates kibana values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates logstash values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit to jmlrt/helm-charts that referenced this issue Oct 12, 2021
This commit updates metricbeat values to use the new Elasticsearch
credentials from elastic#1384.

Relates to elastic#1375
jmlrt added a commit that referenced this issue Oct 12, 2021
This commit updates apm-server values to use the new Elasticsearch
credentials from #1384.

Relates to #1375
jmlrt added a commit that referenced this issue Oct 12, 2021
This commit updates metricbeat values to use the new Elasticsearch
credentials from #1384.

Relates to #1375
jmlrt added a commit that referenced this issue Oct 12, 2021
This commit updates kibana values to use the new Elasticsearch
credentials from #1384.

Relates to #1375
jmlrt added a commit that referenced this issue Oct 13, 2021
* [filebeat] use new elasticsearch credentials

This commit updates filebeat values to use the new Elasticsearch
credentials from #1384.

Relates to #1375

* fixup! [filebeat] use new elasticsearch credentials

* fixup! fixup! [filebeat] use new elasticsearch credentials
jmlrt added a commit that referenced this issue Oct 13, 2021
* [logstash] use new elasticsearch credentials

This commit updates logstash values to use the new Elasticsearch
credentials from #1384.

Relates to #1375

* fixup! [logstash] use new elasticsearch credentials
galina-tochilkin pushed a commit to mtp-devops/3d-party-helm that referenced this issue Dec 20, 2022
This commit updates the Elasticsearch chart to use security by default.

- Adds a new Secret template for Elasticsearch password with a
  randomized password if `secret.password` isn't defined.

- Adds instructions to retrieve the password in Elasticsearch chart
  deployment notes.

- Also, remove usage of `ELASTIC_USERNAME` variable because it
  doesn't seem to be supported anymore by Elasticsearch

The other charts will be updated in follow-up PRs to use the proper
credentials

Relates to elastic/helm-charts#1375