Skip to content
This repository has been archived by the owner on May 16, 2023. It is now read-only.

Commit

Permalink
[elasticsearch] Fix issue with readinessProbe causing outages (#638)
Browse files Browse the repository at this point in the history
  • Loading branch information
Gavin Williams authored May 28, 2020
1 parent 17ab273 commit ca1a13c
Showing 2 changed files with 27 additions and 10 deletions.
7 changes: 7 additions & 0 deletions BREAKING_CHANGES.md
Original file line number Diff line number Diff line change
@@ -4,6 +4,7 @@


- [7.7.0 - 2020/05/13](#770---20200513)
- [Known Issues](#known-issues)
- [GA support](#ga-support)
- [New branching model](#new-branching-model)
- [Filebeat container inputs](#filebeat-container-inputs)
@@ -26,6 +27,11 @@

## 7.7.0 - 2020/05/13

### Known Issues

Elasticsearch nodes could be restarted too quickly during an upgrade or rolling restart, potentially resulting in service disruption.
This is due to a bug introduced by the changes to the Elasticsearch `readinessProbe` in [#586][].

### GA support

Elasticsearch, Kibana, Filebeat and Metricbeat are moving from beta to GA and
@@ -163,6 +169,7 @@ volumeClaimTemplate:
[#540]: https://github.com/elastic/helm-charts/pull/540
[#568]: https://github.com/elastic/helm-charts/pull/568
[#572]: https://github.com/elastic/helm-charts/pull/572
[#586]: https://github.com/elastic/helm-charts/pull/586
[#621]: https://github.com/elastic/helm-charts/pull/621
[container input]: https://www.elastic.co/guide/en/beats/filebeat/7.7/filebeat-input-container.html
[docker input]: https://www.elastic.co/guide/en/beats/filebeat/7.7/filebeat-input-docker.html
30 changes: 20 additions & 10 deletions elasticsearch/templates/statefulset.yaml
Original file line number Diff line number Diff line change
@@ -213,22 +213,32 @@ spec:
- -c
- |
#!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be ready (request params: '{{ .Values.clusterHealthCheckParams }}' )
# If the node is starting up wait for the cluster to be ready (request params: "{{ .Values.clusterHealthCheckParams }}" )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
BASIC_AUTH=''
fi
http () {
local path="${1}"
local args="${2}"
set -- -XGET -s
if [ "$args" != "" ]; then
set -- "$@" $args
fi
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
fi
curl --output /dev/null -k "$@" "{{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}${path}"
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
HTTP_CODE=$(curl -XGET -s -k ${BASIC_AUTH} -o /dev/null -w '%{http_code}' {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/)
HTTP_CODE=$(http "/" "-w %{http_code}")
RC=$?
if [[ ${RC} -ne 0 ]]; then
echo "curl -XGET -s -k \${BASIC_AUTH} -o /dev/null -w '%{http_code}' {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with RC ${RC}"
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with RC ${RC}"
exit ${RC}
fi
# ready if HTTP code 200, 503 is tolerable if ES version is 6.x
@@ -237,13 +247,13 @@ spec:
elif [[ ${HTTP_CODE} == "503" && "{{ include "elasticsearch.esMajorVersion" . }}" == "6" ]]; then
exit 0
else
echo "curl -XGET -s -k \${BASIC_AUTH} -o /dev/null -w '%{http_code}' {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with HTTP code ${HTTP_CODE}"
echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/ failed with HTTP code ${HTTP_CODE}"
exit 1
fi
else
echo 'Waiting for elasticsearch cluster to become ready (request params: "{{ .Values.clusterHealthCheckParams }}" )'
if curl -XGET -s -k --fail ${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}/_cluster/health?{{ .Values.clusterHealthCheckParams }} ; then
if http "/_cluster/health?{{ .Values.clusterHealthCheckParams }}" "--fail" ; then
touch ${START_FILE}
exit 0
else

0 comments on commit ca1a13c

Please sign in to comment.