Skip to content
This repository has been archived by the owner on May 16, 2023. It is now read-only.

[elasticsearch] make service configurable #123

Merged
merged 12 commits into from
Jun 12, 2019

Conversation

kimxogus
Copy link
Contributor

@kimxogus kimxogus commented May 3, 2019

  • Chart version not bumped (the versions are all bumped and released at the same time)
  • README.md updated with any new values or changes
  • Updated template tests in ${CHART}/tests/*.py
  • Updated integration tests in ${CHART}/examples/*/test/goss.yaml

In case of migration, master node name's prefix can be different. As we are already getting master node name only, I think checking if string is empty is enough

@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@@ -26,11 +34,10 @@ metadata:
release: {{ .Release.Name | quote }}
chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
app: "{{ template "uname" . }}"
annotations:
# Create endpoints also if the related pod isn't ready
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
Copy link
Contributor Author

@kimxogus kimxogus May 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This annotation is deprecated kubernetes/kubernetes#63742

@kimxogus
Copy link
Contributor Author

kimxogus commented May 3, 2019

also added some service values

@kimxogus kimxogus force-pushed the elasticsearch/fix-master-check branch from 7fda166 to 8fcbca9 Compare May 3, 2019 19:57
@kimxogus kimxogus changed the title check only if master node string is empty check only if master node string is empty and service improvements May 3, 2019
@@ -242,7 +242,7 @@ spec:
cleanup () {
while true ; do
local master="$(http "/_cat/master?h=node")"
if [[ $master == "{{ template "uname" . }}"* && $master != "${NODE_NAME}" ]]; then
if [[ $master && $master != "${NODE_NAME}" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this? It just seems less safe than what was already in there. Yes it will fail if there is an empty string, however if the API returns anything else (or some kind of weird error) then it is going to be a non-empty string and this will pass.

Did you find an issue with the current logic that this fixes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I migrated es cluster from a cluster(all nodes have master, ingest and data role) to this chart. Previous master was not prefixed by uname, so I had to wait several minutes during rollout caused by configuration change when nodes from both old and new cluster exists before exclude old nodes from cluster shard allocation.
I think this is unnecessary because --fail option is passed to curl(https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L239), script will exit if an error is returned from API($master won't have a value).
As h=node query param is passed, $master string will only contain node name of master node if it's not empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was my explanation not enough? Then I can restore this because this only happens in specific migration case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unnecessary because --fail option is passed to curl(https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L239), script will exit if an error is returned from API($master won't have a value).

Thanks for noticing this. This combination of set -e and curl --fail will mean that this script might exit early if the call fails. Ideally we want this loop to keep on running until we can see that a new master exists that isn't the current pod. Exiting early is not what we want to happen here.

My concern is whether or not the API will return an error when there is no master or not. If the API returns a weird message like "no master yet" with a 200 then it's possible for the script to exit too early. Looking at the code or Elasticsearch it looks like it will actually return a dash (-) if there is no master found.

Then I can restore this because this only happens in specific migration case.

Did it just mean that Kubernetes waited for the 120 second timeout while stopping each of the masters?

I think what actually makes a lot more sense is to check that it starts with {{ template "masterService" . }}. That way this will always point to the prefix for the right master even during migrations like in https://github.com/elastic/helm-charts/blob/master/elasticsearch/examples/migration/README.md

Was my explanation not enough?

Sorry for the slow reply. I'm currently on a work trip and only have a few quiet moments to sneak some work in. I'll be at Kubecon in Barcelona next week too so things will be slow for a little while. If you are at Kubecon come and say hi at the Elastic booth :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it just mean that Kubernetes waited for the 120 second timeout while stopping each of the masters?

Yes, I had to wait 120 seconds per each master node.

I think what actually makes a lot more sense is to check that it starts with {{ template "masterService" . }}. That way this will always point to the prefix for the right master even during migrations like in /elasticsearch/examples/migration/README.md@master

Changed to {{ template "masterService" . }}!

I'm currently on a work trip and only have a few quiet moments to sneak some work in. I'll be at Kubecon in Barcelona next week too so things will be slow for a little while. If you are at Kubecon come and say hi at the Elastic booth :)

I'm not going to this Kubecon, but I hope to visit and say hi someday :)

elasticsearch/templates/service.yaml Show resolved Hide resolved
elasticsearch/README.md Outdated Show resolved Hide resolved
@Crazybus
Copy link
Contributor

Crazybus commented May 7, 2019

jenkins test this please

@@ -242,7 +242,7 @@ spec:
cleanup () {
while true ; do
local master="$(http "/_cat/master?h=node")"
if [[ $master == "{{ template "uname" . }}"* && $master != "${NODE_NAME}" ]]; then
if [[ $master && $master != "${NODE_NAME}" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unnecessary because --fail option is passed to curl(https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L239), script will exit if an error is returned from API($master won't have a value).

Thanks for noticing this. This combination of set -e and curl --fail will mean that this script might exit early if the call fails. Ideally we want this loop to keep on running until we can see that a new master exists that isn't the current pod. Exiting early is not what we want to happen here.

My concern is whether or not the API will return an error when there is no master or not. If the API returns a weird message like "no master yet" with a 200 then it's possible for the script to exit too early. Looking at the code or Elasticsearch it looks like it will actually return a dash (-) if there is no master found.

Then I can restore this because this only happens in specific migration case.

Did it just mean that Kubernetes waited for the 120 second timeout while stopping each of the masters?

I think what actually makes a lot more sense is to check that it starts with {{ template "masterService" . }}. That way this will always point to the prefix for the right master even during migrations like in https://github.com/elastic/helm-charts/blob/master/elasticsearch/examples/migration/README.md

Was my explanation not enough?

Sorry for the slow reply. I'm currently on a work trip and only have a few quiet moments to sneak some work in. I'll be at Kubecon in Barcelona next week too so things will be slow for a little while. If you are at Kubecon come and say hi at the Elastic booth :)

@kimxogus kimxogus changed the title check only if master node string is empty and service improvements [elasticsearch] make service improvements May 20, 2019
@kimxogus kimxogus changed the title [elasticsearch] make service improvements [elasticsearch] make service configurable May 20, 2019
Copy link
Contributor

@Crazybus Crazybus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the table formatting this LGTM!

Thanks again for another great contribution :)

elasticsearch/README.md Outdated Show resolved Hide resolved
@Crazybus
Copy link
Contributor

jenkins test this please

Copy link
Contributor

@Crazybus Crazybus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Crazybus
Copy link
Contributor

jenkins test this please

@Crazybus Crazybus merged commit 3c7a09c into elastic:master Jun 12, 2019
@Crazybus
Copy link
Contributor

Thank you for another great contribution!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants