[elasticsearch] make service configurable #123
Since this is a community-submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
@@ -26,11 +34,10 @@ metadata:
release: {{ .Release.Name | quote }}
chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
app: "{{ template "uname" . }}"
annotations:
# Create endpoints also if the related pod isn't ready
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
This annotation is deprecated; see kubernetes/kubernetes#63742.
Also added some service values.
7fda166 to 8fcbca9
@@ -242,7 +242,7 @@ spec:
 cleanup () {
 while true ; do
 local master="$(http "/_cat/master?h=node")"
-if [[ $master == "{{ template "uname" . }}"* && $master != "${NODE_NAME}" ]]; then
+if [[ $master && $master != "${NODE_NAME}" ]]; then
Why change this? It seems less safe than what was already there. Yes, it will fail on an empty string; however, if the API returns anything else (for example, some kind of unexpected error message), that is a non-empty string and the check will pass.
Did you find an issue with the current logic that this fixes?
I migrated an Elasticsearch cluster (in which all nodes had the master, ingest, and data roles) to this chart. The previous master's name was not prefixed by `uname`, so during the rollout caused by the configuration change, while nodes from both the old and new clusters existed, I had to wait several minutes before excluding the old nodes from cluster shard allocation.
I think this check is unnecessary: because the `--fail` option is passed to curl (https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L239), the script will exit if the API returns an error (`$master` won't have a value).
Because the `h=node` query parameter is passed, `$master` will contain only the name of the master node if it is not empty.
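One bash detail may be worth checking alongside the `--fail` argument above: when a command substitution is assigned on the same line as `local`, the line's exit status is that of `local` itself, so `set -e` does not see the failure. The sketch below illustrates this under stated assumptions. The `http` function here is a hypothetical stand-in that always fails (the chart's real helper is not shown in this thread), and `with_local` and `with_split` are illustrative names.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the chart's http helper: it always fails with
# status 22, the exit code curl --fail uses for HTTP errors.
http() { return 22; }

with_local() {
  # The status of this line is the status of `local` (0), so errexit
  # does not fire even though the command substitution failed.
  local master="$(http "/_cat/master?h=node")"
  echo "with_local: continued, master='${master}'"
}

with_split() {
  local master
  # A plain assignment carries the status of the command substitution,
  # so errexit ends the shell here and the echo below never runs.
  master="$(http "/_cat/master?h=node")"
  echo "with_split: continued, master='${master}'"
}

# Run each variant in its own subshell with errexit enabled, so that an
# early exit ends only that subshell and not this demo.
( set -e; with_local ); local_status=$?
( set -e; with_split ); split_status=$?

echo "with_local exit status: ${local_status}"
echo "with_split exit status: ${split_status}"
```

In this sketch `with_local` keeps going with an empty `master`, while `with_split` exits with status 22. Whether the real script exits early therefore depends on how the assignment is written, not only on `set -e` and `curl --fail`.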
Was my explanation not enough? If so, I can restore this, because the problem only happens in a specific migration case.
> I think this is unnecessary because --fail option is passed to curl(https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L239), script will exit if an error is returned from API($master won't have a value).
Thanks for noticing this. The combination of `set -e` and `curl --fail` means that this script might exit early if the call fails. Ideally we want this loop to keep running until we can see that a new master exists that isn't the current pod. Exiting early is not what we want here.
My concern is whether the API returns an error when there is no master. If the API returns a message like "no master yet" with a 200, then it's possible for the script to exit too early. Looking at the Elasticsearch code, it looks like it will actually return a dash (`-`) if no master is found.
> Then I can restore this because this only happens in specific migration case.
Did it just mean that Kubernetes waited for the 120 second timeout while stopping each of the masters?
I think what actually makes a lot more sense is to check that it starts with `{{ template "masterService" . }}`. That way this will always point to the prefix for the right master, even during migrations like the one in https://github.com/elastic/helm-charts/blob/master/elasticsearch/examples/migration/README.md
> Was my explanation not enough?
Sorry for the slow reply. I'm currently on a work trip and only have a few quiet moments to sneak some work in. I'll be at KubeCon in Barcelona next week too, so things will be slow for a little while. If you are at KubeCon, come and say hi at the Elastic booth :)
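The three conditions discussed in this thread can be compared side by side. This is a minimal sketch, not the chart's code: the values of `UNAME`, `MASTER_SERVICE`, and `NODE_NAME` are purely illustrative stand-ins for what the templates and the pod environment would provide, and the function names are hypothetical.

```shell
#!/usr/bin/env bash

# Illustrative values only; in the chart these come from rendered templates.
UNAME="es-new-master"          # stands in for {{ template "uname" . }}
MASTER_SERVICE="es-old-master" # stands in for {{ template "masterService" . }}
NODE_NAME="es-new-master-0"    # the pod running the check

# Original condition: master name starts with uname and is not this node.
uname_prefix_check() { [[ $1 == "${UNAME}"* && $1 != "${NODE_NAME}" ]]; }

# Condition proposed in this PR: any non-empty name that is not this node.
non_empty_check() { [[ $1 && $1 != "${NODE_NAME}" ]]; }

# Reviewer's suggestion: master name starts with masterService.
master_service_check() { [[ $1 == "${MASTER_SERVICE}"* && $1 != "${NODE_NAME}" ]]; }

# Candidate replies: an old-cluster master, another new node, this node,
# the "-" placeholder for "no master", and an empty string.
for master in "es-old-master-1" "es-new-master-1" "es-new-master-0" "-" ""; do
  for check in uname_prefix_check non_empty_check master_service_check; do
    if "$check" "$master"; then verdict="accept"; else verdict="keep waiting"; fi
    printf '%-22s master=%-18s %s\n' "$check" "'${master}'" "$verdict"
  done
done
```

With these assumed values, the non-empty check is the only one that accepts the `-` placeholder, which is the reviewer's concern, while the two prefix checks differ only in which name prefix they treat as a valid master.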
> Did it just mean that Kubernetes waited for the 120 second timeout while stopping each of the masters?
Yes, I had to wait 120 seconds for each master node.
> I think what actually makes a lot more sense is to check that it starts with {{ template "masterService" . }}. That way this will always point to the prefix for the right master even during migrations like in /elasticsearch/examples/migration/README.md@master
Changed to `{{ template "masterService" . }}`!
> I'm currently on a work trip and only have a few quiet moments to sneak some work in. I'll be at Kubecon in Barcelona next week too so things will be slow for a little while. If you are at Kubecon come and say hi at the Elastic booth :)
I'm not going to this KubeCon, but I hope to visit and say hi someday :)
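To illustrate how a loop with the `masterService` prefix check behaves while the cluster has no usable master, here is a small simulation. It is a sketch under assumptions: `MASTER_SERVICE`, `NODE_NAME`, the canned `responses`, and the `HTTP_BODY` variable are all hypothetical, and the stand-in `http` sets a variable instead of printing so that its call counter survives (a `$(...)` substitution would run it in a subshell).

```shell
#!/usr/bin/env bash

MASTER_SERVICE="es-master"  # hypothetical rendered masterService value
NODE_NAME="es-master-0"     # hypothetical name of the pod being stopped

# Canned replies for /_cat/master?h=node: empty (failed call), "-" (no
# master), this node still master, then a different node elected.
responses=("" "-" "es-master-0" "es-master-2")
call=0

# Hypothetical stand-in for the chart's http helper. It stores the reply in
# HTTP_BODY and repeats the last canned reply once the list is exhausted.
http() {
  local last=$(( ${#responses[@]} - 1 ))
  local index=$(( call < last ? call : last ))
  HTTP_BODY="${responses[$index]}"
  call=$(( call + 1 ))
}

wait_for_new_master() {
  local master
  while true; do
    http "/_cat/master?h=node"
    master="${HTTP_BODY}"
    # Accept only a name with the expected prefix that is not this node.
    if [[ $master == "${MASTER_SERVICE}"* && $master != "${NODE_NAME}" ]]; then
      echo "new master: ${master} (after ${call} polls)"
      break
    fi
    # A real script would pause between polls here.
  done
}

wait_for_new_master
```

In this simulation the empty reply, the `-` placeholder, and the node's own name are all skipped, and the loop ends only on the fourth poll.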
jenkins test this please
Apart from the table formatting this LGTM!
Thanks again for another great contribution :)
jenkins test this please
LGTM!
jenkins test this please
Thank you for another great contribution!
${CHART}/tests/*.py
${CHART}/examples/*/test/goss.yaml
In the case of a migration, the master node name's prefix can be different. As we are already getting only the master node name, I think checking whether the string is empty is enough.