Call _nodes/shutdown from pre-stop hook #6544
Conversation
I have not tested all the combinations but LGTM.
Note that this PR will trigger a cluster restart; we should also update https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-upgrading-eck.html#k8s-beta-to-ga-rolling-restart accordingly.
docs/orchestrating-elastic-stack-applications/elasticsearch/prestop.asciidoc
…estop.asciidoc Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>
LGTM, I did a bit of manual testing and it seems to work as expected.
As you mentioned in a comment, the only thing I'm afraid of is unexpected behaviour if the Pod's metadata is not up to date in the client cache. This adds another "best-effort" layer 😄
👍
Fixes #6478
Adds a more complicated shell script to the pre-stop lifecycle hook. This covers the use case described in the referenced issue, which mostly affects larger clusters with lots of data.
This is obviously a best-effort attempt at orchestrating node shutdown more gracefully. Many things can go wrong here. I anticipate that the most common way this logic fails is on slow or unresponsive ES clusters, where we might run into the overall lifecycle hook timeout.
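For illustration only, here is a minimal sketch of what such a hook could look like. It is not the script added in this PR; `ES_URL`, `ES_USER`, `ES_PASSWORD` and `NODE_ID` are placeholder variables assumed to be set (the node id resolution is sketched further down).

```sh
#!/usr/bin/env bash
# Sketch only: register a "restart" type node shutdown and wait, best effort,
# for Elasticsearch to report it as COMPLETE before the kubelet proceeds with
# Pod termination.
set -euo pipefail

ES_URL="${ES_URL:-https://localhost:9200}"           # placeholder endpoint
NODE_ID="${NODE_ID:?node id must be resolved first}" # placeholder node id

# Register the shutdown with the node shutdown API; tolerate failures so we
# never turn a best-effort optimisation into a hard error.
curl -sk -u "${ES_USER}:${ES_PASSWORD}" \
  -X PUT "${ES_URL}/_nodes/${NODE_ID}/shutdown" \
  -H 'Content-Type: application/json' \
  -d '{"type": "restart", "reason": "pre-stop hook"}' || true

# Poll the shutdown status; give up after roughly 60s so we stay well below
# the overall lifecycle hook timeout mentioned above.
for _ in $(seq 1 30); do
  if curl -sk -u "${ES_USER}:${ES_PASSWORD}" \
      "${ES_URL}/_nodes/${NODE_ID}/shutdown" | grep -q '"status":"COMPLETE"'; then
    exit 0
  fi
  sleep 2
done
exit 0 # best effort: never block termination indefinitely
```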
There is also some fragile grep'ing over JSON responses here because we don't have `jq` or similar tools available in the Elasticsearch image.
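As an illustration of that kind of parsing (again a sketch, not the actual script), the node id could be pulled out of the `_nodes/_local` response with `grep` and `sed` alone:

```sh
# Sketch: resolve the local node's id without jq. The response is compact JSON
# of the form {"_nodes":{...},"cluster_name":"...","nodes":{"<id>":{...}}},
# so grab the first key under "nodes". Fragile by design, as noted above.
resp="$(curl -sk -u "${ES_USER}:${ES_PASSWORD}" "${ES_URL}/_nodes/_local")"
NODE_ID="$(echo "${resp}" | grep -oE '"nodes":\{"[^"]*"' | sed 's/.*{"\([^"]*\)".*/\1/')"
echo "resolved node id: ${NODE_ID}"
```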