Shorten the reconciliation loop duration if Elasticsearch is down #4938
Merged
thbkrkr
merged 13 commits into
elastic:master
from
thbkrkr:short-circuit-reconciliation-if-es-is-down
Oct 19, 2021
pebrc
previously approved these changes
Oct 12, 2021
pebrc
dismissed
their stale review
October 12, 2021 08:45
I think I missed a case where this is a problem, e.g. if 2 out of 3 nodes are boot looping.
pebrc
previously approved these changes
Oct 15, 2021
LGTM
sebgl
reviewed
Oct 15, 2021
thbkrkr force-pushed the short-circuit-reconciliation-if-es-is-down branch from f4d32cb to 397313a
October 19, 2021 09:45
thbkrkr force-pushed the short-circuit-reconciliation-if-es-is-down branch from 397313a to f47c3a2
October 19, 2021 10:06
I think the latest version covers all the feedback.
pebrc
reviewed
Oct 19, 2021
pebrc
approved these changes
Oct 19, 2021
LGTM
Co-authored-by: Peter Brachwitz <peter.brachwitz@gmail.com>
thbkrkr
changed the title
Short-circuit reconciliation if Elasticsearch is down
Shorten the reconciliation loop duration if Elasticsearch is down
Oct 19, 2021
fantapsody
pushed a commit
to fantapsody/cloud-on-k8s
that referenced
this pull request
Feb 7, 2023
…astic#4938) This commit allows for a reconciliation loop that takes 30sec instead of 1min30sec when Elasticsearch is down and immediately updates the Elasticsearch status instead of waiting for another reconciliation loop. The first call to the ES API is now to get the license (extracted from the license reconciliation) and is used to update the `esReachable` bool. Then we skip the steps that require ES to be up (remote clusters settings, updating the annotation with the cluster UUID) using `esReachable`. The Elasticsearch status is updated with the last observed state as soon as we get an error when calling ES. `observedState` is changed so that the last observed state can be fetched dynamically just before updating the status. Otherwise we would have to wait for another reconciliation: at the beginning of an ES reconciliation that will fail because ES is down, the last observed state is very often still green because it was retrieved between 0 and 10 seconds ago.
This commit allows for a reconciliation loop that takes 30sec instead of 1min30sec when Elasticsearch is down and immediately updates the Elasticsearch status instead of waiting for another reconciliation loop.
The first call to the ES API is now to get the license (extracted from the license reconciliation) and is used to update the `esReachable` bool. Then we skip the steps that require ES to be up (remote clusters settings, updating the annotation with the cluster UUID) using `esReachable`.
The Elasticsearch status is updated with the last observed state as soon as we get an error when calling ES.
`observedState` is changed so that the last observed state can be fetched dynamically just before updating the status. Otherwise we would have to wait for another reconciliation: at the beginning of an ES reconciliation that will fail because ES is down, the last observed state is very often still green because it was retrieved between 0 and 10 seconds ago.
Example timeline:
For testing, deploy an ES with a strict pod anti-affinity rule:
Before this change, it took ~3min50sec (including 50sec for the ES shutdown + 2 * (3 * 30)sec reconciliation loops).
With this change, it takes ~1min20sec (including 50sec for the ES shutdown + 1 * (1 * 30)sec reconciliation loop).
Resolves #3496.
Resolves #2939.
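The control flow described in this PR can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual operator code: apart from `esReachable` and `observedState`, which the description names, every type, function and step name here (`esClient`, `Health`, `reconcile`, `"k8s-resources"`, and so on) is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Health is a simplified cluster health value (hypothetical type).
type Health string

// esClient is a hypothetical, minimal view of an Elasticsearch client.
type esClient interface {
	GetLicense() (string, error)
}

// downClient simulates an unreachable Elasticsearch.
type downClient struct{}

func (downClient) GetLicense() (string, error) { return "", errors.New("connection refused") }

// upClient simulates a reachable Elasticsearch.
type upClient struct{}

func (upClient) GetLicense() (string, error) { return "basic", nil }

// result records what one reconciliation did, for illustration only.
type result struct {
	Status Health
	Steps  []string
}

// reconcile takes observedState as a function so that the state is read at
// the moment the status is written, not when the reconciliation started.
func reconcile(c esClient, observedState func() Health) result {
	var res result

	// First ES API call: fetching the license doubles as a reachability probe.
	_, err := c.GetLicense()
	esReachable := err == nil

	if !esReachable {
		// Update the status right away with the latest observed state.
		res.Status = observedState()
	}

	// Steps that do not need the ES API always run.
	res.Steps = append(res.Steps, "k8s-resources")

	if esReachable {
		// Steps that need ES to be up are skipped when it is down.
		res.Steps = append(res.Steps, "remote-clusters", "cluster-uuid-annotation")
		res.Status = observedState()
	}
	return res
}

func main() {
	current := Health("green")
	getter := func() Health { return current }

	snapshot := getter() // what an eager read at loop start would have seen
	current = "red"      // ES goes down after the snapshot was taken

	res := reconcile(downClient{}, getter)
	fmt.Println(snapshot, res.Status, res.Steps)
}
```

Passing a getter rather than a value is what avoids reporting a stale green state: the eager snapshot in `main` still holds the old value, while the status written by `reconcile` reflects the state at update time.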