
Add only one master node at a time #1654

Merged: 7 commits, Sep 2, 2019

Conversation

@sebgl (Contributor) commented Aug 28, 2019

To better match zen1 and zen2 requirements (especially zen1),
we should only create one master node at a time.

This is achieved here by:

  • slowly incrementing master StatefulSets replicas, as long as the
    upscaleState allows us to do so
  • inspecting the current nodes at upscaleState initialization. If the
    cluster is not bootstrapped yet, any number of masters can be created.
    Otherwise, we allow master node creation only if there isn't already a
    master in the process of being created.
  • because the above point can hardly tolerate an out-of-date cache, we
    check pods vs. StatefulSet spec expectations beforehand.
  • zen1 & zen2 configuration in elasticsearch.yml now needs to be patched
    after adjusting the number of replicas

Some side-effects:

  • the cluster bootstrap/UUID annotation is moved to its own file, since it
    is no longer only related to zen2
  • downscale expectations checks can be removed, since they are now
    performed at a higher level
  • ESState is shared between upscales and rolling upgrades
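The one-master-at-a-time rule described above can be sketched as follows. This is a minimal, hypothetical model (the names `upscaleState` and `limitMasterReplicas` and their signatures are illustrative, not the operator's actual API): before bootstrap any number of masters is allowed; afterwards, each reconciliation may grow a master StatefulSet by at most one replica, and only if no other master creation is pending.

```go
package main

import "fmt"

// upscaleState is a hypothetical, simplified model of the PR's upscale
// bookkeeping: whether the cluster has bootstrapped, and whether a master
// node may currently be created (false if one is already joining).
type upscaleState struct {
	isBootstrapped      bool
	allowMasterCreation bool
}

// limitMasterReplicas caps the target replica count of a master
// StatefulSet so that at most one new master is added per reconciliation.
func (s *upscaleState) limitMasterReplicas(actual, target int32) int32 {
	if !s.isBootstrapped {
		// Before cluster bootstrap, any number of masters may be created.
		return target
	}
	if target <= actual {
		return target // not an upscale: nothing to limit
	}
	if !s.allowMasterCreation {
		return actual // a master is already being created: wait
	}
	// Allow exactly one additional master this round.
	s.allowMasterCreation = false
	return actual + 1
}

func main() {
	st := &upscaleState{isBootstrapped: true, allowMasterCreation: true}
	fmt.Println(st.limitMasterReplicas(2, 5)) // 3: one master at a time
	fmt.Println(st.limitMasterReplicas(3, 5)) // 3: previous creation still pending
}
```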

@sebgl sebgl added the >enhancement Enhancement of existing functionality label Aug 28, 2019
@sebgl sebgl force-pushed the add-one-master-at-a-time branch from 9894f81 to 4306cbf on August 29, 2019 10:33
```go
	}

	return true, nil
}
```
@sebgl (Contributor, Author):

When moving to onDelete, I would expect this to become:

  1. check that the expected number of pods is running (same first half of the function)
  2. check our deletion expectations: if we did client.Delete(pod, ...) at some point, make sure this pod has indeed disappeared from our cache. This would replace the second half of the function: we would not really care about the revision here. As long as all pods are there and no deletion is in progress (for potential pod recreation by the sset controller), I think we should be fine?
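The deletion-expectations check sketched in point 2 above could look like this. It is a hypothetical sketch only (`satisfiedDeletions` and its parameters are illustrative names, not the operator's code): every pod we deleted must have vanished from the cache before we trust the cache again.

```go
package main

import "fmt"

// satisfiedDeletions reports whether every pod we previously deleted via
// client.Delete has disappeared from the cached pod set. If any deleted
// pod is still cached, the cache is out of date and we should requeue.
func satisfiedDeletions(expectedDeleted []string, cachedPods map[string]bool) bool {
	for _, name := range expectedDeleted {
		if cachedPods[name] {
			// Deletion not yet reflected in the cache.
			return false
		}
	}
	return true
}

func main() {
	cache := map[string]bool{"es-master-0": true, "es-master-1": true}
	fmt.Println(satisfiedDeletions([]string{"es-master-2"}, cache)) // true
	fmt.Println(satisfiedDeletions([]string{"es-master-1"}, cache)) // false
}
```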

```diff
 	return results.WithResult(defaultRequeue)
 }

-expectedResources, err := expectedResources(d.Client, d.ES, observedState, keystoreResources)
+expectedResources, err := nodespec.BuildExpectedResources(d.ES, keystoreResources)
 if err != nil {
 	return results.WithError(err)
 }
```
@sebgl (Contributor, Author):

So after this step we may have updated some ssets, and we'll end up with sset update conflicts in the next steps (downscale, rolling upgrade). Which is fine, but I'm wondering whether we should maybe either:

  • requeue here if the downscale step did any change
  • re-check expectations here to catch any change

Contributor:

We may also recheck expectations before downscale and handleRollingUpgrades. It would make it explicit, and maybe less error-prone when this code is updated.
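The suggestion above, making the expectations check explicit before each phase, could be sketched like this. All names here (`reconcilePhases`, the closures) are hypothetical, not the operator's actual reconciler structure:

```go
package main

import "fmt"

// reconcilePhases runs each reconciliation phase (e.g. downscale, rolling
// upgrade) only after re-checking that cache expectations are met. If the
// cache goes stale between phases, we stop early and let the next
// reconciliation retry from fresh data.
func reconcilePhases(expectationsMet func() bool, phases ...func() error) error {
	for _, phase := range phases {
		if !expectationsMet() {
			// Cache is out of date: bail out, no error, implicit requeue.
			return nil
		}
		if err := phase(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ran := 0
	checks := 0
	err := reconcilePhases(
		func() bool { checks++; return checks <= 1 }, // cache goes stale after phase 1
		func() error { ran++; return nil },           // downscale
		func() error { ran++; return nil },           // rolling upgrade
	)
	fmt.Println(err, ran) // the second phase is skipped
}
```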

@sebgl sebgl changed the title WIP: Add only one master node at a time Add only one master node at a time Aug 29, 2019
@sebgl sebgl marked this pull request as ready for review August 29, 2019 10:43
```go
actual, alreadyExists := actualStatefulSets.GetByName(ssetToApply.Name)
if alreadyExists {
	ssetToApply = adaptForExistingStatefulSet(actual, ssetToApply)
if err := sset.ReconcileStatefulSet(ctx.k8sClient, ctx.scheme, ctx.es, res.StatefulSet); err != nil {
```
Contributor:

Should we not update the expectation for this statefulset if it is updated? E.g.:

  1. an upscale from 2 to 3 masters has just been done
  2. in the meantime, a new statefulset with some new masters has been added by the user
  3. if we have a stale cache for the first statefulset, we would add a fourth master with a m_m_n of 2

Contributor:

My above comment assumes that the new statefulset is at the top of the list returned by adjustResources and would have precedence over the first one.

Collaborator:

I think you are correct. Also, maybe I am missing something, but it looks to me like the limiting only happens on replica increases of existing ssets and does not limit the addition of completely new ones.

@pebrc (Collaborator) commented Sep 2, 2019:

Yes, I just confirmed that my reading of the code is correct. If you go from

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: elasticsearch-test
spec:
  version: 7.3.0
  nodes:
  - name: master-1
    config:
      node.master: true
      node.data: true
    nodeCount: 3
```

to

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: elasticsearch-test
spec:
  version: 7.3.0
  nodes:
  - name: master-1
    config:
      node.master: true
      node.data: true
    nodeCount: 3
  - name: master-2
    config:
      node.master: true
      node.data: true
    nodeCount: 3
```

the new master nodes in the master-2 sset will be created all at once.
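One possible shape for the missing limit (this is a hypothetical sketch, not the operator's code; `capNewMasterSset` and its parameters are illustrative): when a brand-new master StatefulSet appears on an already-bootstrapped cluster, its first reconciliation could be capped at one replica, so its masters are also added one at a time rather than all at once.

```go
package main

import "fmt"

// capNewMasterSset caps the replica count of a brand-new master
// StatefulSet (actual == 0) on a bootstrapped cluster at 1, deferring the
// remaining masters to later reconciliations. Existing ssets are assumed
// to be limited by the replica-increase logic elsewhere.
func capNewMasterSset(isBootstrapped bool, actualReplicas, targetReplicas int32) int32 {
	if !isBootstrapped || actualReplicas > 0 {
		return targetReplicas
	}
	if targetReplicas > 1 {
		return 1
	}
	return targetReplicas
}

func main() {
	// master-2 from the example above: a new sset with nodeCount: 3.
	fmt.Println(capNewMasterSset(true, 0, 3)) // 1, not 3
}
```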

@pebrc (Collaborator) left a comment:

Discussed with @barkbay: we are going to merge as is, so that work on #1442 can build on top of it. We will create two follow-up issues to track the expectations on new stateful sets and the limiting of master creation on new stateful sets.
