Fix how cluster.initial_master_nodes is set #2315

Merged
merged 5 commits into from
Jan 2, 2020

@sebgl sebgl commented Dec 20, 2019

Fixes #2291.

We had a bug where cluster.initial_master_nodes was not set when
upgrading a single-master v6 cluster to v7. This commit fixes it
by handling cluster.initial_master_nodes differently.

:: Annotation logic

Instead of relying on the ClusterUUID annotation to know whether a
cluster is bootstrapped, and manually removing it in the special case of a
single v6 -> v7 upgrade, we now rely on a dedicated annotation:
elasticsearch.k8s.elastic.co/initial-master-nodes: node-0,node-1,node-2.

When that annotation is set at the cluster level, it means the cluster
is currently bootstrapping for zen2. Any v7+ master node must have
cluster.initial_master_nodes in its configuration set to the value in
the annotation. This value is not supposed to vary over time; we make
sure that is the case here.

Once we detect the cluster has finished its bootstrap (the current
master of the cluster is a v7+ master node), we remove the annotation,
and remove the configuration setting from master nodes.
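
A minimal sketch of the annotation lifecycle described above, not the code from this PR. The annotation name comes from the description; the helper names (`setInitialMasterNodes`, `initialMasterNodes`, `finishBootstrap`) and the plain `map[string]string` standing in for the cluster's annotations are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Annotation name taken from the PR description.
const initialMasterNodesAnnotation = "elasticsearch.k8s.elastic.co/initial-master-nodes"

// setInitialMasterNodes (hypothetical helper) records the bootstrap node
// names in the annotations. It refuses to overwrite a different value,
// since the value is not supposed to vary while the cluster bootstraps.
func setInitialMasterNodes(annotations map[string]string, nodes []string) error {
	value := strings.Join(nodes, ",")
	if existing, ok := annotations[initialMasterNodesAnnotation]; ok && existing != value {
		return fmt.Errorf("initial master nodes already set to %q, refusing %q", existing, value)
	}
	annotations[initialMasterNodesAnnotation] = value
	return nil
}

// initialMasterNodes (hypothetical helper) returns the annotated node
// names, or nil when the cluster is not currently bootstrapping.
func initialMasterNodes(annotations map[string]string) []string {
	value, ok := annotations[initialMasterNodesAnnotation]
	if !ok || value == "" {
		return nil
	}
	return strings.Split(value, ",")
}

// finishBootstrap (hypothetical helper) removes the annotation once the
// current master is v7+. It reports whether the annotation was removed.
func finishBootstrap(annotations map[string]string, currentMasterMajor int) bool {
	if _, ok := annotations[initialMasterNodesAnnotation]; !ok {
		return false
	}
	if currentMasterMajor < 7 {
		return false
	}
	delete(annotations, initialMasterNodesAnnotation)
	return true
}

func main() {
	annotations := map[string]string{}
	if err := setInitialMasterNodes(annotations, []string{"node-0", "node-1", "node-2"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(initialMasterNodes(annotations)) // [node-0 node-1 node-2]
	fmt.Println(finishBootstrap(annotations, 6)) // false: master is still v6
	fmt.Println(finishBootstrap(annotations, 7)) // true: bootstrap is done
	fmt.Println(initialMasterNodes(annotations)) // []
}
```

While `initialMasterNodes` returns a non-nil value, every v7+ master would get that same list as cluster.initial_master_nodes in its configuration; once it returns nil, the setting is dropped.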

:: When should cluster.initial_master_nodes be set?

There are 2 cases where this setting must be set:

  • when we create a v7+ cluster for the first time
  • when we upgrade/restart a single v6 master to v7: that new master must
    identify itself as a legit master

We don't want to set this setting when:

  • we do a regular rolling-upgrade of multiple master nodes from v6 to v7
  • Elasticsearch is not running in v7+
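
The rules above can be sketched as a single decision function. This is a simplified illustration, not the operator's implementation: the `masterNode` type, the `shouldSetInitialMasterNodes` name, and the idea of comparing "current" and "expected" master lists by name are assumptions.

```go
package main

import "fmt"

// masterNode is a hypothetical, simplified view of a master-eligible node.
type masterNode struct {
	Name         string
	MajorVersion int
}

// shouldSetInitialMasterNodes (hypothetical) decides whether
// cluster.initial_master_nodes must be set, given the masters currently in
// the cluster and the masters expected after reconciliation.
func shouldSetInitialMasterNodes(current, expected []masterNode) bool {
	if len(expected) == 0 {
		return false
	}
	// The setting only makes sense when Elasticsearch runs in v7+.
	for _, m := range expected {
		if m.MajorVersion < 7 {
			return false
		}
	}
	// Case 1: a v7+ cluster is created for the first time.
	if len(current) == 0 {
		return true
	}
	// Case 2: a single v6 master is upgraded/restarted in place to v7.
	if len(current) == 1 && len(expected) == 1 {
		return current[0].MajorVersion == 6 && current[0].Name == expected[0].Name
	}
	// Anything else is treated as a regular rolling upgrade.
	return false
}

func main() {
	v6 := []masterNode{{Name: "es-master-0", MajorVersion: 6}}
	v7 := []masterNode{{Name: "es-master-0", MajorVersion: 7}}
	twoV6 := []masterNode{{Name: "es-master-0", MajorVersion: 6}, {Name: "es-master-1", MajorVersion: 6}}
	twoV7 := []masterNode{{Name: "es-master-0", MajorVersion: 7}, {Name: "es-master-1", MajorVersion: 7}}

	fmt.Println(shouldSetInitialMasterNodes(nil, v7))     // true: new v7 cluster
	fmt.Println(shouldSetInitialMasterNodes(v6, v7))      // true: single v6 master upgraded
	fmt.Println(shouldSetInitialMasterNodes(twoV6, twoV7)) // false: rolling upgrade
	fmt.Println(shouldSetInitialMasterNodes(nil, v6))     // false: not v7+
}
```

Matching the single master by name is what separates an in-place upgrade from a replacement master created under a different name.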

:: Edge cases

The following cases are quite tricky and covered by unit & E2E tests:

  • upgrade from a single v6 master to a single v7 master in a different
    NodeSet. The new master will be created before the old one gets
    removed, hence the setting should not be set.
  • upgrade from a single v6 master to more v7 masters. The new v7 masters
    will be created before the v6 master is upgraded, hence the setting
    should not be set.
  • upgrade from two v6 masters to two v7 masters. This is a regular
    rolling upgrade, hence the setting should not be set. However when the
    first master goes down for upgrade, the cluster becomes unavailable
    since minimum_master_nodes=2.
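
To illustrate the last edge case, here is a small sketch of the zen1 quorum arithmetic, assuming minimum_master_nodes is set to the usual majority value of (masters / 2) + 1. The function names are made up for this example.

```go
package main

import "fmt"

// quorum returns the majority of master-eligible nodes, i.e. the value
// assumed here for minimum_master_nodes.
func quorum(masters int) int {
	return masters/2 + 1
}

// availableWithOneMasterDown reports whether the remaining masters still
// reach quorum while one master is restarted for the upgrade.
func availableWithOneMasterDown(masters int) bool {
	return masters-1 >= quorum(masters)
}

func main() {
	for _, n := range []int{1, 2, 3} {
		fmt.Printf("masters=%d minimum_master_nodes=%d available_during_restart=%t\n",
			n, quorum(n), availableWithOneMasterDown(n))
	}
}
```

With two masters the quorum is 2, so taking one down leaves a single master below quorum; with three masters the quorum is still 2 and the cluster stays available.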

@sebgl sebgl added the >bug Something isn't working label Dec 20, 2019
@barkbay barkbay added the v1.0.0 label Dec 20, 2019

@barkbay barkbay left a comment

I think I found a bug, I will submit a patch on this PR.

pkg/controller/elasticsearch/client/model.go (outdated)
pkg/controller/elasticsearch/client/v6.go (outdated)
@barkbay barkbay self-assigned this Dec 23, 2019

@barkbay barkbay left a comment

LGTM
We can discuss the comments I left in separate issues.

@sebgl sebgl requested a review from barkbay January 2, 2020 13:53

@barkbay barkbay left a comment

LGTM 👍

@sebgl sebgl merged commit bed2c56 into elastic:master Jan 2, 2020
barkbay added a commit to barkbay/cloud-on-k8s that referenced this pull request Jan 3, 2020
* Fix how cluster.initial_master_nodes is set

* Fix V7 comparison in client

* Remove new lines in imports

* Improve readability by extracting checks

* Don't error out if no master yet

Co-authored-by: Michael Morello <michael.morello@gmail.com>
barkbay added a commit that referenced this pull request Jan 3, 2020
…tate across StatefulSets (#2315)(#2339) (#2344)

* Fix how cluster.initial_master_nodes is set (#2315)

Co-authored-by: Michael Morello <michael.morello@gmail.com>

* Reuse the same upscaleState across StatefulSets (#2339)

Co-authored-by: Sebastien Guilloux <contact.sebgl@gmail.com>
mjmbischoff pushed a commit to mjmbischoff/cloud-on-k8s that referenced this pull request Jan 13, 2020
* Fix how cluster.initial_master_nodes is set

* Fix V7 comparison in client

* Remove new lines in imports

* Improve readability by extracting checks

* Don't error out if no master yet

Co-authored-by: Michael Morello <michael.morello@gmail.com>
Labels
>bug Something isn't working v1.0.0
Development

Successfully merging this pull request may close these issues.

Single node cluster upgrade from 6.X to 7.X does not have cluster.initial_master_nodes set
3 participants