Elasticsearch must be upgraded before the APM Server #2426

thbkrkr · 2020-01-14T16:15:33Z

What did you do?

Deploy Elasticsearch and APM Server in version 7.4.0

Manifest

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es-apm-sample
spec:
  version: 7.4.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
---
apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-apm-sample
spec:
  version: 7.4.0
  count: 1
  elasticsearchRef:
    name: "es-apm-sample"
```

Upgrade Elasticsearch and APM Server to version 7.5.0

Manifest

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: es-apm-sample
spec:
  version: 7.5.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
---
apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-apm-sample
spec:
  version: 7.5.0
  count: 1
  elasticsearchRef:
    name: "es-apm-sample"
```

What did you expect to see?

A green Elasticsearch cluster during the whole process.

What did you see instead? Under which circumstances?

The Elasticsearch cluster goes red for several seconds during the upgrade, as reported by ECK (`kubectl get es`).

This is more visible when the k8s cluster is slow (e.g. kind on my laptop is slower than GKE).

What is going on?

When the manifest with the new stack version is applied, the APM Server container and one Elasticsearch container are recreated in the new version.
The APM Server container is ready long before that of ES.
The APM Server tried to create new indices for the new version.
~~But the shards allocation has been disabled during the rolling upgrade of Elasticsearch.~~
~~We are therefore left with unallocated primary indices during the entire time that the ES instance starts.~~

/_cat/shards when the ES cluster health is reported as red:

apm-7.4.0-error-000001          0 p STARTED    0  283b 10.244.0.6 es-apm-sample-es-default-0
apm-7.4.0-error-000001          0 r STARTED    0  283b 10.244.0.5 es-apm-sample-es-default-1
apm-7.5.0-metric-000001         0 p STARTED    0  230b 10.244.0.6 es-apm-sample-es-default-0
apm-7.5.0-metric-000001         0 r UNASSIGNED                    
apm-7.4.0-transaction-000001    0 p STARTED    0  283b 10.244.0.5 es-apm-sample-es-default-1
apm-7.4.0-transaction-000001    0 r UNASSIGNED                    
apm-7.5.0-transaction-000001    0 p UNASSIGNED                    
apm-7.5.0-transaction-000001    0 r UNASSIGNED                    
apm-7.4.0-span-000001           0 p STARTED    0  283b 10.244.0.6 es-apm-sample-es-default-0
apm-7.4.0-span-000001           0 r UNASSIGNED                    
apm-7.5.0-error-000001          0 p STARTED    0  230b 10.244.0.6 es-apm-sample-es-default-0
apm-7.5.0-error-000001          0 r UNASSIGNED                    
apm-7.4.0-onboarding-2020.01.14 0 p STARTED    1 6.3kb 10.244.0.5 es-apm-sample-es-default-1
apm-7.4.0-onboarding-2020.01.14 0 r UNASSIGNED                    
apm-7.5.0-span-000001           0 p STARTED    0  230b 10.244.0.5 es-apm-sample-es-default-1
apm-7.5.0-span-000001           0 r UNASSIGNED                    
apm-7.5.0-onboarding-2020.01.14 0 p STARTED    1 6.2kb 10.244.0.5 es-apm-sample-es-default-1
apm-7.5.0-onboarding-2020.01.14 0 r UNASSIGNED                    
apm-7.4.0-metric-000001         0 p STARTED    0  283b 10.244.0.6 es-apm-sample-es-default-0
apm-7.4.0-metric-000001         0 r UNASSIGNED
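
As a rough illustration of why this listing corresponds to a red cluster: Elasticsearch reports red when at least one primary shard is not active, and yellow when all primaries are active but some replica is not. The sketch below applies that rule to `_cat/shards` text output. `health_from_cat_shards` is a hypothetical helper, not part of ECK or Elasticsearch, and it assumes the default column order shown above (index, shard, prirep, state).

```python
def health_from_cat_shards(text):
    """Derive green/yellow/red from `_cat/shards` text output.

    Assumes the default column order: index, shard, prirep, state, ...
    Hypothetical helper for illustration only.
    """
    active_states = {"STARTED", "RELOCATING"}
    health = "green"
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        prirep, state = fields[2], fields[3]
        if state in active_states:
            continue
        if prirep == "p":
            return "red"  # an inactive primary makes the cluster red
        health = "yellow"  # an inactive replica degrades it to yellow
    return health


sample = """\
apm-7.5.0-metric-000001         0 p STARTED    0  230b 10.244.0.6 es-apm-sample-es-default-0
apm-7.5.0-metric-000001         0 r UNASSIGNED
apm-7.5.0-transaction-000001    0 p UNASSIGNED
apm-7.5.0-transaction-000001    0 r UNASSIGNED
"""
print(health_from_cat_shards(sample))
```

In the listing above, the only unassigned primary is `apm-7.5.0-transaction-000001`, which is enough to turn the whole cluster red.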

Solution:

It's important not to upgrade an Elasticsearch cluster and an APM Server at the same time.

You have to upgrade the components of your Elastic Stack in the documented order; see https://www.elastic.co/guide/en/elastic-stack/7.5/upgrading-elastic-stack.html#upgrade-order-elastic-stack.
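
One way to picture the ordering constraint is as a guard that holds back the APM Server upgrade until Elasticsearch already runs a version at least as high. This is only a sketch under stated assumptions: `parse_version` and `apm_upgrade_allowed` are hypothetical names, not ECK functions, and the code assumes plain `major.minor.patch` version strings.

```python
def parse_version(version):
    """Turn '7.5.0' into (7, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def apm_upgrade_allowed(es_running_version, apm_target_version):
    """Allow the APM Server upgrade only once Elasticsearch has caught up."""
    return parse_version(es_running_version) >= parse_version(apm_target_version)


# Elasticsearch is still on 7.4.0: hold back the APM Server upgrade to 7.5.0.
print(apm_upgrade_allowed("7.4.0", "7.5.0"))
# Elasticsearch has finished upgrading to 7.5.0: the APM Server may follow.
print(apm_upgrade_allowed("7.5.0", "7.5.0"))
```

During a rolling upgrade, the Elasticsearch version to compare against would presumably be the lowest one still running in the cluster, since the cluster is only fully upgraded once every node has been restarted.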


thbkrkr · 2020-01-14T16:15:43Z

Since it is very easy with ECK to upgrade multiple Elastic Stack components at once, perhaps we should at least document this.

anyasabo · 2020-01-14T16:50:24Z

Related #2353

david-kow · 2020-01-15T08:25:20Z

> The APM Server tried to create new indices for the new version.
> But the shards allocation has been disabled during the rolling upgrade of Elasticsearch.

I'm not sure this is exactly correct, as we do allow primaries to be allocated during upgrades. As @barkbay pointed out to me, this might be caused by allocating a primary to a node that is about to be deleted.

It seems to me we would have the same issue in the general case of creating an index during an upgrade. Should we exclude the pod from allocation before any delete? Right now we seem to do it only during downscales. I think this wouldn't prevent it on its own (we would need to check health again too), but it would shorten the window.
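
For reference, excluding a node from shard allocation is done through Elasticsearch's allocation filtering setting `cluster.routing.allocation.exclude._name`, sent to `PUT _cluster/settings`. A minimal sketch of the request body follows. `exclusion_settings` is a hypothetical helper, the use of `transient` rather than `persistent` is an assumption, and the pod name is only an example following the naming seen in the shard listing.

```python
import json


def exclusion_settings(node_names):
    """Build a `PUT _cluster/settings` body that excludes nodes by name.

    An empty list clears the exclusion (the setting is reset to null).
    Hypothetical helper; `transient` scope is an assumption.
    """
    value = ",".join(node_names) if node_names else None
    return {"transient": {"cluster.routing.allocation.exclude._name": value}}


# Example pod name, assumed from the StatefulSet naming in this issue.
print(json.dumps(exclusion_settings(["es-apm-sample-es-default-2"])))
```

Whether this should be sent before every pod deletion, and not only during downscales, is the open question raised here.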

pebrc · 2020-02-24T10:38:02Z

Closing in favour of #2600

thbkrkr added the >non-issue label Jan 14, 2020

pebrc added the >enhancement label and removed the >non-issue label Feb 3, 2020

pebrc mentioned this issue Feb 24, 2020

Enforce Elastic stack upgrade order #2600

Closed

pebrc closed this as completed Feb 24, 2020
