Refactor downscales and add unit tests #1506

Merged: 10 commits into elastic:master on Aug 9, 2019

Conversation

@sebgl (Contributor) commented Aug 7, 2019

This refactors the StatefulSets downscale code into multiple smaller functions that are easier to test. It also adds unit tests for most of those functions, so that the whole downscale codebase is now covered by unit tests.

Relates to #1287.
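
(For illustration, a minimal sketch of the kind of table-driven unit test this split enables. The ssetDownscale struct and test below are simplified stand-ins written for this summary, not the PR's actual types; only the isReplicaDecrease name is borrowed from the reviewed code.)

package downscale_example

import "testing"

// ssetDownscale is a simplified stand-in capturing the initial and target
// replica counts of a StatefulSet downscale.
type ssetDownscale struct {
	initialReplicas int32
	targetReplicas  int32
}

// isReplicaDecrease returns true if the downscale actually removes replicas.
func (d ssetDownscale) isReplicaDecrease() bool {
	return d.targetReplicas < d.initialReplicas
}

func TestIsReplicaDecrease(t *testing.T) {
	tests := []struct {
		name string
		d    ssetDownscale
		want bool
	}{
		{name: "decrease", d: ssetDownscale{initialReplicas: 5, targetReplicas: 3}, want: true},
		{name: "no change", d: ssetDownscale{initialReplicas: 3, targetReplicas: 3}, want: false},
		{name: "increase", d: ssetDownscale{initialReplicas: 3, targetReplicas: 5}, want: false},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := tt.d.isReplicaDecrease(); got != tt.want {
				t.Errorf("isReplicaDecrease() = %v, want %v", got, tt.want)
			}
		})
	}
}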

@sebgl added the >test (Related to unit/integration/e2e tests) and >refactoring labels on Aug 7, 2019
@sebgl (Contributor, Author) commented Aug 7, 2019

This refactoring does not limit the number of master nodes we can remove at once.
I'll reintroduce it as part of #1281.

Edit: that limit wasn't part of the original code either. I will definitely introduce it with #1281.

@david-kow (Contributor) left a review:

Looks good, a few comments.

case downscale.isReplicaDecrease():
	// adjust the theoretical downscale to one we can safely perform
	performable := calculatePerformableDownscale(ctx, downscale, allLeavingNodes)
	if !performable.isReplicaDecrease() {

Inline review comment on this hunk:

Nice.

@sebgl (Contributor, Author) commented Aug 8, 2019

Jenkins test this please

1 similar comment
@david-kow (Contributor) commented:

Jenkins test this please

@barkbay (Contributor) commented Aug 8, 2019

Not sure if it is related to this PR, but I think I managed to hit a race condition:

I quickly changed the number of expected Pods from 5 -> 3 -> 5.
The Pods have not been deleted:

NAME                                               READY   AGE   CONTAINERS      IMAGES
statefulset.apps/elasticsearch-sample-es-default   5/5     33m   elasticsearch   docker.elastic.co/elasticsearch/elasticsearch:7.2.0
NAME                                    READY   STATUS    RESTARTS   AGE   IP          NODE                                                 NOMINATED NODE   READINESS GATES
pod/elasticsearch-sample-es-default-0   1/1     Running   0          27m   10.60.1.3   gke-michael-dev-cluster-default-pool-e885d7bd-0f29   <none>           <none>
pod/elasticsearch-sample-es-default-1   1/1     Running   0          27m   10.60.0.9   gke-michael-dev-cluster-default-pool-e4e8e5d9-tt13   <none>           <none>
pod/elasticsearch-sample-es-default-2   1/1     Running   0          27m   10.60.2.7   gke-michael-dev-cluster-default-pool-88a37176-xfx3   <none>           <none>
pod/elasticsearch-sample-es-default-3   1/1     Running   0          27m   10.60.0.8   gke-michael-dev-cluster-default-pool-e4e8e5d9-tt13   <none>           <none>
pod/elasticsearch-sample-es-default-4   1/1     Running   0          27m   10.60.1.2   gke-michael-dev-cluster-default-pool-e885d7bd-0f29   <none>           <none>
pod/kibana-sample-kb-d48d76cdd-8s74l    1/1     Running   0          19m   10.60.1.4   gke-michael-dev-cluster-default-pool-e885d7bd-0f29   <none>           <none>

But the last two nodes are still excluded from shard allocation:

{
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "exclude" : {
            "_name" : "elasticsearch-sample-es-default-4,elasticsearch-sample-es-default-3"
          }
        }
      }
    }
  }
}

@sebgl (Contributor, Author) commented Aug 8, 2019

@barkbay good catch!
This is because we return early, but should not:

// scheduleDataMigrations requests Elasticsearch to migrate data away from leavingNodes.
func scheduleDataMigrations(esClient esclient.Client, leavingNodes []string) error {
	if len(leavingNodes) == 0 {
		return nil
	}
	log.V(1).Info("Migrating data away from nodes", "nodes", leavingNodes)
	return migration.MigrateData(esClient, leavingNodes)
}

I'll fix this.
Thanks!
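
(For reference, one possible shape of the fix, sketched as an assumption rather than the actual merged change: drop the early return so the exclusion list is also pushed when it is empty, assuming migration.MigrateData called with an empty list resets the allocation exclude filter.)

// Sketch only: always delegate to MigrateData, even with an empty list, so that
// a previously-set allocation exclude filter gets cleared once no nodes are leaving.
func scheduleDataMigrations(esClient esclient.Client, leavingNodes []string) error {
	log.V(1).Info("Updating data migration exclusions", "nodes", leavingNodes)
	return migration.MigrateData(esClient, leavingNodes)
}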

@sebgl (Contributor, Author) commented Aug 8, 2019

The last commit should correctly reset shard allocation excludes, similar to how it was done before. I created #1522 so we avoid doing it at every single reconciliation.

I think there is a potential race where we'd clear allocation excludes before the Pod is completely deleted (if it's still being terminated, for example). I'd like to handle that in another issue: #1523.
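
(For context, a minimal sketch of what resetting the excludes amounts to at the Elasticsearch API level: removing the cluster.routing.allocation.exclude._name transient setting shown in the output above by setting it to null. The operator does this through its own esclient; the standalone snippet below uses a placeholder URL and omits authentication/TLS.)

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Setting the transient value to null removes the exclude filter from the
	// cluster settings, which is what resetting shard allocation excludes means here.
	body := `{"transient":{"cluster.routing.allocation.exclude._name":null}}`
	req, err := http.NewRequest(http.MethodPut, "http://localhost:9200/_cluster/settings", strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("cluster settings update:", resp.Status)
}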

@barkbay (Contributor) left a review:

LGTM

@anyasabo (Contributor) commented Aug 9, 2019

LGTM, just one question about the fakes

@david-kow (Contributor) left a review:

LGTM

@sebgl merged commit c34da67 into elastic:master on Aug 9, 2019
Labels: >refactoring, >test (Related to unit/integration/e2e tests)
Participants: 5