
[controller] Support multi instance placement add #275

Merged
notbdu merged 4 commits into master from bdu/placement-multi-add on Mar 2, 2021

Conversation

notbdu (Collaborator) commented Feb 25, 2021

Modifies the placement client Add() API to accept multiple instances.
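For context, a minimal sketch of the shape of that change, not the exact operator interface (the prior single-instance form is assumed from the description, and the package/interface names here are illustrative):

package placement

import placementpb "github.com/m3db/m3/src/cluster/generated/proto/placementpb"

// Client sketches the updated API: Add now accepts a slice of instances,
// so an entire scale-up can be applied as one placement update instead of
// one call per pod (previously Add presumably took a single instance).
type Client interface {
	// Add adds the given instances to the placement in a single call.
	Add(instances []*placementpb.Instance) error
}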

notbdu force-pushed the bdu/placement-multi-add branch 2 times, most recently from ee06603 to 05f3c48, on February 25, 2021 at 15:52
codecov bot commented Feb 25, 2021

Codecov Report

Merging #275 (9447ac0) into master (7a6466e) will increase coverage by 0.00%.
The diff coverage is 63.82%.


@@           Coverage Diff           @@
##           master     #275   +/-   ##
=======================================
  Coverage   76.01%   76.02%           
=======================================
  Files          32       32           
  Lines        2381     2394   +13     
=======================================
+ Hits         1810     1820   +10     
- Misses        427      429    +2     
- Partials      144      145    +1     


schallert (Collaborator) left a comment

LGTM. A few minor nits.

One general question: do our current "ready replica" checks guarantee that we'll add all pods at once, rather than accidentally adding one at a time?

For example, say we scale a set from 2 to 4 pods. At time T1, 1 of the 2 new pods is up. Are our readiness checks structured in such a way that we won't immediately add the 3rd pod, and will wait until we can evaluate the two new ones at once?

Comment on lines 373 to 374
err := fmt.Errorf("error creating instance for pod %s", pod.Name)
c.logger.Error(err.Error())

Nit: take advantage of structured logging here, e.g. logger.Error("error creating instance for pod", zap.String("pod", ...)).
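A rough sketch of that structured form (assuming pod, c.logger, and the fmt / go.uber.org/zap imports are already in scope in this file):

err := fmt.Errorf("error creating instance for pod %s", pod.Name)
c.logger.Error("error creating instance for pod",
	zap.String("pod", pod.Name), // pod name as a field rather than in the message
	zap.Error(err),
)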

func (c *M3DBController) addPodsToPlacement(cluster *myspec.M3DBCluster, pods []*corev1.Pod) error {
	var (
		instances = make([]*placementpb.Instance, 0, len(pods))
		reasonBuf = bytes.NewBufferString("adding pods to placement (")

This is a pretty different convention from how we usually build up logs, and this isn't very perf-sensitive code.

I would vote for just adding to an array of strings, which are the names of the instances / pods, and then calling zap.Strings.
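A sketch of that alternative; instanceForPod here is a hypothetical stand-in for the existing per-pod conversion logic:

podNames := make([]string, 0, len(pods))
instances := make([]*placementpb.Instance, 0, len(pods))
for _, pod := range pods {
	inst, err := instanceForPod(pod) // hypothetical helper: build a placementpb.Instance from the pod
	if err != nil {
		return err
	}
	instances = append(instances, inst)
	podNames = append(podNames, pod.Name)
}
c.logger.Info("adding pods to placement", zap.Strings("pods", podNames))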

if err != nil {
err := fmt.Errorf("error adding pod to placement: %s", pod.Name)
err := fmt.Errorf("error: %s", reason)

Same thing here re: taking the chance to structure the .Error log below.
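For instance, a hedged sketch of the structured version, reusing the podNames slice from the sketch above:

c.logger.Error("error adding pods to placement",
	zap.Strings("pods", podNames),
	zap.Error(err),
)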

"github.com/m3db/m3/src/cluster/placement"
dbns "github.com/m3db/m3/src/dbnode/generated/proto/namespace"
"github.com/m3db/m3/src/query/generated/proto/admin"

appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
v1 "k8s.io/api/core/v1"

Nit: this is already imported as corev1 on the line above.
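i.e. the quoted import block with the duplicate alias dropped would simply be:

import (
	"github.com/m3db/m3/src/cluster/placement"
	dbns "github.com/m3db/m3/src/dbnode/generated/proto/namespace"
	"github.com/m3db/m3/src/query/generated/proto/admin"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)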

notbdu (Collaborator, Author) commented Mar 2, 2021

In practice, I observed the operator bringing up one pod at a time (waiting for each to be ready before proceeding) and then expanding the placement only once all requested pods were up/ready/bootstrapped (with no shards). I observed this pattern in RC testing, scaling from 2 -> 4 and from 2 -> 6 nodes.

In the code it looks like we add a pod at a time here:

var newCount int32
if current < desired {
	newCount = current + 1
} else {
	newCount = current - 1
}
setLogger.Info("resizing set, desired != current", zap.Int32("newSize", newCount))

And then, once the number of desired pods equals the number of current pods, we expand the placement here (the placement shouldn't be expanded before this point):

if desired == current {
	// If the set is at its desired size, and all pods in the set are in the
	// placement, there's nothing we need to do for this set.
	if current == inPlacement {
		continue
	}
	// If the set is at its desired size but there's pods in the set that are
	// absent from the placement, add pods to placement.
	if inPlacement < current {
		setLogger.Info("expanding placement for set")
		return c.expandPlacementForSet(cluster, set, group, placement)
	}
}

This can be observed in the bootstrap metrics as well (2 -> 4 and then 2 -> 6):
[Screenshot: bootstrap metrics during the 2 -> 4 and 2 -> 6 scale-ups, 2021-03-01]

notbdu merged commit 66671e8 into master on Mar 2, 2021
notbdu deleted the bdu/placement-multi-add branch on March 2, 2021 at 23:09
soundvibe added a commit that referenced this pull request Apr 1, 2021
* master:
  Backwards compatibility when using the original update annotation with an OnDelete update strategy (#284)
  Add support for parallel node updates within a statefulset (#283)
  Support namespace ExtendedOptions in cluster spec (#282)
  [controller] Support multi instance placement add (#275)
  [gomod] Update M3DB dependency (#277)
  [cmd] Fix instrument package name (#280)

# Conflicts:
#	pkg/k8sops/annotations/annotations.go