
Easier to configure, more tightly integrated node pools
This is an implementation of kubernetes-retired#238 from @redbaron, especially what I've described in my comment there (kubernetes-retired#238 (comment)), and an answer to the request "**3. Node pools should be more tightly integrated**" in kubernetes-retired#271 from @Sasso.
I believe this also achieves what was requested by @andrejvanderzee in kubernetes-retired#176 (comment).

After applying this change:

1. All the `kube-aws node-pools` sub-commands are dropped
2. You can now bring up a main cluster and one or more node pools at once with `kube-aws up`
3. You can now update all the sub-clusters, including the main cluster and node pool(s), by running `kube-aws update`
4. You can now destroy all the AWS resources spanning the main cluster and node pools at once with `kube-aws destroy`
5. You can configure node pools by defining a `worker.nodePools` array in `cluster.yaml`
6. `workerCount` is dropped. Please migrate to `worker.nodePools[].count`
7. The `node-pools/` directory, and hence the per-node-pool `node-pools/<node pool name>` directories each containing `cluster.yaml`, `stack-template.json`, and `user-data/cloud-config-worker`, are dropped.
8. A typical local file tree would now look like:
  - `cluster.yaml`
  - `stack-templates/` (generated on `kube-aws render`)
     - `root.json.tmpl`
     - `control-plane.json.tmpl`
     - `node-pool.json.tmpl`
  - `userdata/`
     - `cloud-config-worker`
     - `cloud-config-controller`
     - `cloud-config-etcd`
  - `credentials/`
     - `*.pem` (generated on `kube-aws render`)
     - `*.pem.enc` (generated on `kube-aws validate` or `kube-aws up`)
  - `exported/` (generated on `kube-aws up --export --s3-uri <s3uri>`)
     - `stacks/`
       - `control-plane/`
         - `stack.json`
         - `user-data-controller`
       - `<node pool name = stack name>/`
         - `stack.json`
         - `user-data-worker`
9. A typical object tree in S3 would now look like:
  - `<bucket and directory from s3URI>`/
    - kube-aws/
      - clusters/
        - `<cluster name>`/
          - `exported`/
            - `stacks`/
              - `control-plane/`
                - `stack.json`
                - `cloud-config-controller`
              - `<node pool name = stack name>`/
                - `stack.json`

Implementation details:

Under the hood, kube-aws utilizes CloudFormation nested stacks to delegate management of multiple stacks as a whole.
kube-aws now creates one root stack and its nested stacks: one main (currently named "control-plane") stack and zero or more node pool stacks.
kube-aws operates on S3 to upload all the assets required by all the stacks (root, main, node pools) and then on CloudFormation to create/update/destroy the root stack.
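
As a rough sketch of how this is modeled in code, the new `cfnstack` assets API (added in `cfnstack/assets.go` below) lets each stack build its own set of assets and merge them into a single collection before everything is uploaded to S3. The import path and the literal file contents here are assumptions for illustration only:

```go
package main

import (
	"fmt"

	// Assumed import path for the cfnstack package added in this commit.
	"github.com/coreos/kube-aws/cfnstack"
)

func main() {
	s3URI := "s3://mybucket/mydir"

	// Each stack builds its own assets (stack template, userdata, ...).
	controlPlane := cfnstack.NewAssetsBuilder("control-plane", s3URI).
		Add("stack.json", `{"Resources": {}}`).
		Add("user-data-controller", "#cloud-config").
		Build()

	pool1 := cfnstack.NewAssetsBuilder("pool1", s3URI).
		Add("stack.json", `{"Resources": {}}`).
		Add("user-data-worker", "#cloud-config").
		Build()

	// Merging them yields one collection spanning all stacks, so the
	// whole set can be uploaded to S3 and referenced from the root stack.
	all := controlPlane.Merge(pool1)

	asset := all.FindAssetByStackAndFileName("pool1", "stack.json")
	fmt.Println(asset.Bucket, asset.Key, asset.URL)
}
```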

An example `cluster.yaml` I've been using to test this looks like:

```yaml
clusterName: <your cluster name>
externalDNSName: <your external dns name>
hostedZoneId: <your hosted zone id>
keyName: <your key name>
kmsKeyArn: <your kms key arn>
region: ap-northeast-1
createRecordSet: true
experimental:
  waitSignal:
    enabled: true
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: public1
  - name: public2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: public1
  - name: public2
worker:
  nodePools:
  - name: pool1
    subnets:
    - name: public1
  - name: pool2
    subnets: # former `worker.subnets` introduced in v0.9.4-rc.1 via kubernetes-retired#284
    - name: public2
    instanceType: "c4.large" # former `workerInstanceType` in the top-level
    count: 2 # former `workerCount` in the top-level
    rootVolumeSize: ...
    rootVolumeType: ...
    rootVolumeIOPs: ...
    autoScalingGroup:
      minSize: 0
      maxSize: 10
    waitSignal:
      enabled: true
      maxBatchSize: 2
  - name: spotFleetPublic1a
    subnets:
    - name: public1
    spotFleet:
      targetCapacity: 1
      unitRootVolumeSize: 50
      unitRootVolumeIOPs: 100
      rootVolumeType: gp2
      spotPrice: 0.06
      launchSpecifications:
      - spotPrice: 0.12
        weightedCapacity: 2
        instanceType: m4.xlarge
        rootVolumeType: io1
        rootVolumeIOPs: 200
        rootVolumeSize: 100
```
mumoshu committed Feb 16, 2017
1 parent 99ab3fe commit 7c4f08f
Showing 78 changed files with 3,999 additions and 3,464 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -2,8 +2,7 @@
/bin
/e2e/assets
*~
/config/templates.go
/nodepool/config/templates.go
/core/*/config/templates.go
.idea/
.envrc
coverage.txt
138 changes: 55 additions & 83 deletions Documentation/kubernetes-on-aws-node-pool.md
@@ -13,105 +13,73 @@ Node Pool allows you to bring up additional pools of worker nodes each with a set

## Deploying a Multi-AZ cluster with cluster-autoscaler support with Node Pools

Edit the `cluster.yaml` file to decrease `workerCount`, which is meant to be number of worker nodes in the "main" cluster, down to zero:
kube-aws creates a node pool in a single AZ by default.
On top of that, you can add one or more node pools in another AZ to achieve Multi-AZ.

```yaml
# `workerCount` should be set to zero explicitly
workerCount: 0
# And the below should be added before recreating the cluster
worker:
autoScalingGroup:
minSize: 0
rollingUpdateMinInstancesInService: 0

subnets:
- availabilityZone: us-west-1a
instanceCIDR: "10.0.0.0/24"
```
`kube-aws update` doesn't work when decreasing number of workers down to zero as of today.
Therefore, don't update but recreate the main cluster to catch up changes made in `cluster.yaml`:

```
$ kube-aws destroy
$ kube-aws up \
--s3-uri s3://<my-bucket>/<optional-prefix>
```

Create two node pools, each with a different subnet and an availability zone:

```
$ kube-aws node-pools init --node-pool-name first-pool-in-1a \
--availability-zone us-west-1a \
--key-name ${KUBE_AWS_KEY_NAME} \
--kms-key-arn ${KUBE_AWS_KMS_KEY_ARN}
$ kube-aws node-pools init --node-pool-name second-pool-in-1b \
--availability-zone us-west-1b \
--key-name ${KUBE_AWS_KEY_NAME} \
--kms-key-arn ${KUBE_AWS_KMS_KEY_ARN}
```

Edit the `cluster.yaml` for the first zone:

```
$ $EDITOR node-pools/first-pool-in-1a/cluster.yaml
```
Assuming you already have a subnet and a node pool in the subnet:

```yaml
workerCount: 1
subnets:
- availabilityZone: us-west-1a
instanceCIDR: "10.0.1.0/24"
- name: managedPublicSubnetIn1a
availabilityZone: us-west-1a
instanceCIDR: 10.0.0.0/24

worker:
nodePools:
- name: pool1
subnets:
- name: managedPublicSubnetIn1a
```
Edit the `cluster.yaml` for the second zone:
```
$ $EDITOR node-pools/second-pool-in-1b/cluster.yaml
```
Edit the `cluster.yaml` file to add the second node pool:

```yaml
workerCount: 1
subnets:
- availabilityZone: us-west-1b
instanceCIDR: "10.0.2.0/24"
```
- name: managedPublicSubnetIn1a
availabilityZone: us-west-1a
instanceCIDR: 10.0.0.0/24
- name: managedPublicSubnetIn1c
availabilityZone: us-west-1c
instanceCIDR: 10.0.1.0/24
Render the assets for the node pools including [cloud-init](https://github.com/coreos/coreos-cloudinit) cloud-config userdata and [AWS CloudFormation](https://aws.amazon.com/cloudformation/) template:

```
$ kube-aws node-pools render stack --node-pool-name first-pool-in-1a
$ kube-aws node-pools render stack --node-pool-name second-pool-in-1b
worker:
nodePools:
- name: pool1
subnets:
- name: managedPublicSubnetIn1a
- name: pool2
subnets:
- name: managedPublicSubnetIn1c
```

Launch the node pools:
Launch the secondary node pool by running `kube-aws update`:

```
$ kube-aws node-pools up --node-pool-name first-pool-in-1a \
--s3-uri s3://<my-bucket>/<optional-prefix>
$ kube-aws node-pools up --node-pool-name second-pool-in-1b \
$ kube-aws update \
--s3-uri s3://<my-bucket>/<optional-prefix>
```

Deployment of cluster-autoscaler is currently out of scope of this documentation.
Beware that you have to associate only one AZ with each node pool, or cluster-autoscaler may fail to reliably add nodes on demand: all it does is increase or decrease the desired capacity, so it has no way to selectively add node(s) in a desired AZ.

Also note that deployment of cluster-autoscaler is currently out of scope of this documentation.
Please read [cluster-autoscaler's documentation](https://github.com/kubernetes/contrib/blob/master/cluster-autoscaler/cloudprovider/aws/README.md) for instructions on it.

## Customizing min/max size of the auto scaling group

If you've chosen to power your worker nodes in a node pool with an auto scaling group, you can customize `MinSize`, `MaxSize`, `MinInstancesInService` in `cluster.yaml`:
If you've chosen to power your worker nodes in a node pool with an auto scaling group, you can customize `MinSize`, `MaxSize`, `RollingUpdateMinInstancesInService` in `cluster.yaml`:

Please read [the AWS documentation](http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html#aws-properties-as-group-prop) for more information on `MinSize`, `MaxSize`, `MinInstancesInService` for ASGs.

```
worker:
# Auto Scaling Group definition for workers. If only `workerCount` is specified, min and max will be set to that value and `rollingUpdateMinInstancesInService` will be one less.
autoScalingGroup:
minSize: 1
maxSize: 3
rollingUpdateMinInstancesInService: 2
nodePools:
- name: pool1
autoScalingGroup:
minSize: 1
maxSize: 3
rollingUpdateMinInstancesInService: 2
```
See [the detailed comments in `cluster.yaml`](https://github.com/coreos/kube-aws/blob/master/nodepool/config/templates/cluster.yaml) for further information.
@@ -142,23 +110,27 @@ To add a node pool powered by Spot Fleet, edit node pool's `cluster.yaml`:
```yaml
worker:
spotFleet:
targetCapacity: 3
nodePools:
- name: pool1
spotFleet:
targetCapacity: 3
```

To customize your launch specifications to diversify your pool among instance types other than the defaults, edit `cluster.yaml`:

```yaml
worker:
spotFleet:
targetCapacity: 5
launchSpecifications:
- weightedCapacity: 1
instanceType: t2.medium
- weightedCapacity: 2
instanceType: m3.large
- weightedCapacity: 2
instanceType: m4.large
nodePools:
- name: pool1
spotFleet:
targetCapacity: 5
launchSpecifications:
- weightedCapacity: 1
instanceType: t2.medium
- weightedCapacity: 2
instanceType: m3.large
- weightedCapacity: 2
instanceType: m4.large
```
This configuration would normally result in Spot Fleet bringing up 3 instances to meet your target capacity:
7 changes: 4 additions & 3 deletions build
@@ -18,13 +18,14 @@ fi

echo Building kube-aws ${VERSION}

go generate ./config
go generate ./nodepool/config
go generate ./core/controlplane/config
go generate ./core/nodepool/config
go generate ./core/root/config

if [[ ! "${BUILD_GOOS:-}" == "" ]];then
export GOOS=$BUILD_GOOS
fi
if [[ ! "${BUILD_GOARCH:-}" == "" ]];then
export GOARCH=$BUILD_GOARCH
fi
go build -ldflags "-X github.com/coreos/kube-aws/cluster.VERSION=${VERSION}" -a -tags netgo -installsuffix netgo -o "$OUTPUT_PATH" ./
go build -ldflags "-X github.com/coreos/kube-aws/core/controlplane/cluster.VERSION=${VERSION}" -a -tags netgo -installsuffix netgo -o "$OUTPUT_PATH" ./
161 changes: 161 additions & 0 deletions cfnstack/assets.go
@@ -0,0 +1,161 @@
package cfnstack

import (
	"fmt"
	"regexp"
)

type Assets interface {
	Merge(Assets) Assets
	AsMap() map[assetID]Asset
	FindAssetByStackAndFileName(string, string) Asset
}

type assetsImpl struct {
	underlying map[assetID]Asset
}

type assetID struct {
	StackName string
	Filename  string
}

func NewAssetID(stack string, file string) assetID {
	return assetID{
		StackName: stack,
		Filename:  file,
	}
}

func (a assetsImpl) Merge(other Assets) Assets {
	merged := map[assetID]Asset{}

	for k, v := range a.underlying {
		merged[k] = v
	}
	for k, v := range other.AsMap() {
		merged[k] = v
	}

	return assetsImpl{
		underlying: merged,
	}
}

func (a assetsImpl) AsMap() map[assetID]Asset {
	return a.underlying
}

func (a assetsImpl) findAssetByID(id assetID) Asset {
	asset, ok := a.underlying[id]
	if !ok {
		panic(fmt.Sprintf("[bug] failed to get the asset for the id \"%s\"", id))
	}
	return asset
}

func (a assetsImpl) FindAssetByStackAndFileName(stack string, file string) Asset {
	return a.findAssetByID(NewAssetID(stack, file))
}

type AssetsBuilder interface {
	Add(filename string, content string) AssetsBuilder
	Build() Assets
}

type assetsBuilderImpl struct {
	locProvider AssetLocationProvider
	assets      map[assetID]Asset
}

func (b *assetsBuilderImpl) Add(filename string, content string) AssetsBuilder {
	loc, err := b.locProvider.locationFor(filename)
	if err != nil {
		panic(err)
	}
	b.assets[loc.ID] = Asset{
		AssetLocation: *loc,
		Content:       content,
	}
	return b
}

func (b *assetsBuilderImpl) Build() Assets {
	return assetsImpl{
		underlying: b.assets,
	}
}

func NewAssetsBuilder(stackName string, s3URI string) AssetsBuilder {
	return &assetsBuilderImpl{
		locProvider: AssetLocationProvider{
			s3URI:     s3URI,
			stackName: stackName,
		},
		assets: map[assetID]Asset{},
	}
}

type Asset struct {
	AssetLocation
	Content string
}

type AssetLocationProvider struct {
	s3URI     string
	stackName string
}

type AssetLocation struct {
	ID     assetID
	Key    string
	Bucket string
	Path   string
	URL    string
}

func newAssetLocationProvider(stackName string, s3URI string) AssetLocationProvider {
	return AssetLocationProvider{
		s3URI:     s3URI,
		stackName: stackName,
	}
}

func (p AssetLocationProvider) locationFor(filename string) (*AssetLocation, error) {
	s3URI := p.s3URI

	re := regexp.MustCompile("s3://(?P<bucket>[^/]+)/(?P<directory>.+[^/])/*$")
	matches := re.FindStringSubmatch(s3URI)

	path := fmt.Sprintf("%s/%s", p.stackName, filename)

	var bucket string
	var key string
	if len(matches) == 3 {
		bucket = matches[1]
		directory := matches[2]

		key = fmt.Sprintf("%s/%s", directory, path)
	} else {
		re := regexp.MustCompile("s3://(?P<bucket>[^/]+)/*$")
		matches := re.FindStringSubmatch(s3URI)

		if len(matches) == 2 {
			bucket = matches[1]
			key = path
		} else {
			return nil, fmt.Errorf("failed to parse s3 uri(=%s): The valid uri pattern for it is s3://mybucket/mydir or s3://mybucket", s3URI)
		}
	}

	url := fmt.Sprintf("https://s3.amazonaws.com/%s/%s", bucket, key)
	id := assetID{StackName: p.stackName, Filename: filename}

	return &AssetLocation{
		ID:     id,
		Key:    key,
		Bucket: bucket,
		Path:   path,
		URL:    url,
	}, nil
}