This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Use existing subnets when creating/updating cluster #227

Merged: 6 commits merged into master on Jan 24, 2017

Conversation

@iamsaso (Contributor) commented Jan 10, 2017

If you define subnetId, you can reuse existing subnets.

subnets:
  - availabilityZone: us-east-1a
    instanceCIDR: "10.0.1.0/24"
    subnetId: "subnet-xxxxxxxx"
  - availabilityZone: us-east-1b
    instanceCIDR: "10.0.2.0/24"
    subnetId: "subnet-xxxxxxxx"
...

There is still some work needed to support node pools, but I would love to hear opinions or any suggestions on this approach.

It's backward compatible: it uses CloudFormation Parameters to overload the subnets with either existing subnet IDs or references to newly created resources.

Linked to #52.

@redbaron (Contributor) commented Jan 10, 2017

I am not terribly familiar with CF, but I can't see how this would work:
in a CloudFormation template a subnet must be referenced either by its ID (subnet-xxxxx) or by a reference to a logical resource created within the same stack, i.e. { "Ref": "Subnet0" }. Your solution references the logical resource name without the "Ref" function, so CloudFormation would treat it as a literal subnet ID and it would likely fail.

@mumoshu (Contributor) commented Jan 10, 2017

Thanks for the PR @Sasso!

In addition to what @redbaron described, I'd recommend you take a look at https://github.com/coreos/kube-aws/blob/37c242126fd56db938c3f170e51fddd7a6757207/nodepool/config/config.go#L228 to see how it could be achieved; it's not tested, but maybe what you'd need is something like:

func (s Subnet) LogicalName() string {
	logicalNameOfTheSubnet := "" // TODO: derive the subnet's CloudFormation logical name somehow
	return logicalNameOfTheSubnet
}

func (s Subnet) Ref() string {
	if s.SubnetId != "" {
		return fmt.Sprintf("%q", s.SubnetId)
	}
	return fmt.Sprintf(`{"Ref" : "%s"}`, s.LogicalName())
}

Then call the Ref() func from the template like:

"Properties": {
          "RouteTableId": "{{$.RouteTableID}}",
          "SubnetId": {{$subnet.Ref}}
        },

@iamsaso (Contributor, PR author) commented Jan 10, 2017

@redbaron, @mumoshu thanks for pointing me in the right direction! I pushed an update and I'm testing it now.

@@ -299,6 +299,9 @@ type Cluster struct {
 type Subnet struct {
 	AvailabilityZone string `yaml:"availabilityZone,omitempty"`
 	InstanceCIDR     string `yaml:"instanceCIDR,omitempty"`
+	SubnetId         string `yaml:"subnetId,omitempty"`
+	Create           bool

Contributor:

This looks redundant; it is always equal to (subnet.SubnetId != ""), so we might as well remove it.

@@ -1032,6 +1044,9 @@ func (c *Cluster) ValidateExistingVPC(existingVPCCIDR string, existingSubnetCIDR
 	// Loop through all subnets
 	// Note: legacy instanceCIDR/availabilityZone stuff has already been marshalled into this format
 	for _, subnet := range c.Subnets {
+		if !subnet.Create {

Contributor:

Validation is still needed; instead of checking the IP ranges, check that the subnet actually exists.
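
For illustration, a minimal sketch (not part of this PR) of what such an existence check could look like using the aws-sdk-go EC2 client; the function name and how it would be wired into kube-aws are assumptions:

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// validateExistingSubnets verifies that every user-supplied subnet ID exists
// in the target account/region, instead of re-checking its CIDR range.
func validateExistingSubnets(svc *ec2.EC2, subnetIds []string) error {
	if len(subnetIds) == 0 {
		return nil
	}
	out, err := svc.DescribeSubnets(&ec2.DescribeSubnetsInput{
		SubnetIds: aws.StringSlice(subnetIds),
	})
	if err != nil {
		// DescribeSubnets returns an error if any of the given IDs does not exist
		return fmt.Errorf("failed to describe subnets %v: %v", subnetIds, err)
	}
	if len(out.Subnets) != len(subnetIds) {
		return fmt.Errorf("expected %d subnets, AWS returned %d", len(subnetIds), len(out.Subnets))
	}
	return nil
}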

@iamsaso (author):

Is there a reason validation is only done in create? https://github.com/coreos/kube-aws/blob/master/cluster/cluster.go#L126

Shouldn't we validate on update and render too?

@@ -169,8 +169,10 @@ controller:
 # subnets:
 #   - availabilityZone: us-west-1a
 #     instanceCIDR: "10.0.0.0/24"
+#     subnetId: "subnet-xxxxxxxx" #optional

Contributor:

Ideally it should work with just subnetId specified:

subnets:
 - subnetId: subnet-aaaaa
 - subnetId: subnet-zzzzz

Does it work like that already?

@iamsaso (author):

No, I did not pull any data from AWS; the template uses {{$subnet.AvailabilityZone}} in some places, so I used the static value. I'll look into pulling that from the existing subnet.

Contributor:

Why is availabilityZone even needed if existing subnets are specified?

@iamsaso (author):

So I tried getting the data from AWS, but I couldn't find a nice place to put those checks, since I would also need to set availabilityZone and instanceCIDR to avoid breaking other things; other validation started failing when I removed them.

I don't mind setting availabilityZone and instanceCIDR, and I will leave it as is for now; it would take me too much time to do this correctly. If anyone has a suggestion, an easy fix, or wants to help with this, that would be awesome.
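
For reference, a hedged sketch (not from this PR) of how availabilityZone and instanceCIDR could be filled in from AWS when only subnetId is given, reusing the aws and ec2 packages from the sketch above; fillSubnetDetails is a hypothetical helper:

// fillSubnetDetails looks up AvailabilityZone and InstanceCIDR for subnets
// that only specify an existing subnetId, so the rest of the validation and
// templating keeps working unchanged.
func fillSubnetDetails(svc *ec2.EC2, subnets []Subnet) error {
	for i := range subnets {
		if subnets[i].SubnetId == "" {
			continue // this subnet will be created by CloudFormation; nothing to look up
		}
		out, err := svc.DescribeSubnets(&ec2.DescribeSubnetsInput{
			SubnetIds: []*string{aws.String(subnets[i].SubnetId)},
		})
		if err != nil || len(out.Subnets) != 1 {
			return fmt.Errorf("could not look up subnet %s: %v", subnets[i].SubnetId, err)
		}
		subnets[i].AvailabilityZone = aws.StringValue(out.Subnets[0].AvailabilityZone)
		subnets[i].InstanceCIDR = aws.StringValue(out.Subnets[0].CidrBlock)
	}
	return nil
}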

@@ -1037,10 +1027,10 @@
 }
 {{if $.ElasticFileSystemID}}
 ,
-        "{{$subnetLogicalName}}MountTarget": {
+        "{{$subnet.SubnetLogicalName}}MountTarget": {

@redbaron commented Jan 10, 2017:

This makes a confusing resource name if we use existing subnets. Not sure it is worth fixing though :)
Actually, this won't work: a subnetId contains a '-', which is not alphanumeric and therefore can't be used in CF resource names.

@iamsaso (author):

I also need to move this out of the subnet create block

@codecov-io commented Jan 10, 2017

Current coverage is 67.49% (diff: 40.74%)

No coverage report found for master at c538e61.

Powered by Codecov. Last update c538e61...bc7c8a1

@iamsaso (Contributor, PR author) commented Jan 10, 2017

I also added validation for update and export; it seems bad not to validate the data there.

@jeremyd (Contributor) commented Jan 12, 2017

This works now. Tested without subnets (regular mode) and with subnetIDs (pre-existing). All tests pass! Anything else we should do to get this accepted? Thanks!

@redbaron (Contributor):

Tested it too today; it works as expected. I didn't test node pools yet, though.

@mumoshu (Contributor) commented Jan 13, 2017

@Sasso Thanks for your effort!
I'd greatly appreciate it if you could add some test cases.

@jeremyd (Contributor) commented Jan 13, 2017

We use node pools with the static IDs and it is working. However, I went to test the scenario where users do not have pre-existing subnets and I'm having some problems. We will keep hacking on this; it might be a few days. FYI, thanks!

@redbaron (Contributor):

@jeremyd users NOT having pre-existing subnets is the exact scenario kube-aws originally supports. What kind of problem did you encounter? Does this PR break it?

@jeremyd (Contributor) commented Jan 13, 2017 via email

@mumoshu (Contributor) commented Jan 14, 2017

@jeremyd If you meant the E2E test suite, it isn't hard-coded to a CoreOS AWS account (I don't even have access to one 😝); you bring your own account to run it. FYI, the usage is documented in https://github.com/coreos/kube-aws/tree/master/e2e

@jmcarp (Contributor) commented Jan 15, 2017

It looks like this PR adds a subset of the functionality in #195. @Sasso: could you take a look at #195 and send a sub-PR there if you see anything missing? I think we'll get this done faster if we join forces.

@redbaron (Contributor):

@jmcarp this is a much simpler change than #195, and it is always easier to merge big changes in small pieces. If this one is merged, then #195 can make use of it.

@redbaron (Contributor):

@Sasso, here is a branch with the conflicts resolved: https://github.com/HotelsDotCom/kube-aws/tree/pull-227-conflicts-1

@redbaron (Contributor):

@Sasso, I also added a commit to remove validation during cluster.Update(), as it breaks the kube-aws update command if the subnets were created by kube-aws (i.e. don't have subnetId specified).

@redbaron (Contributor):

@jeremyd, can you check whether the latest version of this PR still breaks your case?

@mumoshu (Contributor) commented Jan 23, 2017

@Sasso @jeremyd @redbaron It seems like my cluster fails in kube-aws validate when there's no subnetLogicalName: mysubnetname defined in cluster.yaml. Is my understanding correct?

$ kube-aws validate --s3-uri s3://mybucket/clusters/kubeawstest4
Validating UserData...
UserData is valid.

Validating stack template...
Error: invalid cloudformation stack: ValidationError: Template format error: Resource name '' is non alphanumeric.

The stack template exported via kube-aws --export contains this snippet:

"": {
      "Properties": {
        "AvailabilityZone": "ap-northeast-1a",
        "CidrBlock": "10.0.0.0/24",
        "MapPublicIpOnLaunch": true,
        "Tags": [
          {
            "Key": "Name",
            "Value": "kubeawstest4-"
          },
          {
            "Key": "KubernetesCluster",
            "Value": "kubeawstest4"
          }
        ],
        "VpcId": {
          "Ref": "VPC"
        }
      },
      "Type": "AWS::EC2::Subnet"
    },

which indeed seems to be the source of the error.

@mumoshu (Contributor) commented Jan 23, 2017

Would you mind if I add some code to default subnetLogicalName to something like subnet${index} which we had before?
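
A rough sketch of what such defaulting could look like (assuming a SubnetLogicalName field on the Subnet struct; the field name and the exact "Subnet0"-style scheme here are assumptions, not the merged code):

// defaultSubnetLogicalNames falls back to index-based names ("Subnet0",
// "Subnet1", ...) when no subnetLogicalName is given in cluster.yaml, since
// CloudFormation logical IDs must be alphanumeric and may not be empty.
func (c *Cluster) defaultSubnetLogicalNames() {
	for i := range c.Subnets {
		if c.Subnets[i].SubnetLogicalName == "" {
			c.Subnets[i].SubnetLogicalName = fmt.Sprintf("Subnet%d", i)
		}
	}
}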

 ,
-        "{{$subnetLogicalName}}RouteTableAssociation": {
+        "{{$subnet.SubnetLogicalName}}RouteTableAssociation": {

Contributor:

Nit, but we probably need to create a route table association if and only if one doesn't already exist for the subnet; otherwise I guess cluster creation fails because of a duplicated association in the route table?

 {{if $.ElasticFileSystemID}}
 ,
-        "{{$subnetLogicalName}}MountTarget": {
+        "{{$subnet.SubnetLogicalName}}MountTarget": {

@mumoshu commented Jan 24, 2017:

Note: When the $subnet is an existing one, we probably need to create a mount target if and only if it doesn't exist for the pair of the subnet and the EFS.
See #208 (comment)

 #   - availabilityZone: us-west-1b
 #     instanceCIDR: "10.0.1.0/24"
+#     subnetId: "subnet-xxxxxxxx" #optional

Contributor:

Note: subnetId will be renamed to id in the future according to #195 (comment)
cc @icereval

@mumoshu (Contributor) commented Jan 24, 2017

Defaulting of subnetLogicalName is addressed in b7adebb

Update: The branch has been rebased and the commit is now af7fad7

@mumoshu merged commit 68d88b9 into kubernetes-retired:master on Jan 24, 2017
@mumoshu (Contributor) commented Jan 24, 2017

Thanks to all for your efforts on this 🙇
I've merged this to get the ball rolling.
Please send further pull requests for fixes and improvements if necessary.

@jeremyd (Contributor) commented Jan 27, 2017 via email

kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018
…ed#227)

This commit introduces new keys named `subnetId` and `subnetLogicalName` into items under the top-level `subnets` in `cluster.yaml` to allow deployments to existing subnets while customizing logical names of subnets.

Notes on the current implementation:

* `subnetId` will be changed to just `id` in the future. See kubernetes-retired#227 (comment)
* Route table associations and EFS mount targets may fail while being created in some cases. See kubernetes-retired#227 (comment) and kubernetes-retired#227 (comment)

Notes from original commits before squashing:

* Launch in existing subnets
* Validate update and export
* Don't try to validate config during update
* Move r53 checks to create, update nodepool stack template