kube-aws: Support Multi-AZ workers on AWS #439
Conversation
Force-pushed from 21ba2bd to b6f5bba.
Just gave it a try and worked perfectly 👍

		i,
	)
}
if i == 0 && !instancesNet.Contains(controllerIPAddr) {
If we have the convention that the first subnet must include controllerIP, we should document it (maybe in config).
Thanks for your comment!
I agree that we'd better document it for now.
Also, let me suggest that we should remove it when we introduce an auto-scaling group to manage controller(s) and an ELB pointing to it.
This "convention" remains here because we currently have a singleton controller (which has controllerIP pointing to it) and it is placed in the first subnet "for now" (choosing one of the subnets for the singleton controller doesn't make much sense, but I have proceeded with this convention for now).
If we had an auto-scaling group managing controller(s), we would configure the group to spread EC2 instances evenly over multiple subnets (as I have done for workers in this PR).
Then we wouldn't need to choose the first/second/last subnet to place the singleton controller (AWS auto-scaling does the job for us), or to specify/validate controllerIP (with multiple controllers, workers would point to the master via the DNS name of a load balancer, or a Route53 DNS name pointing to it).
Does this make sense to you? @aaronlevy @colhom
Yup, longer-term plan sounds good to me.
My concern with documenting this in the config.yaml is that it's not immediately clear what is required of the networking configs (e.g. the first subnet must contain controllerIP).
However, I just checked the existing config.yaml, and it looks like we're not being explicit about any of the networking expectations - so this is probably something I can raise as a separate issue.
I have the same feeling about your concern.
Also, at this point, a separate issue sounds OK to me so we can keep proceeding incrementally.
In general the code looks good. I'm not familiar enough with the upstream "ubernetes-lite" proposal and how well this would work with that (deferring to @colhom)
@mumoshu I've read through the
@@ -108,6 +118,12 @@ type Cluster struct {
	RecordSetTTL int               `yaml:"recordSetTTL"`
	HostedZone   string            `yaml:"hostedZone"`
	StackTags    map[string]string `yaml:"stackTags"`
	Subnets      []Subnet          `yaml:"subnets"`
This field (and its subfields) needs to be added to pkg/config/templates/cluster.yaml
I have added documentation in commit 7b58131.
Would you mind reviewing it? 🙇
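As context for the documentation thread above, a sketch of what the documented subnets section might look like in pkg/config/templates/cluster.yaml — the field names follow the struct diff shown earlier, but the zone names and CIDRs are placeholder assumptions, not values from this PR:

```yaml
# Worker subnets, one per availability zone. Optional; when omitted, a
# single subnet is derived from the top-level availabilityZone and
# instanceCIDR settings for backwards compatibility.
# Note: the first subnet must currently contain controllerIP.
# subnets:
#   - availabilityZone: us-west-1a
#     instanceCIDR: "10.0.0.0/24"
#   - availabilityZone: us-west-1b
#     instanceCIDR: "10.0.1.0/24"
```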
@mumoshu this is a breaking change, as If we were to go the route of breaking the config api, I would suggest that we add /cc @aaronlevy
@colhom Thanks for your comments! Excuse me if I'm missing the point. I've tried to make this not a breaking change by making the subnets settings completely optional (see here and there). What I intended is that, even after this change,
@colhom Let me add something to my previous comment. Maybe not about the config api, but I did make the singleton-controller part of the template refer not to the top-level I guess I can make the singleton-controller part of the template refer to the top-level 🙇
@@ -74,6 +75,15 @@ func ClusterFromBytes(data []byte) (*Cluster, error) {
	// as it will with RecordSets
	c.HostedZone = WithTrailingDot(c.HostedZone)

	if len(c.Subnets) == 0 {
		c.Subnets = []Subnet{
Definitely appreciate the effort to keep backwards compatibility here. Given that this is clearly documented, we can later think about deprecating instanceCIDR and availabilityZone.
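The backwards-compatibility fallback discussed here can be sketched as follows — a minimal, self-contained version with assumed struct shapes (the real Cluster struct in the PR has many more fields):

```go
package main

import "fmt"

// Subnet and Cluster mirror the shapes discussed in this PR; the field
// sets are trimmed-down assumptions based on the diff shown above.
type Subnet struct {
	AvailabilityZone string
	InstanceCIDR     string
}

type Cluster struct {
	AvailabilityZone string
	InstanceCIDR     string
	Subnets          []Subnet
}

// defaultSubnets applies the fallback: when no subnets are given,
// synthesize a single subnet from the legacy top-level
// availabilityZone/instanceCIDR settings, so old cluster.yaml files
// keep working unchanged.
func defaultSubnets(c *Cluster) {
	if len(c.Subnets) == 0 {
		c.Subnets = []Subnet{
			{
				AvailabilityZone: c.AvailabilityZone,
				InstanceCIDR:     c.InstanceCIDR,
			},
		}
	}
}

func main() {
	c := &Cluster{AvailabilityZone: "us-west-1a", InstanceCIDR: "10.0.0.0/24"}
	defaultSubnets(c)
	fmt.Println(len(c.Subnets), c.Subnets[0].AvailabilityZone) // prints: 1 us-west-1a
}
```

An explicitly populated Subnets slice is left untouched, which is what makes the new multi-AZ configuration opt-in.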
@mumoshu I agree with your approach to keep backwards compatibility. I'd like to see a few more things.
)

for i, subnet := range cfg.Subnets {
	if subnet.AvailabilityZone == "" {
We should also be checking that none of the subnets overlap with each other.
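The overlap check suggested here could be implemented with the standard library's net package. A minimal sketch (the helper name cidrsOverlap is mine, not from the PR) — two CIDR blocks overlap exactly when either network contains the other's base address:

```go
package main

import (
	"fmt"
	"net"
)

// cidrsOverlap reports whether two CIDR blocks overlap. Since CIDR
// blocks are aligned power-of-two ranges, one block overlapping another
// implies one fully contains the other, so it is enough to test whether
// either network contains the other's base address.
func cidrsOverlap(a, b string) (bool, error) {
	_, netA, err := net.ParseCIDR(a)
	if err != nil {
		return false, fmt.Errorf("invalid CIDR %q: %v", a, err)
	}
	_, netB, err := net.ParseCIDR(b)
	if err != nil {
		return false, fmt.Errorf("invalid CIDR %q: %v", b, err)
	}
	return netA.Contains(netB.IP) || netB.Contains(netA.IP), nil
}

func main() {
	ok, _ := cidrsOverlap("10.0.1.0/24", "10.0.0.0/16")
	fmt.Println(ok) // prints: true (the /16 contains the /24)
}
```

Validation would then run this pairwise over cfg.Subnets and fail fast on the first overlapping pair.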
…n did you want, single AZ or multi AZ? ref coreos#439 (comment)
…subnets to automatically find and choose an appropriate subnet for the specified controllerIP ref 3rd comment in coreos#439 (comment)
… in templates/cluster.yaml ref 2nd comment in coreos#439 (comment)
@colhom Thanks! I have added some commits addressing your comments.
	cfg.ControllerIP,
)

if len(cfg.Subnets) == 0 {
Isn't this case unreachable? Here you ensure that subnets always has at least one element.
Oops sorry, scratch that ;)
Question: If we move this block of logic before c.valid(), can't we do away with this conditional validation behavior and only validate the cfg.Subnets array, as the backwards compatibility measure would have already been taken?
@colhom Thanks for your question!
I guess doing that (overwriting InstanceCIDR with a default value and determining implicit Subnets from the top-level InstanceCIDR and AvailabilityZone) before validation prevents us from doing the validation: did the user specify both the top-level InstanceCIDR/AvailabilityZone and Subnets? (defined here, with the test)
It's a bit confusing, but thinking more generally, I took c.valid() as a validation of the user input. To structurally validate the user input, I thought it would be better not to structurally touch it before validation.
Does this make sense to you?
Btw, the order of the valid() call, the backward-compatibility logic, and the defaults is not very intuitive in my code, and I wish I could somehow make it more intuitive, too 👍
Just not coming up with a nicer idea though 😢
(I guess we could do better if we made the user input and the config separate Go structs. For me, that seemed too much for this PR though.)
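The validate-before-defaulting point made here can be sketched concretely. This is a hypothetical shape, not the PR's actual code — the names clusterInput, subnet, and validateInput are mine — showing the check that rejects configs mixing the legacy top-level settings with the new subnets list, which would be lost if defaults were applied first:

```go
package main

import "fmt"

// subnet and clusterInput model the raw user input before any
// defaulting; field names are assumptions based on the discussion.
type subnet struct {
	AvailabilityZone string
	InstanceCIDR     string
}

type clusterInput struct {
	AvailabilityZone string
	InstanceCIDR     string
	Subnets          []subnet
}

// validateInput runs on the untouched user input, so it can still see
// whether the user set both the legacy top-level fields and subnets.
// Applying the backwards-compatibility defaults first would overwrite
// that distinction and make this check impossible.
func validateInput(in clusterInput) error {
	legacy := in.AvailabilityZone != "" || in.InstanceCIDR != ""
	if legacy && len(in.Subnets) > 0 {
		return fmt.Errorf("top-level availabilityZone/instanceCIDR must not be combined with subnets")
	}
	return nil
}

func main() {
	err := validateInput(clusterInput{AvailabilityZone: "us-west-1a"})
	fmt.Println(err) // prints: <nil>
}
```

Splitting input and resolved config into two structs, as the comment suggests, would make this ordering explicit in the types rather than in call order.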
@mumoshu thanks for the changes, they look good. I have a few minor comments I've left. I feel like the code could be a bit cleaner if
if err != nil {
	return nil, fmt.Errorf("invalid instanceCIDR: %v", err)
}
if instanceCIDR.Contains(controllerIPAddr) {
I see that my prior comment about "still checking controller is in first subnet" has already been fixed! Sorry, I blame github for being weird ;)
Nice catch 👍 I realized it while testing and force-pushed a squashed commit before your review 😅
@colhom Thanks for your review! I have replied to you in #439 (comment)
	controllerSubnetFound = true
} else if !controllerSubnetFound && i == lastSubnetIndex {
	return nil, fmt.Errorf("Fail-fast occurred possibly because of a bug: ControllerSubnetIndex couldn't be determined for subnets (%v) and controllerIP (%v)", stackConfig.Subnets, stackConfig.ControllerIP)
}
If I understand this loop correctly, I think it could be made a bit simpler by dropping lastSubnetIndex and moving this final check outside of the loop:
```go
controllerSubnetFound := false
for i, subnet := range stackConfig.Subnets {
	_, instanceCIDR, err := net.ParseCIDR(subnet.InstanceCIDR)
	if err != nil {
		return nil, fmt.Errorf("invalid instanceCIDR: %v", err)
	}
	if instanceCIDR.Contains(controllerIPAddr) {
		stackConfig.ControllerSubnetIndex = i
		controllerSubnetFound = true
	}
}
if !controllerSubnetFound {
	return nil, fmt.Errorf("Fail-fast occurred possibly because of a bug: ControllerSubnetIndex couldn't be determined for subnets (%v) and controllerIP (%v)", stackConfig.Subnets, stackConfig.ControllerIP)
}
```
Thanks! Addressed in ea9f366
@mumoshu sorry for the delay- this looks good. Rebase and squash the commits, and I'll run it through tests.
@colhom Thanks, I have just rebased/squashed this 👍
Thanks @mumoshu !
One step forward to achieve high availability throughout the cluster.
This change allows you to specify multiple subnets in cluster.yaml to make the workers' ASG spread instances over those subnets. Differentiating each subnet's availability zone results in H/A of workers.
Beware that this change itself does nothing for H/A of masters.
Possibly relates to #147, #100
How I have tested this
I have tested this by running
./build && rm -rf cluster.yaml userdata/ credentials/ && bin/kube-aws init *several options* && bin/kube-aws render && bin/kube-aws up
with and without the newly added subnets section in cluster.yaml.
For example, given a cluster.yaml:
After modifying the worker ASG's desired capacity to 2, the result is 2 worker nodes spread across availability zones (this time they are ap-northeast-1a and ap-northeast-1c).
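For illustration, a hypothetical subnets section spanning the two availability zones mentioned above might look like the following — the CIDRs and the exact layout are assumptions for illustration, not the author's actual test config:

```yaml
# Hypothetical cluster.yaml fragment - CIDRs are placeholders.
# The first subnet must contain controllerIP; the worker ASG spreads
# instances across all listed subnets.
controllerIP: "10.0.0.50"
subnets:
  - availabilityZone: ap-northeast-1a
    instanceCIDR: "10.0.0.0/24"
  - availabilityZone: ap-northeast-1c
    instanceCIDR: "10.0.1.0/24"
```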