Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support for capacity_type and list of instance_types in node_groups #1129

Conversation

jonathan-mothership
Copy link
Contributor

@jonathan-mothership jonathan-mothership commented Dec 2, 2020

PR o'clock

Description

  • Adds support for specifying the node group capacity_type to leverage the new AWS provider feature for spot instances in managed node groups.
  • Allows for specifying a list of instance types in the node_group map, defaulting to a list with the instance type in workers_group_defaults.

These features are available in the latest v3 release of the AWS provider, so the required_providers version for aws was pushed to 3.19.0.

With these changes, I was able to switch my existing node groups to using spot with a custom launch configuration so we can pass our user_data!

Checklist

@jonathan-mothership jonathan-mothership changed the title add support for specifying capacity_type in node_groups feat: add support for specifying capacity_type in node_groups Dec 2, 2020
@jonathan-mothership jonathan-mothership changed the title feat: add support for specifying capacity_type in node_groups feat: support for capacity_type and list of instance_types in node_groups Dec 2, 2020
@kpucynski
Copy link

I need this ;) please merge ;)

@antiqe
Copy link

antiqe commented Dec 9, 2020

Thanks for the PR @jonathan-mothership, we waiting this capabilities since multiple month, strongly appreciate. I can't wait the merge 💯

@GolubevV
Copy link
Contributor

Please merge until some known incompatibilities, currently has to fork the fork to use spot managed node groups

@jonathan-mothership
Copy link
Contributor Author

Could we get a review of this? @max-rocket-internet @barryib @mjwilkerson-strateos

@barryib
Copy link
Member

barryib commented Dec 10, 2020

Sorry for the delay. I'll review this in the next couple of days.

@joelfogue
Copy link

joelfogue commented Dec 22, 2020

Any updates on this merge @barryib; thanks by the way for your hardwork; you as well @jonathan-mothership!!

@barryib
Copy link
Member

barryib commented Dec 22, 2020

I'm on it.

@barryib barryib self-assigned this Dec 22, 2020
@barryib
Copy link
Member

barryib commented Dec 23, 2020

@jonathan-mothership thanks for working on this. Can you please update your branch and resolve conflicts. I labeled this PR as a breaking change, because this will force to rebuild existing MNG.

@jonathan-mothership
Copy link
Contributor Author

@barryib all up-to-date!

@stevehipwell
Copy link
Contributor

@barryib do you have a ballpark idea of when this is likely to be released?

@jack1902
Copy link

Been using this for a week now and it certainly works :) (just merged it in and gave it a list of instance_types). Looking forward to it being merged to master but simply merge this into your own forks for now and you can save some $$$

@mrantipyrine
Copy link

@barryib Any chance this will be merged in the near future? Thanks!

@kencieszykowski
Copy link

kencieszykowski commented Jan 13, 2021

@barryib - Any chance we can get a merge this week? Looking to deploy a fleet of clusters with some Spot node groups. Thank you!

@mc-meta
Copy link

mc-meta commented Jan 28, 2021

Any chance to speedup review of this PR and have it eventually merged?

Thanks for support!

@barryib barryib merged commit 8978997 into terraform-aws-modules:master Jan 28, 2021
@barryib
Copy link
Member

barryib commented Jan 28, 2021

Thanks @jonathan-mothership for your contribution and for your patience.

@jonathan-mothership jonathan-mothership deleted the nodegroups/capacity_type branch January 28, 2021 22:20
@jonathan-mothership jonathan-mothership restored the nodegroups/capacity_type branch January 28, 2021 22:28
@anon892
Copy link

anon892 commented Jan 28, 2021

Thanks @jonathan-mothership. @barryib when will it be released with new tag?

@barryib
Copy link
Member

barryib commented Jan 28, 2021

Just shipped v14.0.0.

@philicious
Copy link
Contributor

@barryib @jonathan-mothership this breaks support for LTs that got added by me with #997 and v13.1.0 !

when using an LT, you cannot set instance-type on the MNG itself !

  - instance_types  = each.value["launch_template_id"] != null ? [] : [each.value["instance_type"]]
  + instance_types  = lookup(each.value, "instance_types", null)

when using v14.0, terraform will try to update all MNG, which it shouldnt in case of LT. furthermore, it will fail with

InvalidRequestException: Cannot specify instance types in launch template and API request

@jonathan-mothership
Copy link
Contributor Author

jonathan-mothership commented Feb 2, 2021

Linking this comment to your issue (#1211) for traceability.

@barryib @jonathan-mothership this breaks support for LTs that got added by me with #997 and v13.1.0 !

This was understood to be a breaking change. Since the underlying API supports specifying multiple types, the changes here to accepting multiple types warranted the key name change. Regardless, at the time of release of the changes to the provider, the capacity_type field was already causing a rebuild of the MNGs so the key name change seemed less of an issue.

when using an LT, you cannot set instance-type on the MNG itself !

This is partially true, you can specify one instance type on either the LC or the MNG, but not both, however multiple types is only supported on the MNG API call. See the docs quote below for the official language.

  - instance_types  = each.value["launch_template_id"] != null ? [] : [each.value["instance_type"]]
  + instance_types  = lookup(each.value, "instance_types", null)

This change shouldn't affect your ability to use launch templates with MNGs.

when using v14.0, terraform will try to update all MNG, which it shouldnt in case of LT. furthermore, it will fail with

InvalidRequestException: Cannot specify instance types in launch template and API request

I suspect this is because you are specifying an instance type in your launch template as well as on the MNGs. From the docs:

If you deploy a node group using a launch template, then you can specify zero or one Instance type under Launch template contents in a launch template or you can specify 0-20 instance types for Instance types on the Set compute and scaling configuration page in the console, or using other tools that use the Amazon EKS API. If you specify an instance type in a launch template, and use that launch template to deploy your node group, then you can't specify any instance types in the console or using other tools that use the Amazon EKS API. If you don't specify an instance type in a launch template, in the console, or using other tools that use the Amazon EKS API, then the t3.medium instance type is used, by default. If your node group is using the Spot capacity type, then we recommend specifying multiple instance types using the console. For more information, see Managed node group capacity types.

FWIW I sympathize with those having to recreate MNGs, it's not fun and sometimes just isn't an option.

@philicious
Copy link
Contributor

Regardless, at the time of release of the changes to the provider, the capacity_type field was already causing a rebuild of the MNGs so the key name change seemed less of an issue.

the capacity_type introduction and resulting recreation of MNG might be unfortunate but acceptable. however that is not a breaking issue.

This is partially true, you can specify one instance type on either the LC or the MNG, but not both,
This change shouldn't affect your ability to use launch templates with MNGs.

exactly! but your change will always set the instance_type on the MNG. either from user-supplied vars or the eks modules default will be taken. thereby leading to a broken LT-support as API will always error

  - instance_types  = each.value["launch_template_id"] != null ? [] : [each.value["instance_type"]]
  + instance_types  = lookup(each.value, "instance_types", null)

I suspect this is because you are specifying an instance type in your launch template as well as on the MNGs.

no, the eks modules default is taken https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/local.tf#L42

ArchiFleKs pushed a commit to ArchiFleKs/terraform-aws-eks that referenced this pull request Apr 16, 2021
…ules#1129)

BREAKING CHANGES: To add add SPOT support for MNG, the `instance_type` is now a list and renamed as `instance_types`. This will probably rebuild existing Managed Node Groups.
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.