
feat: Enable throughput & iops configs for managed node_groups #1584

Merged

Conversation

Contributor

@junaid-ali junaid-ali commented Sep 13, 2021

PR o'clock

Description

Fixes: #1567

This PR adds support for setting disk `throughput` and `iops` when `create_launch_template` is set to `true` for managed node groups.

Example usage:

```hcl
module "eks" {
  # ...

  node_groups = {
    nodes_with_gp3_disks = {
      create_launch_template = true

      disk_size       = 50
      disk_type       = "gp3"
      disk_throughput = 150
      disk_iops       = 3000

      # ...
    }
  }
}
```
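For context, these inputs would feed the node group's launch template roughly like this (a sketch under assumed attribute mappings, not the module's exact code):

```hcl
# Illustrative mapping of the new inputs onto aws_launch_template (assumed shape,
# not this module's actual resource).
resource "aws_launch_template" "workers" {
  name_prefix = "workers-"

  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_size = 50    # disk_size
      volume_type = "gp3" # disk_type
      throughput  = 150   # disk_throughput; only valid for gp3 volumes
      iops        = 3000  # disk_iops; valid for gp3, io1, and io2 volumes
    }
  }
}
```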
Checklist

```diff
@@ -27,7 +27,9 @@ The role ARN specified in `var.default_iam_role_arn` will be used by default. In
 | disk\_encrypted | Whether the root disk will be encrypyted. Requires `create_launch_template` to be `true` and `disk_kms_key_id` to be set | bool | false |
 | disk\_kms\_key\_id | KMS Key used to encrypt the root disk. Requires both `create_launch_template` and `disk_encrypted` to be `true` | string | "" |
 | disk\_size | Workers' disk size | number | Provider default behavior |
-| disk\_type | Workers' disk type. Require `create_launch_template` to be `true`| number | `gp3` |
+| disk\_type | Workers' disk type. Require `create_launch_template` to be `true`| string | Provider default behavior |
```
Contributor Author

@junaid-ali junaid-ali Sep 13, 2021

Updated, since we don't seem to set `gp3` as the default value and the provider's default is `gp2`:
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/launch_template#volume_type
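A minimal sketch of that provider behavior (an illustrative resource, not this module's code): when `volume_type` is omitted, EC2 falls back to its default, `gp2`:

```hcl
# Sketch only: illustrates the provider/EC2 default, not the module's code.
resource "aws_launch_template" "example" {
  name_prefix = "example-"

  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_size = 50
      # volume_type omitted -> EC2 defaults to gp2, so gp3-only settings
      # such as throughput must not be set here.
    }
  }
}
```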

@junaid-ali
Contributor Author

@daroga0002 could you please have a look at this PR?

@junaid-ali
Contributor Author

junaid-ali commented Sep 13, 2021

Also, wondering if we should set a `node_groups_defaults` local variable like this one (which we use to compute `worker_group_defaults`):

```hcl
workers_group_defaults_defaults = {
  name                 = "count.index" # Name of the worker group. Literal count.index will never be used, but if name is not set, the count.index interpolation will be used.
  tags                 = []            # A list of maps defining extra tags to be applied to the worker group autoscaling group and volumes.
  ami_id               = ""            # AMI ID for the EKS Linux-based workers. If none is provided, Terraform will search for the latest version of their EKS-optimized worker AMI based on platform.
  ami_id_windows       = ""            # AMI ID for the EKS Windows-based workers. If none is provided, Terraform will search for the latest version of their EKS-optimized worker AMI based on platform.
  asg_desired_capacity = "1"           # Desired worker capacity in the autoscaling group. Changing its value will not affect the group's desired capacity, because the cluster-autoscaler manages scaling of the nodes: it adds nodes when pods are pending and removes them when no longer required, by modifying the desired_capacity of the autoscaling group. An issue exists, however, in which changing asg_min_size also modifies asg_desired_capacity.
  asg_max_size         = "3"           # Maximum worker capacity in the autoscaling group.
  asg_min_size         = "1"           # Minimum worker capacity in the autoscaling group. NOTE: A change in this parameter will affect asg_desired_capacity, e.g. changing it to 2 will change asg_desired_capacity to 2, but bringing it back to 1 will not revert asg_desired_capacity.
  asg_force_delete     = false         # Enable forced deletion for the autoscaling group.
  # ...
}
```
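A hypothetical `node_groups_defaults` map in the same style (names and values are illustrative, not merged code) could look like:

```hcl
# Hypothetical sketch only; keys and defaults are assumptions for illustration.
node_groups_defaults = {
  create_launch_template = false # Whether to create a dedicated launch template for the node group.
  disk_size              = 50    # Workers' root disk size, in GiB.
  disk_type              = "gp3" # Workers' root volume type.
  disk_throughput        = 150   # Throughput in MiB/s; only applies to gp3 volumes.
  disk_iops              = 3000  # Provisioned IOPS; applies to gp3, io1, and io2 volumes.
}
```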

And setting default values in the root `locals.tf` for all the vars that are expected by the `node_groups` module:

```hcl
node_groups_expanded = { for k, v in var.node_groups : k => merge(
  {
    desired_capacity                     = var.workers_group_defaults["asg_desired_capacity"]
    iam_role_arn                         = var.default_iam_role_arn
    instance_types                       = [var.workers_group_defaults["instance_type"]]
    key_name                             = var.workers_group_defaults["key_name"]
    launch_template_id                   = var.workers_group_defaults["launch_template_id"]
    launch_template_version              = var.workers_group_defaults["launch_template_version"]
    set_instance_types_on_lt             = false
    max_capacity                         = var.workers_group_defaults["asg_max_size"]
    min_capacity                         = var.workers_group_defaults["asg_min_size"]
    subnets                              = var.workers_group_defaults["subnets"]
    create_launch_template               = false
    kubelet_extra_args                   = var.workers_group_defaults["kubelet_extra_args"]
    disk_size                            = var.workers_group_defaults["root_volume_size"]
    disk_type                            = var.workers_group_defaults["root_volume_type"]
    disk_encrypted                       = var.workers_group_defaults["root_encrypted"]
    disk_kms_key_id                      = var.workers_group_defaults["root_kms_key_id"]
    enable_monitoring                    = var.workers_group_defaults["enable_monitoring"]
    eni_delete                           = var.workers_group_defaults["eni_delete"]
    public_ip                            = var.workers_group_defaults["public_ip"]
    pre_userdata                         = var.workers_group_defaults["pre_userdata"]
    additional_security_group_ids        = var.workers_group_defaults["additional_security_group_ids"]
    taints                               = []
    timeouts                             = var.workers_group_defaults["timeouts"]
    update_default_version               = true
    ebs_optimized                        = null
    metadata_http_endpoint               = var.workers_group_defaults["metadata_http_endpoint"]
    metadata_http_tokens                 = var.workers_group_defaults["metadata_http_tokens"]
    metadata_http_put_response_hop_limit = var.workers_group_defaults["metadata_http_put_response_hop_limit"]
    # ...
  },
  v,
) }
```
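The `merge()` pattern here means per-group keys override the defaults; shown in isolation (local names are illustrative only):

```hcl
# Standalone illustration of the merge() precedence used above.
locals {
  defaults = {
    disk_size = 50
    disk_type = "gp3"
  }

  node_groups = {
    big = { disk_size = 100 }
  }

  # "big" overrides disk_size (100) and inherits disk_type ("gp3"),
  # because later merge() arguments take precedence.
  node_groups_expanded = { for k, v in local.node_groups : k => merge(local.defaults, v) }
}
```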

@junaid-ali
Contributor Author

@daroga0002 @antonbabenko PTAL

@antonbabenko
Member

Hi @junaid-ali !

This PR is ok, but it touches some black-belt magic I still don't feel comfortable merging right away.

Let me do some other related updates (examples, docs, etc.) at the beginning of next week and then merge this one. Thank you for your understanding.

@junaid-ali
Contributor Author

@antonbabenko are you able to have a look at this again?

Contributor

@daroga0002 daroga0002 left a comment


@junaid-ali I have tested it and everything is working fine 🚀, thank you for the contribution 🥇

@antonbabenko let's merge this

modules/node_groups/locals.tf (review thread resolved)
@antonbabenko antonbabenko merged commit b177806 into terraform-aws-modules:master Oct 7, 2021
lisfo4ka pushed a commit to lisfo4ka/terraform-aws-eks that referenced this pull request Oct 12, 2021
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 12, 2022
Successfully merging this pull request may close these issues.

throughput & iops options for managed node_groups when create_launch_template=true