
Updating ECS service capacity provider strategy replaces resource #11351

Closed
kollektiv opened this issue Dec 18, 2019 · 16 comments · Fixed by #20707
Labels
bug Addresses a defect in current functionality. service/ecs Issues and PRs that pertain to the ecs service.

@kollektiv

kollektiv commented Dec 18, 2019

When attempting to update an ecs_service.capacity_provider_strategy, the resource is replaced. It should be possible to update the capacity provider strategy without replacing the ECS service, since the UpdateService API supports it: https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_UpdateService.html

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Relates #11409

Terraform Version

$ terraform -v
Terraform v0.12.18
+ provider.aws v2.42.0

Affected Resource(s)

  • aws_ecs_service

Expected Behavior

ECS service updates without replacing the resource

Actual Behavior

ECS service is recreated

Steps to Reproduce

  1. Create an aws_ecs_service resource with a capacity_provider_strategy
  2. terraform apply
  3. Change capacity_provider_strategy
  4. terraform apply
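
A minimal configuration that reproduces the steps above (resource names and attribute values are illustrative placeholders, not from the original report):

```hcl
resource "aws_ecs_service" "example" {
  name            = "example"
  cluster         = aws_ecs_cluster.example.id
  task_definition = aws_ecs_task_definition.example.arn
  desired_count   = 1

  # Step 1: service created with an initial strategy.
  capacity_provider_strategy {
    capacity_provider = "FARGATE"
    weight            = 1
  }
}

# Step 3: changing the block, e.g. to
#
#   capacity_provider_strategy {
#     capacity_provider = "FARGATE_SPOT"
#     weight            = 1
#   }
#
# and re-running `terraform apply` plans a destroy/create
# ("# forces replacement") instead of an in-place update.
```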

References

https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_UpdateService.html

@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Dec 18, 2019
@ghost ghost added the service/ecs Issues and PRs that pertain to the ecs service. label Dec 18, 2019
@danieladams456
Contributor

Yes, it should force a new deployment, but it shouldn't have to destroy and re-create the service:

aws ecs update-service --cluster democluster --service demoservice --capacity-provider-strategy capacityProvider=FARGATE,weight=0,base=2 capacityProvider=FARGATE_SPOT,weight=100 --force-new-deployment

@piotrb

piotrb commented Feb 18, 2020

@danieladams456 this works well enough if you're using FARGATE, but there doesn't seem to be a pre-defined name for the "default" EC2 provider, so you can't reset the service to regular EC2 mode without recreating it. This might be a limitation of AWS right now; I might be wrong here :)

@danieladams456
Contributor

@piotrb it definitely works (just a POC phase for now), but even on Fargate it destroys and recreates the service. That causes downtime, as opposed to just creating another deployment.

Terraform will perform the following actions:

  # aws_ecs_service.service must be replaced
-/+ resource "aws_ecs_service" "service" {
        cluster                            = "demo-service"
        deployment_maximum_percent         = 200
        deployment_minimum_healthy_percent = 100
        desired_count                      = 1
        enable_ecs_managed_tags            = true
        health_check_grace_period_seconds  = 60
      ~ iam_role                           = "aws-service-role" -> (known after apply)
      ~ id                                 = "arn:aws:ecs:us-east-1:XXXXXXXXXX:service/demo-service/demo-service" -> (known after apply)
      + launch_type                        = (known after apply)
        name                               = "demo-service"
        platform_version                   = "LATEST"
        propagate_tags                     = "SERVICE"
        scheduling_strategy                = "REPLICA"
      - tags                               = {} -> null
        task_definition                    = "arn:aws:ecs:us-east-1:XXXXXXXXXX:task-definition/demo-service:10"

      - capacity_provider_strategy { # forces replacement
          - base              = 0 -> null
          - capacity_provider = "FARGATE_SPOT" -> null
          - weight            = 100 -> null
        }
      - capacity_provider_strategy { # forces replacement
          - base              = 2 -> null
          - capacity_provider = "FARGATE" -> null
          - weight            = 0 -> null
        }
      + capacity_provider_strategy { # forces replacement
          + base              = 2
          + capacity_provider = "FARGATE"
          + weight            = 50
        }
      + capacity_provider_strategy { # forces replacement
          + capacity_provider = "FARGATE_SPOT"
          + weight            = 50
        }

      - deployment_controller {
          - type = "ECS" -> null
        }

        load_balancer {
            container_name   = "demo-service"
            container_port   = 8080
            target_group_arn = "arn:aws:elasticloadbalancing:us-east-1:XXXXXXXXXX:targetgroup/demo-service/c44b99f42280a4a7"
        }

        network_configuration {
            assign_public_ip = false
            security_groups  = [
                "sg-09d58bb97552d3d20",
            ]
            subnets          = [
                "subnet-52c7fd37",
                "subnet-bf891af7",
                "subnet-c7c61b9d",
            ]
        }

      + placement_strategy {
          + field = (known after apply)
          + type  = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

@piotrb

piotrb commented Feb 18, 2020

Yep for sure, just saying it doesn't work in all cases, so the fix isn't to just remove the flag from the field .. it has to be a bit more than that

@dinvlad

dinvlad commented May 12, 2020

We also observe flip-flopping of capacity_provider_strategy on every deploy, without any changes to capacity_provider_strategy. That is, without any changes to our template, every deploy shows:

      - capacity_provider_strategy { # forces replacement
          - base              = 0 -> null
          - capacity_provider = "FARGATE_SPOT" -> null
          - weight            = 1 -> null
        }

EDIT: More specifically, the flip-flopping happens when we set default_capacity_provider_strategy for aws_ecs_cluster, and then don't set it for aws_ecs_service. The workaround is to remove default_capacity_provider_strategy from the cluster and add it to the service.
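
The workaround above, sketched with placeholder names (this is a hedged illustration of the described fix, not a config from the thread): remove the default strategy from the cluster and declare the strategy explicitly on the service.

```hcl
# Before (triggers the perpetual diff): the strategy is only a
# cluster default, and the service declares none of its own.
#
#   resource "aws_ecs_cluster" "this" {
#     name = "demo"
#     default_capacity_provider_strategy { ... }  # <- remove this
#   }

# After: no default on the cluster, explicit strategy on the service.
resource "aws_ecs_cluster" "this" {
  name = "demo"
}

resource "aws_ecs_service" "this" {
  name            = "demo-service"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = 1

  capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 1
  }
}
```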

@threeguys

Just dog-piling here: I can confirm I had the same issue. Using default_capacity_provider_strategy inside an aws_ecs_cluster block caused my service to be replaced on every run (with no changes). I had the same output:

      - capacity_provider_strategy { # forces replacement
          - base              = 0 -> null
          - capacity_provider = "my-awesome-cap-provider" -> null
          - weight            = 1 -> null
        }

I had no capacity provider set for the service and used the workaround mentioned by @dinvlad. Removing default_capacity_provider_strategy from aws_ecs_cluster and adding capacity_provider_strategy to my aws_ecs_service block made it work properly and stop trying to recreate the service every time.

I'm super new to Terraform, so it could be something about my configuration. I'll be happy to post more details if needed.

@bflad bflad added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Jul 7, 2020
@alex-bes

alex-bes commented Jul 9, 2020

Terraform v0.12.24

  • provider.aws v2.68.0

We also observe flip-flopping of capacity_provider_strategy on every deploy, without any changes to capacity_provider_strategy. I.e. without any changes to our template:

This seems to happen for services created after default_capacity_provider_strategy has been specified for the cluster. For example, if you create the cluster with no default_capacity_provider_strategy, create the service, and only then go back and update the cluster to include default_capacity_provider_strategy, subsequent plans do not show a perpetual diff for the service. At least, this is how I worked around the issue.

Hope this helps.

@vladimirtiukhtin

I've managed to work around this using lifecycle:

  lifecycle {
    ignore_changes = [
      capacity_provider_strategy
    ]
  }

emileswarts added a commit to ministryofjustice/staff-device-dns-dhcp-infrastructure that referenced this issue Sep 3, 2020
If it is defined on the cluster, it will cause drift and the service
will be recreated each time:
hashicorp/terraform-provider-aws#11351
@m13t

m13t commented Nov 20, 2020

This is causing us some issues, given that we can't move from one capacity provider to another (as we recycle ASGs) without Terraform reporting that it's going to destroy the service. Replacement isn't required when using the API, unless the service started life without a capacity provider.

To me, this seems like a fundamental breaking issue if one cannot recycle the underlying capacity provider without some kind of service outage.

@m13t

m13t commented Nov 20, 2020

Looks like the resource needs a CustomizeDiff function added to support the dynamic nature of whether a service can be updated in place or requires replacement, depending on whether the service already has a capacity provider or not.

@bogdanbrindusan

@m13t just facing the very same issue. Any chance we can get this merged? 🙏

@yyoda

yyoda commented Jun 29, 2021

#16402 seems to have been closed due to conflicts with #16942, but I don't think there are any conflicts.

  • 16402: The purpose is to fix a bug in aws_ecs_service.
  • 16942: The purpose is to enhance aws_ecs_capacity_provider.

I would like #16402 to be reopened if possible.
I tried v3.47.0 which includes the fix for #16942, but it did not work as expected.

@richardgavel

I'm trying to decide if I could bypass this issue by using ignore_changes on the capacity_provider_strategy and a null_resource with an AWS CLI call. It might result in a double deployment (once for the service Terraform resource and once for the AWS CLI call), but it might still work.
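
A sketch of that idea (untested; resource names, the cluster/service references, and the trigger value are all hypothetical), combining ignore_changes with a null_resource that shells out to the same update-service command shown earlier in the thread:

```hcl
resource "aws_ecs_service" "this" {
  # ... usual service arguments ...

  # Let Terraform stop caring about the strategy entirely.
  lifecycle {
    ignore_changes = [capacity_provider_strategy]
  }
}

# Re-run the CLI update whenever the desired strategy string changes.
resource "null_resource" "capacity_provider" {
  triggers = {
    strategy = "capacityProvider=FARGATE_SPOT,weight=100"
  }

  provisioner "local-exec" {
    command = <<EOT
aws ecs update-service \
  --cluster ${aws_ecs_service.this.cluster} \
  --service ${aws_ecs_service.this.name} \
  --capacity-provider-strategy ${self.triggers.strategy} \
  --force-new-deployment
EOT
  }
}
```

Note the double deployment mentioned above: a change to the service resource rolls one deployment, and the provisioner's --force-new-deployment rolls another.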

@dinvlad

dinvlad commented Nov 25, 2021

Many thanks, @YakDriver !

@github-actions

github-actions bot commented Dec 1, 2021

This functionality has been released in v3.67.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 27, 2022