Provider produced inconsistent result after apply ECS Fargate task #20452
Comments
Experiencing this with the latest AWS module. This is the service in question:

    resource "aws_ecs_service" "site" {
      name                              = "site${var.branch}"
      cluster                           = data.aws_ecs_cluster.main.id
      task_definition                   = "${aws_ecs_task_definition.site.id}:${aws_ecs_task_definition.site.revision}"
      desired_count                     = 1
      health_check_grace_period_seconds = 60
      enable_execute_command            = true
      platform_version                  = "1.4.0"

      capacity_provider_strategy {
        capacity_provider = "FARGATE_SPOT"
        weight            = 100
        base              = 1
      }

      network_configuration {
        security_groups  = [data.aws_security_group.essential.id]
        subnets          = data.aws_subnet.main_subnet.*.id
        assign_public_ip = true
      }

      load_balancer {
        target_group_arn = aws_alb_target_group.site.arn
        container_name   = "site${var.branch}"
        container_port   = 80
      }
    }
|
Dug through our CI logs and found this, which triggered the resource to be created but not registered in the state. Deleting the service manually fixes it.
|
Same issue during an apply of a Terraform config that only specifies an ECS service; the apply was creating it. It should have been on provider version 3.63.0, but I'm not sure, as this was called from inside an ECS container (yeah, don't ask ;-)) which got re-rolled in the meantime. |
We've just hit this on a non-Fargate ECS service.
Removing the service and re-running terraform fixed it. The change that was introduced was |
Some more context on this: I've enabled TF_LOG and found that the problem service is returning a status of INACTIVE, so Terraform sees this as an error and taints it, which causes the error. I'm guessing here, but either 1. the service is failing to create properly, or 2. Terraform isn't waiting for the service to be destroyed before creating it (in our case we taint the service before running apply). Here's the excerpt of the log:
|
We are seeing this error on a regular basis. We create approximately 150 ECS services each morning across a number of test environments and destroy them all in the evening. Approximately once a week a random ECS service gets stuck in this state where TF has created it but we get the |
I've made a PR, #23747, which has a partial workaround/fix when using wait_for_steady_state. We've been running this for a few weeks and have not had a single failure. Without wait_for_steady_state it still fails. It would be better if someone more familiar with the module dug deeper into this for a complete fix, but for now this is a workaround. |
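For readers who want to try the workaround described above, here is a minimal sketch of enabling wait_for_steady_state on the service from the first comment. The resource and variable names are copied from that comment, and the referenced task definition, cluster, and network resources are assumed to exist elsewhere in the configuration. This is an illustration of the setting, not the contents of PR #23747.

```hcl
# Sketch only: names are taken from the configuration posted earlier in this thread.
resource "aws_ecs_service" "site" {
  name            = "site${var.branch}"
  cluster         = data.aws_ecs_cluster.main.id
  task_definition = "${aws_ecs_task_definition.site.id}:${aws_ecs_task_definition.site.revision}"
  desired_count   = 1

  # Make Terraform wait until the service reaches a steady state
  # before treating the create or update as finished.
  wait_for_steady_state = true

  network_configuration {
    security_groups  = [data.aws_security_group.essential.id]
    subnets          = data.aws_subnet.main_subnet.*.id
    assign_public_ip = true
  }
}
```

Note that with this setting an apply takes longer, because Terraform blocks until the service has stabilized, and per the comment above the failure still occurs when the setting is left off.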
We are also seeing this. We just upgraded to AWS provider 4.5 and never had the issue before on 3.6.x. It only has an issue with the ECS Fargate service that we are trying to create. I have tried removing it from AWS directly and then running it again, and I keep getting the same error. I am able to remove the new ECS service and the errors go away. We are also seeing the same issue described here: Implement d.IsNewResource() Checks In Resource Read Functions #16796 |
When using `wait_for_steady_state`, retry up to 3 times. This is because when a service is replaced, it may return a status of INACTIVE: the AWS API returns the status of the old service instead of the new one, which hasn't fully registered yet.
@anGie44 I'm sorry to say that #24223 didn't work. I just tried this for the first time and it did this:
|
@rwky 😞 We can move the retry back within the resource logic then since you noted above that fix had proven to work for you for some time. |
If you can make another PR I can pull that branch, build it locally and try it out. Saves us having to wait for an official release. |
Got it, it's building now. I'll let you know how it goes. |
OK, I've done 9 runs and all passed. I'd say that's promising. It's not a guarantee that it's fixed, since it's possible to run it 9 times and have it not fail, but it's a good sign. It's probably safe to merge this. |
Awesome, thanks so much for confirming and testing on your end! I agree, the result sounds more promising than the error behavior you encountered previously. I'll open up the PR for review 👍 |
@anGie44 we've been running the fixed version for over 2 weeks without issue, just wanted to say thanks and that I owe you a beer/coffee/your beverage of choice! For anyone else encountering this issue, update to the latest version of this module and enable |
Thanks for the feedback @rwky ! You played a big role in that fix so thanks again! I'm going to close this issue as it's been pretty quiet for a while, though if it resurfaces, reach out to the team and we can reopen as needed. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Expected Behavior
Image tags updated on the ECS Fargate tasks of 13 services.
Actual Behavior
After updating 10 services, on the next one we got the error "Provider produced inconsistent result after apply". Weirdly enough, this same code had been run on different clusters at the same time, on different pipelines, multiple times before without any errors.
This caused a weirder error when running the pipeline a second time: "creation: InvalidParameterException: Creation of service was not idempotent." So we commented out that service and re-ran the pipeline. It applied, but the service was still in AWS, as INACTIVE. We had to manually destroy it and uncomment it, and then the apply and the service went through.
Steps to Reproduce
Don't have a set of steps; it was random and hasn't come up since.