
Resource does not have attribute 'id' for variable #6991

Closed
carlossg opened this issue Jun 2, 2016 · 14 comments
Comments

@carlossg
Contributor

carlossg commented Jun 2, 2016

We are getting this error from time to time; it appears to be a race condition that shows up when AWS is slower than usual:

19:29:48 [security-groups] aws_security_group.marker: Creating...
19:29:48 [security-groups] description:                "" => "Tiger security group"
19:29:48 [security-groups] egress.#:                   "" => "<computed>"
19:29:48 [security-groups] ingress.#:                  "" => "<computed>"
19:29:48 [security-groups] name:                       "" => "pse-integration-marker"
19:29:48 [security-groups] owner_id:                   "" => "<computed>"
19:29:48 [security-groups] tags.#:                     "" => "3"
19:29:48 [security-groups] tags.Name:                  "" => "pse-integration-marker"
19:29:48 [security-groups] tags.cloudbees:pse:cluster: "" => "pse-integration"
19:29:48 [security-groups] tags.tiger:cluster:         "" => "pse-integration"
19:29:48 [security-groups] vpc_id:                     "" => "vpc-9a974bfd"
19:29:49 [security-groups] aws_security_group.marker: Creation complete
19:29:49 [security-groups] Error applying plan:
19:29:49 [security-groups] 
19:29:49 [security-groups] 1 error(s) occurred:
19:29:49 [security-groups] 
19:29:49 [security-groups] * Resource 'aws_security_group.marker' does not have attribute 'id' for variable 'aws_security_group.marker.id'

The terraform.tfstate file appears corrupt, with no record of the security group that was just created:

{
    "version": 1,
    "serial": 0,
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {}
        }
    ]
}

Terraform Version

0.6.15

Affected Resource(s)

  • aws_security_group

Terraform Configuration Files

resource "aws_security_group" "marker" {
    name = "pse-integration-marker"
    description = "Tiger security group"
    tags = {
        Name = "pse-integration-marker"
        "tiger:cluster" = "pse-integration"
        "cloudbees:pse:cluster" = "pse-integration"
    }
    vpc_id = "vpc-9a974bfd"
}
output "marker_security_group" {
    value = "${aws_security_group.marker.id}"
}

@carlossg
Contributor Author

The terraform.tfstate and terraform.tfstate.backup that match that log are as follows (I have the rest of the files as well):

{
    "version": 1,
    "serial": 2,
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {
                "controller_security_group": "",
                "elb_security_group": "",
                "elbi_security_group": "",
                "marker_security_group": "",
                "worker_security_group": ""
            },
            "resources": {}
        }
    ]
}

@jasonf20

jasonf20 commented Aug 31, 2016

I think we are having the same issue, except we're on 0.7.2, so it seems to be unresolved still.

@rwc
Contributor

rwc commented Oct 5, 2016

Confirming this is still an issue in 0.7.4 as well.

@peculater

And still in 0.7.7.

@apparentlymart
Contributor

Thanks for that debug output, @carlossg. Here's what I think is the most relevant subset of it:

2016/06/15 14:44:41 [DEBUG] apply: aws_security_group.elb: executing Apply
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Security Group create configuration: {
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   Description: "PSE security group",
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   GroupName: "pse-integration-elb",
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   VpcId: "vpc-cb07aaac"
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: }
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Security Group create configuration: {
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   Description: "Tiger security group",
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   GroupName: "pse-integration-marker",
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws:   VpcId: "vpc-cb07aaac"
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: }
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [INFO] Security Group ID: sg-f2caad89
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Waiting for Security Group (sg-f2caad89) to exist
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Waiting for state to become: [exists]
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [TRACE] Waiting 100ms before next try
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [INFO] Security Group ID: sg-f1caad8a
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Waiting for Security Group (sg-f1caad8a) to exist
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [DEBUG] Waiting for state to become: [exists]
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [TRACE] Waiting 100ms before next try
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [TRACE] Waiting 200ms before next try
2016/06/15 14:44:41 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:41 [TRACE] Waiting 200ms before next try
2016/06/15 14:44:42 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:42 [TRACE] Waiting 400ms before next try
2016/06/15 14:44:42 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:42 [DEBUG] Revoking default egress rule for Security Group for sg-f1caad8a
2016/06/15 14:44:42 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:42 [DEBUG] Revoking default egress rule for Security Group for sg-f2caad89
2016/06/15 14:44:42 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:42 [DEBUG] Waiting for state to become: [success]
2016/06/15 14:44:42 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:42 [TRACE] Waiting 500ms before next try
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [DEBUG] Creating tags: [{
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Key: "cloudbees:pse:cluster",
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Value: "pse-integration"
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: } {
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Key: "tiger:cluster",
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Value: "pse-integration"
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: } {
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Key: "Name",
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Value: "pse-integration-marker"
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: }] for sg-f2caad89
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [DEBUG] Found a remote Rule that wasn't empty: (map[string]interface {}{"from_port":0, "to_port":0, "protocol":"-1", "cidr_blocks":[]string{"0.0.0.0/0"}})
aws_security_group.marker: Creation complete
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [DEBUG] Security Group create configuration: {
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   Description: "Tiger security group",
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   GroupName: "pse-integration-worker",
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws:   VpcId: "vpc-cb07aaac"
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: }
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [INFO] Security Group ID: sg-ebcaad90
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [DEBUG] Waiting for Security Group (sg-ebcaad90) to exist
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [DEBUG] Waiting for state to become: [exists]
2016/06/15 14:44:43 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:43 [TRACE] Waiting 100ms before next try
2016/06/15 14:44:44 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:44 [TRACE] Waiting 200ms before next try
2016/06/15 14:44:44 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:44 [TRACE] Waiting 400ms before next try
2016/06/15 14:44:44 [DEBUG] terraform-provider-aws: 2016/06/15 14:44:44 [DEBUG] Revoking default egress rule for Security Group for sg-ebcaad90

@apparentlymart
Contributor

Since Terraform doesn't always include the resource ID in the log output, it's hard to be sure which log lines belong to the processing of which security group here, but I noticed a few things that seem like plausible leads:

  • There are three different security groups created here, but only one of them (sg-f2caad89) got far enough to set its tags.
  • We do see the "Revoking default egress rule" log for all three of them, and it's not followed by "Error revoking default egress rule" so it seems like this succeeds.
  • Therefore I assume the problem is somewhere inside resourceAwsSecurityGroupUpdate, which gets called at the end of resourceAwsSecurityGroupCreate to complete the creation of the secondary objects in AWS, including the tags.
  • There is a call to d.SetId("") within the suspect codepath. Setting the id to empty would cause the described symptom of a resource being dropped from the state. No specific logging is generated in that codepath, so it's plausible but not proven that we're exiting without error there.
  • We'd end up taking that path if SGStateRefreshFunc were to get either a NotFound error from the AWS API or a nil return value from DescribeSecurityGroups. The latter seems unlikely, so I'm going to assume that for some reason we're getting that NotFound error.

So with all of this said, eventual consistency issues on the AWS end do seem to be a likely cause here; in an earlier step we verified that the security group had indeed been created, but perhaps it takes a while before the API will consistently report its creation.

Assuming all of this is the correct explanation (which I wasn't able to verify, due to not being able to repro ☹️), it feels to me like the best fix here would be for the Update function to treat a missing security group as an error rather than implicitly dropping the object from the state. This would not entirely fix the problem without also adding in some retry behavior, but it would at least stop the Update function from overstepping its bounds here (it's doing a task here that is normally reserved for the Read function) and cause Terraform to not lose track of the existing security group.

@apparentlymart
Contributor

Over in #9719 I made some changes to make Terraform fail in a different way when this situation arises: rather than quietly dropping the resource from the state, it will instead halt with an error and write the partial resource to the state, at least allowing the operation to be retried in a subsequent run of Terraform.

I also added some logging for the case where we find during Read that the security group doesn't exist.

Neither of these things are going to actually address the problem described here, but they will hopefully confirm the theory that the EC2 API is giving us inconsistent results and we can then figure out the right way to be more resilient to that inconsistency.

@josh-padnick

Just want to add another data point here. I was mysteriously getting the following error consistently (i.e. it was not an AWS eventual-consistency issue):

Resource 'aws_iam_role.ecs_service_autoscaling_role' does not have attribute 'id' for variable 'aws_iam_role.ecs_service_autoscaling_role.id'

I finally discovered that the real issue was that aws_iam_role.ecs_service_autoscaling_role wasn't actually getting created. In fact, Terraform was failing with this error:

* aws_iam_role.ecs_service_autoscaling_role: "name" cannot be longer than 64 characters

But because the execution kept running, the error message I saw wasn't helpful.
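This kind of secondary failure is easier to spot if the length constraint is checked up front. Here is a minimal Go sketch of the 64-character check; the helper name and the sample role name are made up for illustration, while the limit itself comes from the IAM error quoted above:

```go
package main

import "fmt"

// maxIAMRoleNameLen mirrors the 64-character limit from the IAM error above.
const maxIAMRoleNameLen = 64

// checkRoleName reports an error for names the IAM API would reject,
// so the problem surfaces before any dependent resource fails.
func checkRoleName(name string) error {
	if len(name) > maxIAMRoleNameLen {
		return fmt.Errorf("%q cannot be longer than %d characters (got %d)",
			name, maxIAMRoleNameLen, len(name))
	}
	return nil
}

func main() {
	// Hypothetical generated name, similar in shape to the one in the report.
	name := "pse-integration-ecs-service-autoscaling-role"
	fmt.Println(checkRoleName(name)) // prints: <nil>
}
```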

carlossg added a commit to cloudbees/terraform that referenced this issue Jan 23, 2017
…stency issues

It appears, based on the report in hashicorp#6991, that the EC2 API is being
inconsistent in reporting that a security group exists shortly after it
has been created; we've seen Terraform get past the "Waiting for
Security Group to exist" step but then apparently detect that it's gone
again once we get into the Update function.
@dendrochronology

I just saw this as well. One of my route tables wasn't created, so the dependent resources errored out. Running terraform apply again fixed it. The output during the first apply looked fine; both route tables said Creating..., with the values I'd normally expect.

jammerful added a commit to jammerful/terraform-provider-aws that referenced this issue Mar 22, 2018
Make security groups more resilient to eventual consistency errors
by adding retries on the existence function in read and update.

See: hashicorp/terraform#6991
@apparentlymart added config and removed core labels Nov 7, 2018
@kenorb

kenorb commented Mar 19, 2019

I've got a similar error for aws_rds_cluster on destroy with Terraform v0.11.11, e.g.

Releasing state lock. This may take a few moments...

Error: Error applying plan:

1 error(s) occurred:

* local.environment_json: local.environment_json: Resource 'aws_rds_cluster.myproject_database' does not have attribute 'database_name' for variable 'aws_rds_cluster.myproject_database.database_name'

even though my resource does have database_name. In another run, it complains about master_password instead:

* local.environment_json: local.environment_json: Resource 'aws_rds_cluster.myproject_database' does not have attribute 'master_password' for variable 'aws_rds_cluster.myproject_database.master_password'

Somebody else ran into a similar issue: https://docs.cloudposse.com/troubleshooting/error-applying-terraform-plan/

@etwillbefine

I got the same issue with an aws_rds_instance. The problem was that I passed in an AWS KMS alias rather than a valid ARN. It seems the provider validates this, and the error is not caught correctly, or something similar.

I found out about the alias-vs-ARN issue by running TF_LOG=debug terraform plan.

@teamterraform
Contributor

Hi all,

We had a few different root causes leading to errors like this in Terraform 0.11 and earlier. Eventual consistency was one such problem, but the general concern was that in earlier versions Terraform would not perform thorough checks on the consistency of what is returned by a provider, and thus a provider behaving oddly would usually lead to a confusing downstream error with insufficient context.

Terraform 0.12 includes some fixes for known issues in this area, and it also includes improved safety checks so that provider inconsistencies can be caught earlier and reported with more context. The specific codepath that generated the errors discussed in this issue doesn't exist anymore in Terraform 0.12, so we're going to close this one out under the assumption that all of the reports here were caused by issues that we found and fixed in the Terraform 0.12 cycle.

If you are using Terraform 0.12 and are still running into weird errors that feel similar to those here (although the exact text will be different, due to the rewrite of this portion), please do open a new issue for it so we can capture some updated reproduction information against the new codepaths. Thanks for reporting this!

@ghost

ghost commented Aug 19, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Aug 19, 2019