Data resource in a module is not delayed #23242

Closed
josh-barker opened this issue Oct 31, 2019 · 6 comments

@josh-barker

Terraform Version

terraform v0.12.9 and v0.12.12

provider.azurerm v1.36.1

Terraform Configuration Files

Part 1

module "rg_1" {
  source = "terraform-azurerm-resource-group"

  name     = "rg-1"
  location = "australiasoutheast"
}

Part 2

  1. Update the code to add a second resource group and an RSV (Recovery Services vault).
  2. On the second plan or apply, the data source in the rsv module raises an error saying that the resource group cannot be found.

# existing code
module "rg_1" {
  source = "terraform-azurerm-resource-group"

  name     = "rg-1"
  location = "australiasoutheast"
}

# new code
module "rg_2" {
  source = "terraform-azurerm-resource-group"

  name     = "rg-2"
  location = "australiasoutheast"
}

# Note: This module uses data to retrieve the location for the resource group
module "rsv" {
  source = "terraform-azurerm-rsv"
  resource_group_name = module.rg_2.resource_group_name
}

Note: The modules I'm using are internal and not publicly available.

Expected Behavior

I expect that the plan/apply will succeed and create the resource group and RSV.

Actual Behavior

terraform apply
module.rg_1.azurerm_resource_group.rg[0]: Refreshing state... [id=/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-1]
module.rsv.data.azurerm_resource_group.this: Refreshing state...

Error: Error: Resource Group "rg-2" was not found

  on .terraform/modules/rsv/main.tf line 6, in data "azurerm_resource_group" "this":
   6: data "azurerm_resource_group" "this" {

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. Add part 2 of code
  4. terraform apply
    => Error

Another scenario - All code written for first deploy

However, if I have all the code with no state stored yet, it behaves as I expect.

terraform plan
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

Terraform will perform the following actions:

  # module.rg_2.azurerm_resource_group.rg[0] will be created
  + resource "azurerm_resource_group" "rg" {
      + id       = (known after apply)
      + location = "australiasoutheast"
      + name     = "rg-2"
      + tags     = (known after apply)
    }

  # module.rsv.data.azurerm_resource_group.this will be read during apply
  # (config refers to values not yet known)
 <= data "azurerm_resource_group" "this"  {
      + id       = (known after apply)
      + location = (known after apply)
      + name     = "rg-2"
      + tags     = (known after apply)

      + timeouts {
          + read = (known after apply)
        }
    }

  # module.rsv.azurerm_recovery_services_vault.this[0] will be created
  + resource "azurerm_recovery_services_vault" "this" {
      + id                  = (known after apply)
      + location            = (known after apply)
      + name                = (known after apply)
      + resource_group_name = "rg-2"
      + sku                 = "Standard"
      + tags                = (known after apply)
    }

  # module.rg_1.azurerm_resource_group.rg[0] will be created
  + resource "azurerm_resource_group" "rg" {
      + id       = (known after apply)
      + location = "australiasoutheast"
      + name     = "rg-1"
      + tags     = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

@teamterraform
Contributor

Hi @josh-barker! Thanks for reporting this.

From looking through your configuration and the output you've shared, it looks like this is happening because module.rg_2.resource_group_name is already known during planning and therefore as far as Terraform is concerned the data resource is ready to read. Terraform only delays reading a data resource if its configuration includes a value that can't be determined until the apply step.
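To make the distinction concrete, here is a minimal sketch of what the data resource inside the terraform-azurerm-rsv module might look like. The module is internal, so the variable declaration and the body of the data block are assumptions inferred from the error message and plan output in the report, not its actual contents.

```hcl
# Hypothetical excerpt of terraform-azurerm-rsv/main.tf. The real module is
# not public, so this shape is assumed for illustration only.
variable "resource_group_name" {
  type = string
}

# The caller passes module.rg_2.resource_group_name. Per the explanation
# above, that value is already known ("rg-2") during planning, so nothing in
# this block is unknown and Terraform reads it right away, before the
# rg-2 resource group has been created.
data "azurerm_resource_group" "this" {
  name = var.resource_group_name
}
```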

The pattern of reading something using a data resource in the same configuration where it's created does tend to lead to this sort of problem, and so we'd recommend avoiding that where possible. In this case, it looks like it could be avoided by passing the resource group as a whole between modules, rather than having one module write it to the API and another module then read it back. The required data is already in Terraform, so we can pass it around in a module composition style rather than having the terraform-azurerm-rsv module go fetch the object itself:

module "rg_2" {
  source = "./terraform-azurerm-resource-group"

  name     = "rg-2"
  location = "australiasoutheast"
}

module "rsv" {
  source = "./terraform-azurerm-rsv"

  resource_group = module.rg_2.resource_group
}

Inside the terraform-azurerm-rsv module you can declare the resource_group variable like this to make it expect an object value:

variable "resource_group" {
  type = object({
    # Only need to include the subset of attributes that the module
    # actually relies on. An object type constraint accepts any object
    # that has _at least_ the given attributes, ignoring any extras.
    name     = string
    location = string
  })
}

You can then, for example, access the resource group location in the module as var.resource_group.location without needing to make a request to the remote API. Because Terraform data sources conventionally have the same or similar attributes as the corresponding managed resource type, you can often write the variable type constraint as the common subset of both and thus allow the calling module to either pass in an object it is directly managing or consume the object from elsewhere, depending on what makes sense in that context:

# If we put the data resource in the caller rather than in the child module
# then the module doesn't need to be aware of whether this is an object
# managed in this configuration or whether it's managed elsewhere.
data "azurerm_resource_group" "example" {
  name = "rg-2"
}

module "rsv" {
  source = "./terraform-azurerm-rsv"

  # The data resource's object type also matches the variable
  # type constraint, so we can pass it directly here.
  resource_group = data.azurerm_resource_group.example
}
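For completeness, the two module-side pieces this approach relies on could be sketched as follows. The file names, the output name, and the vault name are assumptions, since the modules are not public. The resource addresses and attribute names are taken from the plan output in the original report.

```hcl
# terraform-azurerm-resource-group/outputs.tf (assumed file name)
# Export the whole resource object rather than only its name. The [0] index
# assumes the module uses count, as the address
# module.rg_2.azurerm_resource_group.rg[0] in the plan output suggests.
output "resource_group" {
  value = azurerm_resource_group.rg[0]
}

# terraform-azurerm-rsv/main.tf (assumed shape; the real module also appears
# to use count, which is omitted here for brevity)
resource "azurerm_recovery_services_vault" "this" {
  name                = "example-rsv" # hypothetical name
  location            = var.resource_group.location
  resource_group_name = var.resource_group.name
  sku                 = "Standard"
}
```

With this arrangement the vault depends on the resource group through ordinary references, so no data source lookup is needed in the child module.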

With that said, Terraform is behaving as designed in the situation you described here, but we can see that Terraform is not giving good feedback on what the problem might be and what changes you might make to solve it. We're going to relabel this as an enhancement to represent exploration either of giving better feedback in this situation or potentially introducing a feature to make this situation more convenient to express in configuration. We're not sure at the moment what form that might take, but we want to retain the record of the use-case for future discussion and design work.

Thanks again for reporting this! Hopefully the above is helpful guidance for expressing your intent within the current design of the Terraform language.

@josh-barker
Author

Thanks for the detailed explanation @teamterraform

What remains confusing is that deploying everything at once (scenario 2) works, while the additive deployment (scenario 1) fails.

@jstewmon

Hi @teamterraform, I have a similar issue using an aws_iam_policy_document data source, which leads me to believe there is an actual bug when data sources depend on resources defined in a module.

Specifically, if an aws_iam_policy_document data source references an output from a module, the data source should not be read until the module resource has been changed.

If I use a single configuration file with the following contents, changing the random_length variable will reliably update aws_iam_role_policy.roleB with the arn of aws_iam_role.roleA:

provider "aws" {
  region = "us-east-1"
}

variable "random_length" {
  type        = number
  description = "The number of random characters to use in the name"
  default     = 4
}

resource "random_string" "suffix" {
  length  = var.random_length
  upper   = false
  special = false
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "roleA-policy" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect    = "Allow"
    resources = [aws_iam_role.roleA.arn]
  }
}

data "aws_iam_policy_document" "roleA-arp" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/Staff"
      ]
    }
  }
}

resource "aws_iam_role" "roleA" {
  name               = "RoleA-${random_string.suffix.result}"
  assume_role_policy = data.aws_iam_policy_document.roleA-arp.json
}

resource "aws_iam_role_policy" "roleA" {
  name   = "RoleA-AssumeRole"
  role   = aws_iam_role.roleA.id
  policy = data.aws_iam_policy_document.roleA-policy.json
}

data "aws_iam_policy_document" "roleB-policy" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect    = "Allow"
    resources = [aws_iam_role.roleA.arn]
  }
}

data "aws_iam_policy_document" "roleB-arp" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/Staff"
      ]
    }
  }
}

resource "aws_iam_role" "roleB" {
  name               = "RoleB"
  assume_role_policy = data.aws_iam_policy_document.roleB-arp.json
}

resource "aws_iam_role_policy" "roleB" {
  name   = "RoleB-AssumeRole"
  role   = aws_iam_role.roleB.id
  policy = data.aws_iam_policy_document.roleB-policy.json
}

However, if I move roleA to a module that is consumed by a config that creates roleB, then I have to run terraform apply twice whenever random_length is changed:

roleA/main.tf:

provider "aws" {
  region = "us-east-1"
}

variable "random_length" {
  type        = number
  description = "The number of random characters to use in the name"
  default     = 4
}

resource "random_string" "suffix" {
  length  = var.random_length
  upper   = false
  special = false
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "roleA-policy" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect    = "Allow"
    resources = [aws_iam_role.roleA.arn]
  }
}

data "aws_iam_policy_document" "roleA-arp" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/Staff"
      ]
    }
  }
}

resource "aws_iam_role" "roleA" {
  name               = "RoleA-${random_string.suffix.result}"
  assume_role_policy = data.aws_iam_policy_document.roleA-arp.json
}

resource "aws_iam_role_policy" "roleA" {
  name   = "RoleA-AssumeRole"
  role   = aws_iam_role.roleA.id
  policy = data.aws_iam_policy_document.roleA-policy.json
}

output "role" {
  value = aws_iam_role.roleA
}

roleB/main.tf:

provider "aws" {
  region = "us-east-1"
}

module "ext-role" {
  source        = "../roleA"
  random_length = 3
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "roleB-policy" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect    = "Allow"
    resources = [module.ext-role.role.arn]
  }
}

data "aws_iam_policy_document" "roleB-arp" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]
    effect = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/Staff"
      ]
    }
  }
}

resource "aws_iam_role" "roleB" {
  name               = "RoleB"
  assume_role_policy = data.aws_iam_policy_document.roleB-arp.json
}

resource "aws_iam_role_policy" "roleB" {
  name   = "RoleB-AssumeRole"
  role   = aws_iam_role.roleB.id
  policy = data.aws_iam_policy_document.roleB-policy.json
}

The first apply will report "# module.ext-role.aws_iam_role.roleA must be replaced", but aws_iam_role_policy.roleB will not be part of the plan. Running apply a second time will then plan "# aws_iam_role_policy.roleB will be updated in-place".

@w0ut0

w0ut0 commented Jun 18, 2020

Is this the same issue as this?

@jbardin
Member

jbardin commented Sep 22, 2020

This appears to be a variation of #25961, which is closed by #26284 and will be included in the 0.14 release.

The caveat here is that implicitly delaying data source updates even when the configuration is known will be limited to direct references to other resources. This can be worked around now with depends_on, since that feature now works for both data sources and modules.
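As a rough sketch of that workaround applied to the configuration from the original report, the rsv module call could declare the dependency explicitly. This assumes Terraform 0.13 or later, where depends_on is accepted in module blocks, and it reuses the reporter's module and input names unchanged.

```hcl
module "rsv" {
  source = "./terraform-azurerm-rsv"

  resource_group_name = module.rg_2.resource_group_name

  # Explicit dependency: everything inside this module, including its data
  # resource, waits until everything in module.rg_2 has been applied.
  depends_on = [module.rg_2]
}
```

Because depends_on applies to the module as a whole, it is a blunter tool than passing the resource group object directly, so the composition approach described earlier in the thread may still be preferable where it fits.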

The new data lifecycle changes here should make the behavior easier to understand, and provide adequate flexibility when data sources depend on managed resources.

Thanks!

@jbardin jbardin closed this as completed Sep 22, 2020
@ghost

ghost commented Oct 23, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Oct 23, 2020