Error acquiring the state lock when executing task when backend set as consul #546

Open
sameer666 opened this issue Dec 10, 2021 · 3 comments
Labels: bug (Something isn't working), waiting-reply (Waiting on response from Original Poster or another individual in the thread)

Comments

sameer666 commented Dec 10, 2021

Describe the bug

When the backend is not set explicitly in the driver "terraform" block, CTS defaults it to consul.
When the task is executed, Terraform tries to acquire the state lock and fails with the following error:

2021-12-16T16:01:29.834+0530 [INFO] ctrl: driver initialized
2021-12-16T16:01:29.834+0530 [INFO] ctrl: executing all tasks once through
2021-12-16T16:01:29.835+0530 [DEBUG] ctrl: watching dependencies: dependency_size=2
2021-12-16T16:01:30.057+0530 [DEBUG] driver.terraform: change detected for task: task_name=web
2021-12-16T16:01:30.081+0530 [INFO] ctrl: executing task: task_name=web
2021-12-16T16:01:35.681+0530 [ERROR] cli: error running controller in Once mode:
error=
| could not apply changes for task web: error tf-apply for 'web': exit status 1
|
| Error: Error loading state: failed to lock state in Consul:
|


On setting the backend as local, the execution works as expected.

When the module is run directly with Terraform instead of through Consul Terraform Sync, it works with the backend set to consul.
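For reference, the local-backend workaround mentioned above could look roughly like the following in the CTS configuration. This is a minimal sketch, not the reporter's actual file: it assumes that an empty `backend "local"` block is accepted as-is, and it reuses the version and provider values from the configuration shown in this report.

```hcl
# Sketch only: explicit backend selection in the CTS driver block.
driver "terraform" {
  version = "1.0.11"

  # Assumption: an empty "local" backend block keeps state on the CTS host
  # instead of in Consul KV, which avoids the Consul state lock entirely.
  backend "local" {}

  required_providers {
    fmc = {
      source  = "CiscoDevNet/fmc"
      version = "0.2.1"
    }
  }
}
```

Omitting the `backend` block is what leads to the generated `backend "consul"` block shown in main.tf below.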

Versions

Consul Terraform Sync

consul-terraform-sync v0.4.2 (bc2b2a0)
Compatible with Terraform >= 0.13.0, < 1.1.0

Consul Version

Consul 1.10.4

Terraform Version

Terraform v1.0.11

Configuration File(s)

CTS monitors two services in Consul and calls a module to create dynamic object mappings on Cisco FMC.
log_level = "DEBUG"

consul {
  address = <redacted>
}

driver "terraform" {
  version = "1.0.11"
  required_providers {
    fmc = {
      source = "CiscoDevNet/fmc"
      version = "0.2.1"
    }
  }
}

terraform_provider "fmc" {
  fmc_username = <redacted>
  fmc_password = <redacted>
  fmc_host = <redacted>
  fmc_insecure_skip_verify = true
}

task {
  name        = "web"
  description = "update policies based on node availability"
  source      = "home/user/terraform-fmc-dynamicobject" 
  providers   = ["fmc"]
  services    = ["web","api"]
}

Terraform Configuration Files Generated by Consul-Terraform-Sync

main.tf:
# This file is generated by Consul Terraform Sync.
#
# The HCL blocks, arguments, variables, and values are derived from the
# operator configuration for Sync. Any manual changes to this file
# may not be preserved and could be overwritten by a subsequent update.
#
# Task: web
# Description: update policies based on node availability

terraform {
  required_version = ">= 0.13.0, < 1.1.0"
  required_providers {
    fmc = {
      source  = "CiscoDevNet/fmc"
      version = "0.2.1"
    }
  }
  backend "consul" {
    address = <redacted>
    gzip    = true
    path    = "consul-terraform-sync/terraform"
  }
}

provider "fmc" {
  fmc_host                 = var.fmc.fmc_host
  fmc_insecure_skip_verify = var.fmc.fmc_insecure_skip_verify
  fmc_password             = var.fmc.fmc_password
  fmc_username             = var.fmc.fmc_username
}

# update policies based on node availability
module "web" {
  source   = "/home/user/terraform-fmc-dynamicobject"
  services = var.services
}
terraform.tfvars:
# This file is generated by Consul Terraform Sync.
#
# The HCL blocks, arguments, variables, and values are derived from the
# operator configuration for Sync. Any manual changes to this file
# may not be preserved and could be overwritten by a subsequent update.
#
# Task: web
# Description: update policies based on node availability

services = {
  "api.ip-1-1-1-1.dc1" = {
    id              = "api"
    name            = "api"
    kind            = ""
    address         = "1.1.1.1"
    port            = 9090
    meta            = {}
    tags            = []
    namespace       = ""
    status          = "passing"
    node            = "ip-1-1-1-1"
    node_id         = ""
    node_address    = "1.1.1.1"
    node_datacenter = "dc1"
    node_tagged_addresses = {
      lan      = "1.1.1.1"
      lan_ipv4 = "1.1.1.1"
      wan      = "1.1.1.1"
      wan_ipv4 = "1.1.1.1"
    }
    node_meta = {
      consul-network-segment = ""
    }
    cts_user_defined_meta = {}
  },
}

Expected Behavior

The task is executed and the dynamic objects are updated with the new IP address mappings.

Actual Behavior

The task is not executed and fails with the error:
Error acquiring the state lock when executing task

sameer666 added the bug label Dec 10, 2021
Contributor

mkam commented Dec 10, 2021

Hi @sameer666, thanks for reporting this issue! I've got a few questions to help us debug and reproduce this problem.

  1. It looks like the CTS configuration and generated Terraform configuration you've provided is the working config since it has the backend set to local. Could you update the CTS config and main.tf with the files where Consul is the backend?

  2. Is the session in the error message an empty string or have you redacted the actual value? Could you edit the removed values to be <redacted> so that we can distinguish them from empty strings?

  3. Can you give an overview of your Consul setup and would you be able to share any relevant Consul logs for the 500 error?

  4. Consul has a list of situations where the session is invalidated here. Do you think any of these scenarios could be happening while you are running CTS?

  5. Is the error happening when CTS is first started or is it happening while CTS is running after initialization has completed?
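To help answer questions 3 and 4, the active Consul sessions and the keys under the CTS state prefix can be inspected before and after a failing run. The following is a rough sketch under stated assumptions: the helper names `session_list_url` and `inspect_lock_state` are hypothetical, the default agent address is a guess, and an ACL-enabled cluster would additionally need a token.

```shell
#!/usr/bin/env bash
# Sketch: inspect Consul sessions and the KV entries under the CTS state prefix.

# CONSUL_HTTP_ADDR is the standard Consul environment variable;
# the default value used here is an assumption.
CONSUL_HTTP_ADDR="${CONSUL_HTTP_ADDR:-http://127.0.0.1:8500}"

# Prefix derived from the `path` in the generated main.tf.
KV_PREFIX="consul-terraform-sync/"

# Build the URL of Consul's session list HTTP endpoint.
session_list_url() {
  printf '%s/v1/session/list\n' "${CONSUL_HTTP_ADDR%/}"
}

# Hypothetical helper: print the active sessions, then every key under the
# prefix with its metadata (the detailed output includes the session that
# holds a key, if any).
inspect_lock_state() {
  curl --silent --show-error "$(session_list_url)"
  echo
  consul kv get -recurse -detailed "$KV_PREFIX"
}
```

Calling `inspect_lock_state` right after the failed apply would show whether a session still holds any key under the prefix.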

mkam added the waiting-reply label Dec 13, 2021
@sameer666
Author

  1. Edited with the config that is causing the issue
  2. Edited
  3. I have EC2 instances set up in AWS running the Consul agents: two Consul servers behind a load balancer, and one web server running a Consul agent with a service that is registered with the Consul servers. I am running Consul Terraform Sync on my local system, and the target device, Cisco FMC, is also hosted on AWS.
  4. Not matching any case
  5. It is happening when CTS detects a change and task needs to be executed.

mkam removed the waiting-reply label Dec 16, 2021
Contributor

mkam commented Dec 21, 2021

Thanks for the clarifications! Nothing is standing out to me as the root cause of your issue, and I haven't been able to reproduce it. Here are some debugging steps you could try next:

  1. Could you delete the working directory for the task and run CTS again? It should be sync-tasks/<taskname>
  2. Could you try to delete the backend in Consul KV? The command to do so is consul kv delete consul-terraform-sync/terraform-env:<taskname>
  3. Could you comment out your task and configure a different test task? One example you can use is:
task {
  name = "test-task"
  source = "mkam/hello/cts"
  providers = ["local"]
  services = ["web"]
}
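The first two steps above could be scripted roughly as follows. This is a sketch, not an official procedure: the task name `web` comes from this report, `config.hcl` is a hypothetical file name for the CTS configuration, and the `run` and `state_key` helpers are illustrative. Commands are only printed unless `DRY_RUN=0` is set, since deleting the state key is destructive.

```shell
#!/usr/bin/env bash
# Sketch: reset the CTS working directory and Consul KV state for one task.
set -eu

TASK_NAME="${TASK_NAME:-web}"                              # task name from this report
KV_PREFIX="${KV_PREFIX:-consul-terraform-sync/terraform}"  # `path` in the generated main.tf

# Build the KV key that holds the state for a task: <path>-env:<taskname>.
state_key() {
  printf '%s-env:%s\n' "$KV_PREFIX" "$1"
}

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: remove the task's working directory.
run rm -rf "sync-tasks/${TASK_NAME}"

# Step 2: list what is stored under the prefix, then delete the task's state key.
run consul kv get -recurse "$KV_PREFIX"
run consul kv delete "$(state_key "$TASK_NAME")"

# Re-run CTS once; config.hcl is a hypothetical configuration file name.
run consul-terraform-sync -config-file=config.hcl -once
```

Running it without arguments prints the four commands for review; `DRY_RUN=0 TASK_NAME=web ./reset-task.sh` (script name assumed) would execute them.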

jwair added the waiting-reply label Feb 23, 2022