Autoscale and master nodes break the Terraform plan when you have more than 6 nodes #651

Closed
kuisathaverat opened this issue May 16, 2023 · 2 comments
Labels
bug Something isn't working

kuisathaverat commented May 16, 2023

An ESS deployment configured for autoscaling fails to apply the Terraform plan once the cluster reaches 6 nodes.

Readiness Checklist

  • I am running the latest version
  • I checked the documentation and found no answer
  • I checked to make sure that this issue has not already been filed
  • I am reporting the issue to the correct repository (for multi-repository projects)

Expected Behavior

Not having to update the Terraform plan manually to add master nodes.

Current Behavior

Once the cluster grows past 6 nodes, the original plan is no longer valid and you have to modify it to include master nodes.

Terraform definition

variable "ess_apikey" {
  type = string
}

terraform {
  required_version = ">= 0.12.29"

  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.7.0"
    }
  }
}

provider "ec" {
  endpoint = "https://cloud.elastic.co"
  insecure = true
  apikey = var.ess_apikey
  verbose = true
}

resource "ec_deployment" "main" {
  name = "release-oblt"
  region                 = "gcp-us-west2"
  version                = "8.8.0"
  deployment_template_id = "gcp-io-optimized-v3"
  alias                  = "release-oblt"

  elasticsearch = {
    autoscale = true
    hot = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
    ml = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 1
    }
    warm = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
    cold = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
  }

  integrations_server = {
    size = "2g"
    zone_count = 1
  }
  kibana = {
    size = "4g"
    zone_count = 1
  }
}

Steps to Reproduce

  1. Create a cluster with the Terraform file provided above
  2. Ingest data to scale the cluster up to 6 nodes; configuring ILM to keep data on the hot, warm, and cold tiers is enough to reach 6 nodes
  3. Change the Terraform file to use a higher autoscaling max_size for any of the tiers
  4. Try to apply the plan; it fails with the following error:
fatal: [localhost]: FAILED! => changed=false 
  cmd: /usr/local/bin/terraform apply -no-color -input=false -auto-approve -lock=true /tmp/tmpjwuuy4hu.tfplan
  msg: |2-
  
    Error: failed updating deployment
  
      with ec_deployment.main,
      on main.tf line 33, in resource "ec_deployment" "main":
      33: resource "ec_deployment" "main" {
  
    api error: 2 errors occurred:
            * cluster.missing_dedicated_master: Deployment template [I/O Optimized]
    requires a dedicated master after [6] nodes. Found [8] nodes in the
    deployment (resources.elasticsearch[0])
            * clusters.cluster_invalid_plan: Cluster must contain at least a master
    topology element and a data topology element. 'master' node type is
    missing,'master' node type exists in more than one topology element
    (resources.elasticsearch[0].cluster_topology)
  rc: 1
  stderr: |2-
  
    Error: failed updating deployment
  
      with ec_deployment.main,
      on main.tf line 33, in resource "ec_deployment" "main":
      33: resource "ec_deployment" "main" {
  
    api error: 2 errors occurred:
            * cluster.missing_dedicated_master: Deployment template [I/O Optimized]
    requires a dedicated master after [6] nodes. Found [8] nodes in the
    deployment (resources.elasticsearch[0])
            * clusters.cluster_invalid_plan: Cluster must contain at least a master
    topology element and a data topology element. 'master' node type is
    missing,'master' node type exists in more than one topology element
    (resources.elasticsearch[0].cluster_topology)
  stderr_lines: <omitted>
  stdout: |-
    ec_deployment.main: Modifying... [id=1111111111111111111111111111111111]
  stdout_lines: <omitted>

To fix the issue, you have to modify the Terraform file to include master nodes:

variable "ess_apikey" {
  type = string
}

terraform {
  required_version = ">= 0.12.29"

  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.7.0"
    }
  }
}

provider "ec" {
  endpoint = "https://cloud.elastic.co"
  insecure = true
  apikey = var.ess_apikey
  verbose = true
}

resource "ec_deployment" "main" {
  name = "release-oblt"
  region                 = "gcp-us-west2"
  version                = "8.8.0"
  deployment_template_id = "gcp-io-optimized-v3"
  alias                  = "release-oblt"

  elasticsearch = {
    autoscale = true
    hot = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
    master = {
      autoscaling = {}
      size = "8g"
      zone_count = 3
    }
    ml = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 1
    }
    warm = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
    cold = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }
  }

  integrations_server = {
    size = "2g"
    zone_count = 1
  }
  kibana = {
    size = "4g"
    zone_count = 1
  }
}

Context

This breaks any automation: it is impossible to apply the same plan several times, and it forces you to have something outside of Terraform that updates the plan when there are more than six nodes. It is also not possible to add the master nodes in the first place, because then you hit the opposite error: with fewer than six nodes you are not allowed to define master nodes.

Possible Solution

I should not have to care about master nodes; dedicated masters are something ESS needs, so ESS should manage them automatically, as the ESS UI already does.

See also #468.

Checking number_of_data_nodes to decide whether to enable master nodes could do the trick; a sketch of the idea follows the cluster health output below.

GET _cluster/health

{
  "cluster_name": "11111111111111111111111111",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 10,
  "number_of_data_nodes": 6,
  "active_primary_shards": 7075,
  "active_shards": 11991,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
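
One way to express that idea today is to gate the master topology on a node count fed in from outside Terraform. This is only a minimal sketch: the data_node_count variable is hypothetical (populated out-of-band, e.g. from number_of_data_nodes in GET _cluster/health), the 8g/3-zone master sizing is illustrative, and whether the provider accepts a null master object is an assumption, not verified behavior.

# Hypothetical variable, populated outside Terraform
# (for example from number_of_data_nodes in GET _cluster/health).
variable "data_node_count" {
  type    = number
  default = 0
}

resource "ec_deployment" "main" {
  # ... same deployment settings as above ...

  elasticsearch = {
    autoscale = true

    hot = {
      autoscaling = {
        max_size = "64g"
      }
      zone_count = 3
    }

    # Only declare dedicated masters once the deployment has grown
    # past the template's 6-node threshold; sizes are illustrative.
    master = var.data_node_count > 6 ? {
      autoscaling = {}
      size        = "8g"
      zone_count  = 3
    } : null
  }
}

Ideally the provider itself would perform this check, so a single plan would stay valid on both sides of the threshold.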

Your Environment

  • Version used: 0.7.0
  • Running against Elastic Cloud SaaS or Elastic Cloud Enterprise and version: Elastic Cloud SaaS
kuisathaverat added the bug label on May 16, 2023
shermanericts commented Jun 2, 2023

Hmm.
As @kuisathaverat says, you get somewhat hosed.
I can get into a situation where, when I try to add the master node, I get this:

│ Error: failed updating deployment
│
│   with module.customer_env.module.elastic.ec_deployment.customer,
│   on .terraform/modules/customer_env.elastic/deployment.tf line 31, in resource "ec_deployment" "customer":
│   31: resource "ec_deployment" "customer" {
│
│ api error: 1 error occurred:
│ 	* cluster.dedicated_master_prohibited: Deployment template [General purpose] requires at least [6] nodes before dedicated master can be specified. Found only [3] nodes in the deployment (resources.elasticsearch[0])
│
│

If I change the master section of the deployment to 0g, the provider complains that no master block was found. I'm not sure what to do in this situation:

│ Error: failed updating deployment
│
│   with module.customer_env.module.elastic.ec_deployment.customer,
│   on ../terraform-customer-elastic/deployment.tf line 31, in resource "ec_deployment" "customer":
│   31: resource "ec_deployment" "customer" {
│
│ api error: 1 error occurred:
│ 	* clusters.cluster_invalid_plan: Cluster must contain at least a master topology element and a data topology
│ element. 'master' node type is missing,'master' node type exists in more than one topology element
│ (resources.elasticsearch[0].cluster_topology)
│
│
╵

I'm not sure how to recover at this point except by removing the elastic deployment from the state and re-importing it. In my case, I kept my zone count at 2 for each tier and I'm OK for the moment.

May relate to #635


Last edit: what I wound up having to do (which is what I think was said initially) is use other means of lowering the zone count outside of Terraform until the master node was no longer in play (it gets automatically removed by Elastic Cloud), and then I was able to drive the terraform plan/apply sequence.
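
For anyone stuck in the same spot, one way to keep Terraform from fighting an out-of-band topology change like that is an ignore_changes lifecycle block. This is only a sketch of standard Terraform behavior, not something verified against the ec provider in this thread, and ignoring the whole elasticsearch object also hides legitimate drift:

resource "ec_deployment" "main" {
  # ... existing configuration ...

  lifecycle {
    # Let Elastic Cloud adjust the topology out-of-band without
    # Terraform trying to revert it on the next apply.
    ignore_changes = [elasticsearch]
  }
}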


tobio commented Aug 3, 2023

Duplicates #635

tobio closed this as completed on Aug 3, 2023