
Destroy/recreate DB instance on minor version update rather than updating #9401

Closed
gbataille opened this issue Jul 18, 2019 · 12 comments
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/rds Issues and PRs that pertain to the rds service.

Comments

@gbataille

Terraform Version

Terraform v0.12.3

  • provider.aws v2.16.0
  • provider.template v2.1.2

Affected Resource(s)

  • aws_rds_cluster
  • aws_rds_cluster_instance

Terraform Configuration Files

resource "aws_rds_cluster" "main_postgresql" {
  cluster_identifier           = "aurora-cluster-main"
  deletion_protection          = false
  availability_zones           = ["us-east-1a", "us-east-1b", "us-east-1c"]
  database_name                = "pcs"
  skip_final_snapshot          = true
  backup_retention_period      = 5
  preferred_backup_window      = "03:00-05:00"
  preferred_maintenance_window = "Mon:05:00-Mon:06:00"
  vpc_security_group_ids       = [aws_security_group.main_postgresql.id]
  storage_encrypted            = true

  # Careful: the properties below need to be in sync between the cluster and the instances
  engine               = "aurora-postgresql"
  engine_version       = "10.7"
  db_subnet_group_name = aws_db_subnet_group.main.name

  master_username = "admin"

  master_password = "duMMY$123"

  apply_immediately = true
}

resource "aws_rds_cluster_instance" "main_postgresql_instances" {
  count                      = 2
  identifier_prefix          = "aurora-cluster-main-instance-"
  cluster_identifier         = aws_rds_cluster.main_postgresql.id
  publicly_accessible        = true
  instance_class             = var.db_instance_type_per_env[terraform.workspace]
  auto_minor_version_upgrade = false

  # Careful: the properties below need to be in sync between the cluster and the instances
  engine               = local.engine
  engine_version       = local.engine_version
  db_subnet_group_name = aws_db_subnet_group.main.name

  apply_immediately = true
}

Debug Output

https://gist.github.com/gbataille/9c7b6084614b1b6c022342c48dbb80f7

Expected Behavior

The DB cluster and DB instances are upgraded in place, just as when you do it through the AWS console.
From the AWS console, the cluster and the instances are put into "upgrading" status, a dump is taken, pg_upgrade is run live, the instances are rebooted (~10s), and everything comes back up.

Actual Behavior

Instances are destroyed, and new ones with the new minor version are re-created:
--> it takes way longer
--> the downtime is way longer.
Luckily, since it's Aurora and the data layer is separate from the engine, no data was lost.

Steps to Reproduce

  1. terraform apply with an RDS Aurora cluster specifying PostgreSQL 10.6
  2. terraform apply with an RDS Aurora cluster specifying PostgreSQL 10.7
@nywilken nywilken added the service/dynamodb Issues and PRs that pertain to the dynamodb service. label Oct 11, 2019
@nijave
Contributor

nijave commented Oct 19, 2019

Hmm, looks like that uses the same API as aws_db_instance but has different settings on that parameter 🤔

$ git diff aws/resource_aws_rds_cluster_instance.go
diff --git a/aws/resource_aws_rds_cluster_instance.go b/aws/resource_aws_rds_cluster_instance.go
index 02e8e94a9..1ca54e226 100644
--- a/aws/resource_aws_rds_cluster_instance.go
+++ b/aws/resource_aws_rds_cluster_instance.go
@@ -99,10 +99,10 @@ func resourceAwsRDSClusterInstance() *schema.Resource {
                        },
 
                        "engine_version": {
-                               Type:     schema.TypeString,
-                               Optional: true,
-                               ForceNew: true,
-                               Computed: true,
+                               Type:             schema.TypeString,
+                               Optional:         true,
+                               Computed:         true,
+                               DiffSuppressFunc: suppressAwsDbEngineVersionDiffs,
                        },
 
                        "db_parameter_group_name": {

@pioneer2k

The same happens for me while upgrading the minor version of an Aurora MySQL database.
See below for "# forces replacement".
The only workaround is to manually update the version via the AWS console and, once that finishes, to update/align the Terraform source files -> very fragile!

# aws_rds_cluster_instance.customerscoring_unittest_rds_cluster_instance must be replaced
 -/+ resource "aws_rds_cluster_instance" "customerscoring_unittest_rds_cluster_instance" {
        apply_immediately               = true
      ~ arn                             = "arn:aws:rds:eu-central-1:XXXXXXXXXXX:db:customerscoringunittest" -> (known after apply)
        auto_minor_version_upgrade      = true
      ~ availability_zone               = "eu-central-1b" -> (known after apply)
        cluster_identifier              = "customerscoringunittest-cluster"
        copy_tags_to_snapshot           = true
        db_parameter_group_name         = "customerscoringqa-aurora-mysql57"
        db_subnet_group_name            = "privat"
      ~ dbi_resource_id                 = "db-H54JTW27MWJTJTPUJVNLTXEH7I" -> (known after apply)
      ~ endpoint                        = "customerscoringunittest.co4pdundcaoq.eu-central-1.rds.amazonaws.com" -> (known after apply)
        engine                          = "aurora-mysql"
      ~ engine_version                  = "5.7.mysql_aurora.2.04.6" -> "5.7.mysql_aurora.2.05.0" # forces replacement
      ~ id                              = "customerscoringunittest" -> (known after apply)
        identifier                      = "customerscoringunittest"
      + identifier_prefix               = (known after apply)
        instance_class                  = "db.t3.medium"
      ~ kms_key_id                      = "arn:aws:kms:eu-central-1:XXXXXXXXXX:key/0000000-1111-acbb-80e5-1fb4254b6666" -> (known after apply)
        monitoring_interval             = 0
      + monitoring_role_arn             = (known after apply)
      ~ performance_insights_enabled    = false -> (known after apply)
      + performance_insights_kms_key_id = (known after apply)
      ~ port                            = 3306 -> (known after apply)
      ~ preferred_backup_window         = "21:43-22:43" -> (known after apply)
      ~ preferred_maintenance_window    = "mon:02:32-mon:03:02" -> (known after apply)
        promotion_tier                  = 1
        publicly_accessible             = false
      ~ storage_encrypted               = true -> (known after apply)
      ~ writer                          = true -> (known after apply)
    }

@jcarlson

jcarlson commented Nov 8, 2019

Try ignoring changes to the engine version on the aws_rds_cluster_instance resource.

resource "aws_rds_cluster" "main" {
  apply_immediately  = true
  cluster_identifier = "my-cluster"
  engine             = "aurora-postgresql"
  engine_version     = "10.7"

  # other attributes omitted
}

resource "aws_rds_cluster_instance" "cluster_instance" {
  apply_immediately  = true
  identifier_prefix  = "my-instance"
  cluster_identifier = aws_rds_cluster.main.id
  engine             = aws_rds_cluster.main.engine
  engine_version     = aws_rds_cluster.main.engine_version

  # other attributes omitted

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [engine_version]
  }
}

My simple experimentation shows that when you terraform apply an engine version change on the cluster resource, AWS upgrades the cluster instances at the same time, thus negating the need to update the cluster instances with Terraform.

Note that I've also marked the cluster instances as create_before_destroy, so that if Terraform does insist on replacing the instance, it will spin up a replacement instance first and this will minimize downtime.

@brianmori

AWS Provider: 2.38
Terraform: 0.12.13

We have the same issue with Aurora, but once the instances are destroyed, they cannot be recreated:

module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Destroying... [id=abc-dev-0]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Destroying... [id=abc-dev-1]
module.abc-eks-customer-quality.aws_launch_configuration.workers[1]: Creating...
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifying... [id=abc-dev]
module.abc-eks-customer-quality.aws_launch_configuration.workers[0]: Creating...
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifications complete after 1m14s [id=abc-dev]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m20s elapsed]

Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
	status code: 400, request id: 6b737240-b302-4ac6-b632-ae9a1e632960

  on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
 335: resource "aws_rds_cluster_instance" "cluster_instances" {



Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
	status code: 400, request id: 4c6ee853-ad2f-4706-ade7-d2a8968d4f98

  on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
 335: resource "aws_rds_cluster_instance" "cluster_instances" {


@pioneer2k

As @jcarlson already wrote, the solution is to set engine_version on the cluster only and leave engine_version off the cluster_instance, since it is optional. When doing so, Terraform does an in-place upgrade of the cluster, and AWS RDS upgrades the cluster instances itself. Terraform then sees no difference on the cluster_instance and does nothing.
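
A minimal sketch of that pattern (identifiers are hypothetical, other attributes omitted):

resource "aws_rds_cluster" "example" {
  cluster_identifier = "example-cluster"
  engine             = "aurora-postgresql"
  engine_version     = "10.7" # bump this to trigger an in-place upgrade
  apply_immediately  = true
}

resource "aws_rds_cluster_instance" "example" {
  count              = 2
  identifier_prefix  = "example-instance-"
  cluster_identifier = aws_rds_cluster.example.id
  instance_class     = "db.t3.medium"
  engine             = aws_rds_cluster.example.engine
  apply_immediately  = true

  # engine_version deliberately omitted: RDS upgrades the instances
  # together with the cluster, so Terraform sees no diff afterwards.
}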

@marinsalinas

@pioneer2k have you tried that with global clusters? I tried, but it seems that on rds_global_cluster we need to specify the same version as on the rds_clusters.

@marinsalinas

@nywilken this issue is related to service/rds, not to service/dynamodb

@maryelizbeth maryelizbeth added needs-triage Waiting for first response or review from a maintainer. service/rds Issues and PRs that pertain to the rds service. and removed service/dynamodb Issues and PRs that pertain to the dynamodb service. labels Aug 13, 2020
@maryelizbeth maryelizbeth added enhancement Requests to existing resources that expand the functionality or scope. and removed needs-triage Waiting for first response or review from a maintainer. labels Sep 2, 2020
@AndrewAyush

Hello guys, I have created a template:

resource "aws_rds_cluster" "default" {
  cluster_identifier              = var.name
  engine                          = "aurora-mysql"
  engine_mode                     = "serverless"
  engine_version                  = "5.7.mysql_aurora.2.07.1"
  availability_zones              = ["us-east-2a", "us-east-2b"]
  master_username                 = var.database_username
  master_password                 = var.database_password
  vpc_security_group_ids          = [aws_security_group.rds.id]
  backup_retention_period         = 7
  preferred_backup_window         = "07:00-09:00"
  db_subnet_group_name            = aws_db_subnet_group.rds.id
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.aurora_db_57.id
  final_snapshot_identifier       = "${var.name}-final"
  skip_final_snapshot             = false
  deletion_protection             = true
  apply_immediately               = true

  scaling_configuration {
    min_capacity             = var.min_capacity
    auto_pause               = true
    max_capacity             = var.max_capacity
    seconds_until_auto_pause = 300
    timeout_action           = "ForceApplyCapacityChange"
  }
}

If I change anything in this template, it will delete the RDS cluster and create it again. Is there a way to only modify the RDS cluster instead of deleting it?

@bill-rich
Contributor

I was not able to reproduce this issue. In all the cases I tried, upgrades worked fine as long as the engine version was managed in aws_rds_cluster rather than in aws_rds_cluster_instance. I'm going to close this issue since I can't reproduce it. If anyone has an example config and steps to reproduce it, please post them and I'll get the issue reopened.

@marinsalinas

@bill-rich quick question: does that mean we can omit engine_version for rds_cluster_instance and manage this parameter only at the rds_cluster level?

Also, what about using a global cluster? Every engine_version change requires recreation when using a global cluster: https://github.com/hashicorp/terraform-provider-aws/blob/main/aws/resource_aws_rds_global_cluster.go#L61

Can we manage this with a similar approach?

@bill-rich
Contributor

Hi @marinsalinas! That is correct: you only need to include engine_version in the rds_cluster config. rds_global_cluster still requires a full destroy and create for engine_version updates, but it looks like the API does support updating. The global_cluster resource will need more changes and testing to get into a state where this can be supported. I will be looking into this over the next couple of days. I'll track the progress in #18214.
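
For reference, a minimal sketch of the global-cluster layout under discussion (identifiers are hypothetical); at the time of writing, changing engine_version on aws_rds_global_cluster still forces replacement:

resource "aws_rds_global_cluster" "example" {
  global_cluster_identifier = "example-global"
  engine                    = "aurora-mysql"
  # Changing this currently forces destroy/create of the global cluster.
  engine_version            = "5.7.mysql_aurora.2.07.1"
}

resource "aws_rds_cluster" "primary" {
  cluster_identifier        = "example-primary"
  global_cluster_identifier = aws_rds_global_cluster.example.id
  engine                    = aws_rds_global_cluster.example.engine
  engine_version            = aws_rds_global_cluster.example.engine_version
  master_username           = "admin"
  master_password           = "dummy-password" # placeholder
  skip_final_snapshot       = true
}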

@ghost

ghost commented Apr 10, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Apr 10, 2021