
Destroy/recreate DB instance on minor version update rather than updating #9401

Closed
gbataille opened this issue Jul 18, 2019 · 12 comments
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/rds Issues and PRs that pertain to the rds service.

Comments

@gbataille

Terraform Version

Terraform v0.12.3

  • provider.aws v2.16.0
  • provider.template v2.1.2

Affected Resource(s)

  • aws_rds_cluster
  • aws_rds_cluster_instance

Terraform Configuration Files

resource "aws_rds_cluster" "main_postgresql" {
  cluster_identifier           = "aurora-cluster-main"
  deletion_protection          = false
  availability_zones           = ["us-east-1a", "us-east-1b", "us-east-1c"]
  database_name                = "pcs"
  skip_final_snapshot          = true
  backup_retention_period      = 5
  preferred_backup_window      = "03:00-05:00"
  preferred_maintenance_window = "Mon:05:00-Mon:06:00"
  vpc_security_group_ids       = [aws_security_group.main_postgresql.id]
  storage_encrypted            = true

  # Careful: the properties below need to be in sync between the cluster and the instances
  engine               = "aurora-postgresql"
  engine_version       = "10.7"
  db_subnet_group_name = aws_db_subnet_group.main.name

  master_username = "admin"

  master_password = "duMMY$123"

  apply_immediately = true
}

resource "aws_rds_cluster_instance" "main_postgresql_instances" {
  count                      = 2
  identifier_prefix          = "aurora-cluster-main-instance-"
  cluster_identifier         = aws_rds_cluster.main_postgresql.id
  publicly_accessible        = true
  instance_class             = var.db_instance_type_per_env[terraform.workspace]
  auto_minor_version_upgrade = false

  # Careful: the properties below need to be in sync between the cluster and the instances
  engine               = local.engine
  engine_version       = local.engine_version
  db_subnet_group_name = aws_db_subnet_group.main.name

  apply_immediately = true
}

Debug Output

https://gist.github.com/gbataille/9c7b6084614b1b6c022342c48dbb80f7

Expected Behavior

The DB cluster and DB instances are upgraded in place, just as when you do it through the AWS console.
From the AWS console, the cluster and the instances are put into "upgrading" status, a dump is taken, pg_upgrade is run live, the instances are rebooted (~10s), and everything comes back up.

Actual Behavior

Instances are destroyed, and new ones with the new minor version are re-created:
--> it takes way longer
--> the downtime is way longer.
Luckily, since it's Aurora and the data layer is separate from the engine, no data was lost.

Steps to Reproduce

  1. terraform apply with an RDS Aurora cluster specifying PostgreSQL 10.6
  2. terraform apply with an RDS Aurora cluster specifying PostgreSQL 10.7
@nywilken nywilken added the service/dynamodb Issues and PRs that pertain to the dynamodb service. label Oct 11, 2019
@nijave
Contributor

nijave commented Oct 19, 2019

Hmm, looks like that uses the same API as aws_db_instance but has different settings on that parameter 🤔

$ git diff aws/resource_aws_rds_cluster_instance.go
diff --git a/aws/resource_aws_rds_cluster_instance.go b/aws/resource_aws_rds_cluster_instance.go
index 02e8e94a9..1ca54e226 100644
--- a/aws/resource_aws_rds_cluster_instance.go
+++ b/aws/resource_aws_rds_cluster_instance.go
@@ -99,10 +99,10 @@ func resourceAwsRDSClusterInstance() *schema.Resource {
                        },
 
                        "engine_version": {
-                               Type:     schema.TypeString,
-                               Optional: true,
-                               ForceNew: true,
-                               Computed: true,
+                               Type:             schema.TypeString,
+                               Optional:         true,
+                               Computed:         true,
+                               DiffSuppressFunc: suppressAwsDbEngineVersionDiffs,
                        },
 
                        "db_parameter_group_name": {

@pioneer2k

The same happens for me while upgrading the minor version of an Aurora MySQL database.
See below for "# forces replacement".
The only workaround is to manually update the version via the AWS console and, once that finishes, to update/align the Terraform source files -> very fragile!

# aws_rds_cluster_instance.customerscoring_unittest_rds_cluster_instance must be replaced
 -/+ resource "aws_rds_cluster_instance" "customerscoring_unittest_rds_cluster_instance" {
        apply_immediately               = true
      ~ arn                             = "arn:aws:rds:eu-central-1:XXXXXXXXXXX:db:customerscoringunittest" -> (known after apply)
        auto_minor_version_upgrade      = true
      ~ availability_zone               = "eu-central-1b" -> (known after apply)
        cluster_identifier              = "customerscoringunittest-cluster"
        copy_tags_to_snapshot           = true
        db_parameter_group_name         = "customerscoringqa-aurora-mysql57"
        db_subnet_group_name            = "privat"
      ~ dbi_resource_id                 = "db-H54JTW27MWJTJTPUJVNLTXEH7I" -> (known after apply)
      ~ endpoint                        = "customerscoringunittest.co4pdundcaoq.eu-central-1.rds.amazonaws.com" -> (known after apply)
        engine                          = "aurora-mysql"
      ~ engine_version                  = "5.7.mysql_aurora.2.04.6" -> "5.7.mysql_aurora.2.05.0" # forces replacement
      ~ id                              = "customerscoringunittest" -> (known after apply)
        identifier                      = "customerscoringunittest"
      + identifier_prefix               = (known after apply)
        instance_class                  = "db.t3.medium"
      ~ kms_key_id                      = "arn:aws:kms:eu-central-1:XXXXXXXXXX:key/0000000-1111-acbb-80e5-1fb4254b6666" -> (known after apply)
        monitoring_interval             = 0
      + monitoring_role_arn             = (known after apply)
      ~ performance_insights_enabled    = false -> (known after apply)
      + performance_insights_kms_key_id = (known after apply)
      ~ port                            = 3306 -> (known after apply)
      ~ preferred_backup_window         = "21:43-22:43" -> (known after apply)
      ~ preferred_maintenance_window    = "mon:02:32-mon:03:02" -> (known after apply)
        promotion_tier                  = 1
        publicly_accessible             = false
      ~ storage_encrypted               = true -> (known after apply)
      ~ writer                          = true -> (known after apply)
    }

@jcarlson

jcarlson commented Nov 8, 2019

Try ignoring changes to the engine version on the aws_rds_cluster_instance resource.

resource "aws_rds_cluster" "main" {
  apply_immediately  = true
  cluster_identifier = "my-cluster"
  engine             = "aurora-postgresql"
  engine_version     = "10.7"

  # other attributes omitted
}

resource "aws_rds_cluster_instance" "cluster_instance" {
  apply_immediately  = true
  identifier_prefix  = "my-instance"
  cluster_identifier = aws_rds_cluster.main.id
  engine             = aws_rds_cluster.main.engine
  engine_version     = aws_rds_cluster.main.engine_version

  # other attributes omitted

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [engine_version]
  }
}

My simple experimentation shows that when you terraform apply an engine version change on the cluster resource, AWS upgrades the cluster instances at the same time, thus negating the need to update the cluster instances with Terraform.

Note that I've also marked the cluster instances as create_before_destroy, so that if Terraform does insist on replacing the instance, it will spin up a replacement instance first and this will minimize downtime.

@brianmori

AWS Provider: 2.38
Terraform: 0.12.13

We have the same issue with Aurora, but once the instances are destroyed, they cannot be recreated:

module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Destroying... [id=abc-dev-0]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Destroying... [id=abc-dev-1]
module.abc-eks-customer-quality.aws_launch_configuration.workers[1]: Creating...
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifying... [id=abc-dev]
module.abc-eks-customer-quality.aws_launch_configuration.workers[0]: Creating...
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifications complete after 1m14s [id=abc-dev]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m20s elapsed]

Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
	status code: 400, request id: 6b737240-b302-4ac6-b632-ae9a1e632960

  on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
 335: resource "aws_rds_cluster_instance" "cluster_instances" {



Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
	status code: 400, request id: 4c6ee853-ad2f-4706-ade7-d2a8968d4f98

  on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
 335: resource "aws_rds_cluster_instance" "cluster_instances" {


@pioneer2k

As @jcarlson already wrote, the solution is to set engine_version on the cluster only and leave engine_version off the cluster_instance, since it is optional. When doing so, Terraform does an in-place upgrade of the cluster, and AWS RDS upgrades the cluster instances itself. Terraform then sees no difference on the cluster_instance and does nothing.
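
A minimal sketch of that pattern (identifiers are hypothetical, other attributes omitted):

resource "aws_rds_cluster" "example" {
  cluster_identifier = "example-cluster"
  engine             = "aurora-postgresql"
  engine_version     = "10.7" # bump this to trigger an in-place upgrade
  apply_immediately  = true
}

resource "aws_rds_cluster_instance" "example" {
  count              = 2
  identifier_prefix  = "example-instance-"
  cluster_identifier = aws_rds_cluster.example.id
  instance_class     = "db.t3.medium"
  engine             = aws_rds_cluster.example.engine
  apply_immediately  = true

  # engine_version deliberately omitted: RDS upgrades the instances
  # together with the cluster, so Terraform sees no diff afterwards.
}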

@marinsalinas

@pioneer2k have you tried that with global clusters? I tried, but it seems that on rds_global_cluster we need to specify the same version as on the rds_clusters.

@marinsalinas

@nywilken this issue is related to service/rds, not to service/dynamodb

@maryelizbeth maryelizbeth added needs-triage Waiting for first response or review from a maintainer. service/rds Issues and PRs that pertain to the rds service. and removed service/dynamodb Issues and PRs that pertain to the dynamodb service. labels Aug 13, 2020
@maryelizbeth maryelizbeth added enhancement Requests to existing resources that expand the functionality or scope. and removed needs-triage Waiting for first response or review from a maintainer. labels Sep 2, 2020
@AndrewAyush

Hello guys, I have created a template:

resource "aws_rds_cluster" "default" {
  cluster_identifier              = var.name
  engine                          = "aurora-mysql"
  engine_mode                     = "serverless"
  engine_version                  = "5.7.mysql_aurora.2.07.1"
  availability_zones              = ["us-east-2a", "us-east-2b"]
  master_username                 = var.database_username
  master_password                 = var.database_password
  vpc_security_group_ids          = [aws_security_group.rds.id]
  backup_retention_period         = 7
  preferred_backup_window         = "07:00-09:00"
  db_subnet_group_name            = aws_db_subnet_group.rds.id
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.aurora_db_57.id
  final_snapshot_identifier       = "${var.name}-final"
  skip_final_snapshot             = false
  deletion_protection             = true
  apply_immediately               = true

  scaling_configuration {
    min_capacity             = var.min_capacity
    auto_pause               = true
    max_capacity             = var.max_capacity
    seconds_until_auto_pause = 300
    timeout_action           = "ForceApplyCapacityChange"
  }
}

If I change anything in this template, it will delete the RDS cluster and create it again. Is there a way to only modify the RDS cluster instead of deleting it?

@bill-rich
Contributor

I was not able to reproduce this issue. In all the cases I tried, upgrades worked fine as long as the engine version was managed in aws_rds_cluster rather than in aws_rds_cluster_instance. I'm going to close this issue since I can't reproduce it. If anyone has an example config and steps to reproduce it, please post them and I'll get the issue reopened.

@marinsalinas

@bill-rich quick question: does that mean we can omit engine_version for rds_cluster_instance and manage this parameter only at the rds_cluster level?

Also, what about using a global cluster? Every engine_version change requires recreation when using a global cluster: https://github.com/hashicorp/terraform-provider-aws/blob/main/aws/resource_aws_rds_global_cluster.go#L61

Can we manage this with a similar approach?

@bill-rich
Contributor

Hi @marinsalinas! That is correct: you only need to include engine_version in the rds_cluster config. rds_global_cluster still requires a full destroy and create for engine_version updates, but it looks like the API does support updating. The global_cluster resource will need more changes and testing to get into a state where this can be supported. I will be looking into this over the next couple of days. I'll track the progress in #18214.
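
For reference, a minimal sketch of the global-cluster layout under discussion (identifiers are hypothetical); at the time of writing, changing engine_version on aws_rds_global_cluster still forces replacement:

resource "aws_rds_global_cluster" "example" {
  global_cluster_identifier = "example-global"
  engine                    = "aurora-mysql"
  # Changing this currently forces destroy/create of the global cluster.
  engine_version            = "5.7.mysql_aurora.2.07.1"
}

resource "aws_rds_cluster" "primary" {
  cluster_identifier        = "example-primary"
  global_cluster_identifier = aws_rds_global_cluster.example.id
  engine                    = aws_rds_global_cluster.example.engine
  engine_version            = aws_rds_global_cluster.example.engine_version
  master_username           = "admin"
  master_password           = "dummy-password" # placeholder
  skip_final_snapshot       = true
}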

@ghost

ghost commented Apr 10, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Apr 10, 2021