
Flaky bug: error reading S3 bucket server-side encryption configuration #24232

Closed
Veetaha opened this issue Apr 14, 2022 · 1 comment · Fixed by #24266
Labels
bug: Addresses a defect in current functionality.
service/s3: Issues and PRs that pertain to the s3 service.

Comments

Veetaha (Contributor) commented Apr 14, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

  • terraform CLI: v1.1.7
  • AWS provider: v4.8.0

Affected Resource(s)

  • aws_s3_bucket_server_side_encryption_configuration

Terraform Configuration Files

Note that the various S3 configuration resources all have explicit depends_on clauses to work around several bugs similar to #7628, where a deployment may randomly fail due to concurrent updates of the S3 bucket configuration.

Spoiler: much code
resource "aws_s3_bucket" "bucket" {
  bucket        = "bucket-name"
  force_destroy = true

  tags = {
    "elastio:resource" = true
    "elastio:vault"    = "bar"
  }
}

resource "aws_s3_bucket_acl" "vault" {
  bucket = aws_s3_bucket.bucket.id
  acl    = "private"
  depends_on = [aws_s3_bucket_versioning.vault]
}

resource "aws_s3_bucket_versioning" "vault" {
  bucket = aws_s3_bucket.bucket.id
  versioning_configuration {
    status = "Enabled"
  }

  depends_on = [aws_s3_bucket_server_side_encryption_configuration.vault]
}

resource "aws_s3_bucket_server_side_encryption_configuration" "vault" {
  bucket = aws_s3_bucket.bucket.bucket

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
    bucket_key_enabled = true
  }

  depends_on = [aws_s3_bucket_lifecycle_configuration.vault]
}

resource "aws_s3_bucket_lifecycle_configuration" "vault" {
  bucket = aws_s3_bucket.bucket.id

  dynamic "rule" {
    // { We have a complex bucket lifecycle policy, I don't think it is relevant, though I can provide you with more code if needed }
    for_each = local.fixed_rules
    content {
      id = rule.value["id"]
      filter {
        prefix = rule.value["prefix"]
      }

      status = "Enabled"

      noncurrent_version_expiration {
        noncurrent_days = 1
      }
    }
  }

  dynamic "rule" {
    for_each = local.fixed_meta_rules

    content {
      status = "Enabled"
      id     = rule.value["id"]
      filter {
        prefix = rule.value["prefix"]
      }

      noncurrent_version_expiration {
        noncurrent_days = 365000000
      }
    }
  }

  # Variable data rules
  dynamic "rule" {
    for_each = local.ae_rules

    content {
      status = "Enabled"
      id     = rule.value["id"]
      filter {
        prefix = rule.value["prefix"]
      }

      noncurrent_version_expiration {
        noncurrent_days = 1
      }
    }
  }

  dynamic "rule" {
    for_each = local.ae_meta_rules

    content {
      status = "Enabled"
      id     = rule.value["id"]
      filter {
        prefix = rule.value["prefix"]
      }

      noncurrent_version_expiration {
        noncurrent_days = 365000000
      }
    }
  }

  depends_on = [aws_s3_bucket_policy.bucket_policy]
}

data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "bucket" {
  statement {
    sid       = "Secure access only"
    effect    = "Deny"
    actions   = ["s3:*"]
    resources = [aws_s3_bucket.bucket.arn]

    principals {
      type        = "AWS"
      identifiers = [data.aws_caller_identity.current.arn]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = [false]
    }
  }
}

resource "aws_s3_bucket_policy" "bucket_policy" {
  bucket = aws_s3_bucket.bucket.id

  # Prohibit any access over a non-TLS channel
  policy = data.aws_iam_policy_document.bucket.json

  depends_on = [aws_s3_bucket_public_access_block.public]
}

resource "aws_s3_bucket_public_access_block" "public" {
  bucket = aws_s3_bucket.bucket.id
  block_public_acls   = true
  block_public_policy = true
}

Debug Output

Unfortunately, there isn't one available. The bug reproduces only sporadically on our CI, where verbose debug logging is not enabled because it produces too much output.

Expected Behavior

Deploying a bucket with server-side encryption should succeed without errors.

Actual Behavior

We run many Terraform deployments on our CI every day and regularly see flakiness like this.
In this case, deploying the bucket with the SSE configuration resource described above fails on CI with the following error:

  ╷
  │ Error: error reading S3 bucket server-side encryption configuration (elastio-exp-1649911532-psnnkc-vault-bff3f6-u6r9c00nq): ServerSideEncryptionConfigurationNotFoundError: The server side encryption configuration was not found
  │ 	status code: 404, request id: PBYFA81CEDF1PQJG, host id: JwvlvR5ld1WsnoO64BLhlD250ekj1tGDFPSIvldsBfqNT+Iaub8mJpuM0ArmIu135iGO44YzdXs=
  │ 
  │   with module.vault.module.delete_protected.aws_s3_bucket_server_side_encryption_configuration.vault,
  │   on ../../../modules/vault/inner/delete_protected/main.tf line 46, in resource "aws_s3_bucket_server_side_encryption_configuration" "vault":
  │   46: resource "aws_s3_bucket_server_side_encryption_configuration" "vault" {
  │ 
  ╵

It reproduces only intermittently, which is why this is reported as a flaky bug.
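
The error text suggests an S3 eventual-consistency window: the provider reads the encryption configuration back and S3 reports it as not found even though it was just written. As a purely client-side experiment (not the actual fix, and only an assumption on my part that a fixed delay helps at all, since the failing read appears to happen immediately after the provider creates the configuration), one could pad the window with a time_sleep from the hashicorp/time provider:

resource "time_sleep" "wait_after_bucket" {
  # Requires the hashicorp/time provider; the 30s value is an arbitrary guess,
  # not a verified threshold.
  create_duration = "30s"

  depends_on = [aws_s3_bucket.bucket]
}

# Then add time_sleep.wait_after_bucket to the depends_on list of
# aws_s3_bucket_server_side_encryption_configuration.vault, keeping the
# existing aws_s3_bucket_lifecycle_configuration.vault entry:
#
#   depends_on = [
#     time_sleep.wait_after_bucket,
#     aws_s3_bucket_lifecycle_configuration.vault,
#   ]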

Steps to Reproduce

Do a fresh deployment from scratch (non-incremental); a stripped-down configuration that might be enough to reproduce the error is sketched after the steps below.

  1. terraform plan (though I'm not sure this step influences anything)
  2. terraform apply
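
For stress-testing the reproduction loop, a stripped-down configuration containing only the two resources named in the error might be enough; whether the race reproduces without the rest of the stack is an assumption, since we have only observed it in the full deployment. A minimal sketch:

resource "aws_s3_bucket" "repro" {
  # Hypothetical bucket name; must be globally unique.
  bucket        = "flaky-sse-repro-bucket-example"
  force_destroy = true
}

resource "aws_s3_bucket_server_side_encryption_configuration" "repro" {
  bucket = aws_s3_bucket.repro.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
    bucket_key_enabled = true
  }
}

Repeatedly running terraform apply followed by terraform destroy against this should exercise the same create-then-read path that fails above.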
github-actions bot added the needs-triage, service/iam, service/s3, and service/sts labels on Apr 14, 2022
@ewbankkit removed the service/iam and service/sts labels on Apr 14, 2022
@justinretzolk added the bug label and removed the needs-triage label on Apr 14, 2022
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on May 16, 2022