certificatemanager: deletion of stack with Cognito custom domain fails on CertificateRequestorResource #28063

Open
metametadata opened this issue Nov 18, 2023 · 1 comment
Labels
@aws-cdk/aws-certificatemanager (Related to Amazon Certificate Manager), bug (This issue is a bug.), effort/medium (Medium work item – several days of effort), p3

Comments

metametadata commented Nov 18, 2023

Describe the bug

Deleting a stack that contains a Cognito user pool with a custom domain fails while deleting the CertificateRequestorResource custom resource.

Expected Behavior

Deletion succeeds.

Current Behavior

Deletion fails with the message:

DELETE_FAILED | AWS::CloudFormation::CustomResource | xxxxxxxxxx/CertificateRequestorResource/Default (yyyyyyyyCertificateRequestorResourceF53AA380) Received response status [FAILED] from custom resource. Message returned: Response from describeCertificate did not contain an empty InUseBy list after 10 attempts.

Reproduction Steps

Deploy a stack that has a Cognito user pool with a custom domain, then delete it.

Setting up such a stack requires a certificate for the custom domain. I create it with DnsValidatedCertificate. My code (Clojure, with custom helper functions):

user-pool (-> (UserPool$Builder/create stack "user-pool")
              ...
              (.userPoolName user-pool-name)
              .build)

; Cognito requires the parent domain to have a valid DNS A record.
; The parent may be the root of the domain, or a child domain that is one step up in the domain hierarchy.
; For example, if your custom domain is auth.xyz.example.com,
; Cognito must be able to resolve xyz.example.com to an IP address.
;
; The record points "nowhere", https://stackoverflow.com/questions/51249583.
apex (dns/domain user-pool-name "foo.com")
_ (cdk.route53/add-a-record stack apex (RecordTarget/fromIpAddresses (into-array ["127.0.0.1"])))

user-pool-domain (dns/domain "auth" apex)
hosted-zone (cdk.route53/memoized-fetch-hosted-zone stack user-pool-domain)
cert (-> (DnsValidatedCertificate$Builder/create stack (str user-pool-domain "-cert"))
         (.domainName user-pool-domain)
         (.hostedZone hosted-zone)
         (.region (str Region/US_EAST_1)) ; This region is required by Cognito
         .build)
domain (.addDomain user-pool "domain" (-> (UserPoolDomainOptions/builder)
                                          (.customDomain (-> (CustomDomainOptions/builder)
                                                             (.certificate cert)
                                                             (.domainName user-pool-domain)
                                                             .build))
                                          .build))
_ (cdk.route53/add-a-record stack user-pool-domain (RecordTarget/fromAlias (UserPoolDomainTarget. domain)))

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.100.0 (build e1b5c77)

Framework Version

2.100.0

Node.js Version

18.17.1

OS

macOS

Language

Java

Language Version

Java (17)

Other information

Cause

The certificate appears to be still in use by a "phantom" CloudFront distribution that belongs to an unknown account, 455458493081. I can't find this distribution anywhere in the GUI.
The association is visible in the ACM GUI, or by running aws acm describe-certificate --certificate-arn ... --region us-east-1 and looking at the InUseBy key.

After a few minutes this dependency is cleaned up automatically, and a repeated attempt to delete the stack then succeeds.

I suspect this is the distribution that serves Cognito's hosted UI website.
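
As a rough illustration of the InUseBy check, the following JavaScript sketch takes the parsed JSON printed by aws acm describe-certificate and extracts the service and account ID from each ARN. The function name summarizeInUseBy and the distribution ID in the sample are made up for illustration; only the account ID comes from this report.

```javascript
// Sketch: summarize who still holds an ACM certificate, given the parsed JSON
// output of `aws acm describe-certificate`.
// ARN layout: arn:partition:service:region:account-id:resource
function summarizeInUseBy(describeOutput) {
  const certificate = describeOutput.Certificate || {};
  const inUseBy = certificate.InUseBy || [];
  return inUseBy.map((arn) => {
    const parts = arn.split(':');
    return { arn, service: parts[2], accountId: parts[4] };
  });
}

// Hypothetical sample: the distribution ID is invented for illustration.
const sample = {
  Certificate: {
    InUseBy: ['arn:aws:cloudfront::455458493081:distribution/EXAMPLE123'],
  },
};

console.log(summarizeInUseBy(sample));
```

An empty result means the certificate is no longer referenced, so deleting it should no longer be blocked by the InUseBy check.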

I found only one mention of a similar Cognito problem, at https://stackoverflow.com/questions/75134728/phantom-cloudfront-distribution-blocks-me-from-creating-cognito-custom-domain. The answer there states:

The CloudFront distribution that AWS creates for the custom Cognito domain will be removed in a few hours after you delete the user pool (or delete the custom domain via the Cognito console / API). This seems to be completely hidden from the user (you).

But there are several reports of a similar issue with certificates for API Gateway, e.g.:

Workaround attempt

I tried retaining the certificate on deletion via (.applyRemovalPolicy cert RemovalPolicy/RETAIN_ON_UPDATE_OR_DELETE). This allows the stack deletion to succeed. But when I immediately deployed the same stack again, it failed with:

user-pool/domain (userpooldomainB4026A3C) One or more of the CNAMEs you provided are already associated with a different resource. (Service: AmazonCloudFront; Status Code: 409; Error Code: CNAMEAlreadyExists; Request ID: 6e3993dc-5cb7-4d0d-a267-ed58ca49dee3; Proxy: null) (Service: AWSCognitoIdentityProviderService; Status Code: 400; Error Code: InvalidParameterException; Request ID: 562d0982-bc92-4923-93d1-55732114571f; Proxy: null)

Strangely, deploying one more time succeeded. In any case, this doesn't seem to be a reliable workaround, and over time it will pollute ACM with unused certificates.

Solution ideas

  1. The ideal solution is a fix somewhere in CloudFront or Cognito, so that deleting the pool immediately clears the corresponding certificate's InUseBy array.
  2. A solution in CDK could be to increase the number of attempts in the deleteCertificate function of aws-certificatemanager/dns-validated-certificate-handler:
    const deleteCertificate = async function (arn, region, hostedZoneId, route53Endpoint, cleanupRecords) {
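
To make idea 2 concrete, here is a minimal JavaScript sketch of a wait loop with a configurable attempt count and capped exponential backoff. It is not the handler's actual code: waitUntilUnused, backoffDelayMs, and the injected describeInUseBy function are hypothetical names, and the delay values are assumptions.

```javascript
// Sketch only, not the aws-cdk handler implementation.
// `describeInUseBy` is a hypothetical injected async function that resolves to
// the certificate's current InUseBy array (for example, a thin wrapper around
// the ACM DescribeCertificate call).

// Delay before the next retry after 0-based `attempt`: exponential, capped.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function waitUntilUnused(describeInUseBy, options = {}) {
  const maxAttempts = options.maxAttempts ?? 10;
  const sleep =
    options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const inUseBy = await describeInUseBy();
    if (!inUseBy || inUseBy.length === 0) {
      return attempt + 1; // number of describe calls that were needed
    }
    // Do not sleep after the final attempt; fail right away instead.
    if (attempt < maxAttempts - 1) {
      await sleep(backoffDelayMs(attempt));
    }
  }
  throw new Error(`Certificate still in use after ${maxAttempts} attempts`);
}
```

Passing a larger maxAttempts only helps if the certificate is released while the handler is still running. If the handler runs as a single Lambda invocation, the 15-minute Lambda timeout limit bounds the total wait, so a release that takes hours, as the Stack Overflow answer describes, would still fail.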
metametadata added the bug and needs-triage labels Nov 18, 2023
github-actions bot added the @aws-cdk/aws-certificatemanager label Nov 18, 2023
pahud (Contributor) commented Nov 21, 2023

Thank you for the detailed report and suggested solution ideas.

pahud added the p2 and effort/medium labels and removed the needs-triage label Nov 21, 2023
pahud added the p3 label and removed the p2 label Jun 11, 2024