Support CDK Custom Resources #109

Closed
Tracked by #111
corymhall opened this issue Apr 26, 2024 · 0 comments · Fixed by #190
Labels
kind/enhancement Improvements or new features resolution/fixed This issue was fixed size/L Estimated effort to complete (up to 10 days).

@corymhall
Contributor

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

CDK makes heavy use of CloudFormation custom resources to fill gaps in CloudFormation. Each custom resource is a CloudFormation resource backed by a user-defined (and user-managed) AWS Lambda function.

Since these Lambda functions contain custom code written by the CDK team and contributors, we need a way to reuse this code rather than rewrite it ourselves. My initial idea is to replace the custom resources with a new custom resource component that handles invoking the Lambda function and parsing its output.

Affected area/feature

Similar to #60 which is for a specific custom resource. This issue is for tracking general support for custom resources.

@corymhall corymhall added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team size/L Estimated effort to complete (up to 10 days). and removed needs-triage Needs attention from the triage team labels Apr 26, 2024
@lukehoban lukehoban mentioned this issue May 18, 2024
6 tasks
@mjeffryes mjeffryes assigned corymhall and flostadler and unassigned corymhall Oct 1, 2024
@mjeffryes mjeffryes added this to the 0.111 milestone Oct 2, 2024
@t0yv0 t0yv0 assigned t0yv0 and flostadler and unassigned flostadler and t0yv0 Oct 28, 2024
@mjeffryes mjeffryes modified the milestones: 0.111, 0.112 Oct 30, 2024
flostadler added a commit to pulumi/pulumi-aws-native that referenced this issue Nov 5, 2024
…1795)

Previously it was assumed that the custom resources do not concern
themselves with secretness or handle timeouts.
But this is necessary to support CloudFormation based Custom Resources.
Those can contain secret outputs and support configuring timeouts.

This change modifies the `CustomResource` interface to accommodate those
requirements.
In detail, this means that the Create/Update/Read lifecycle methods now
return a `PropertyMap` instead of a generic `map[string]interface{}`
and take a timeout parameter.
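
To make the shape of that change concrete, here is a minimal sketch, written in Python rather than the provider's Go. The names `PropertyValue`, `CustomResource` and `EchoResource` are hypothetical and do not mirror the actual definitions in `provider/pkg/resources/custom.go`. The only points taken from the description above are that lifecycle methods return a property map that can carry secretness and that they accept a timeout.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class PropertyValue:
    """A value plus a flag saying whether it must be treated as a secret."""
    value: Any
    secret: bool = False


# Hypothetical stand-in for a property map: keys map to secret-aware values.
PropertyMap = Dict[str, PropertyValue]


class CustomResource(Protocol):
    """Lifecycle methods return a PropertyMap and accept a timeout."""

    def create(self, inputs: PropertyMap, timeout: timedelta) -> PropertyMap: ...

    def read(self, state: PropertyMap, timeout: timedelta) -> PropertyMap: ...

    def update(self, inputs: PropertyMap, timeout: timedelta) -> PropertyMap: ...


class EchoResource:
    """Toy implementation: echoes inputs and marks the 'token' output secret."""

    def create(self, inputs: PropertyMap, timeout: timedelta) -> PropertyMap:
        if timeout <= timedelta(0):
            raise TimeoutError("timeout must be positive")
        return {
            key: PropertyValue(val.value, secret=(val.secret or key == "token"))
            for key, val in inputs.items()
        }

    def read(self, state: PropertyMap, timeout: timedelta) -> PropertyMap:
        return dict(state)

    def update(self, inputs: PropertyMap, timeout: timedelta) -> PropertyMap:
        return self.create(inputs, timeout)
```

With a plain `map[string]interface{}` there is no place to record that `token` is secret; a secret-aware value type is one way to keep that information attached to the output.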

To review this change, I recommend first looking at the
changes to the `CustomResource` interface in
`provider/pkg/resources/custom.go` and then double-checking the refactoring
changes that follow from them.

This change also introduces tests for the provider's CRUD lifecycle. As
part of doing that, I added mocks using `uber/gomock`.

Relates to pulumi/pulumi-cdk#109
t0yv0 pushed a commit to pulumi/pulumi-aws-native that referenced this issue Nov 6, 2024
…1801)

Previously the Update and Delete methods for Custom Resources
did not require access to the current state, but this is a
common requirement for updating and deleting resources.

This is also needed for supporting CloudFormation Custom Resources
as those require access to state in order to complete updates
and deletions.
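
As a rough illustration of why state is needed: in the CloudFormation custom resource protocol, `Update` requests carry the existing `PhysicalResourceId` and the `OldResourceProperties`, and `Delete` requests carry the `PhysicalResourceId`. The Python sketch below assumes a hypothetical `state` dictionary with `physical_resource_id` and `properties` keys; it omits the other request fields (such as `RequestId` and `ResponseURL`) and is not the provider's code.

```python
from typing import Any, Dict


def build_update_event(state: Dict[str, Any], new_props: Dict[str, Any]) -> Dict[str, Any]:
    """Update needs the stored physical ID and the previous properties."""
    return {
        "RequestType": "Update",
        "PhysicalResourceId": state["physical_resource_id"],
        "OldResourceProperties": state["properties"],
        "ResourceProperties": new_props,
    }


def build_delete_event(state: Dict[str, Any]) -> Dict[str, Any]:
    """Delete needs the stored physical ID to know what to remove."""
    return {
        "RequestType": "Delete",
        "PhysicalResourceId": state["physical_resource_id"],
        "ResourceProperties": state["properties"],
    }
```

Neither event can be built from the new inputs alone, which is why the lifecycle methods need access to the current state.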

relates to pulumi/pulumi-cdk#109
flostadler added a commit to pulumi/pulumi-aws-native that referenced this issue Nov 8, 2024
This change adds the necessary AWS SDK clients (S3 and Lambda) for
supporting CFN Custom Resources in aws-native.

Relates to pulumi/pulumi-cdk#109
flostadler added a commit to pulumi/pulumi-aws-native that referenced this issue Nov 8, 2024
This PR adds support for [CloudFormation Custom
Resource](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources.html)
to the aws-native provider. It implements an emulator that enables
Pulumi programs to interact with Lambda-backed CloudFormation Custom
Resources.

A CloudFormation custom resource is essentially an extension point to
run arbitrary code as part of the CloudFormation lifecycle. It is
similar in concept to the [Pulumi Command
Provider](https://www.pulumi.com/registry/packages/command/), the
difference being that CloudFormation CustomResources are executed in the
Cloud; either through Lambda or SNS.

For the first implementation we decided to limit the scope to Lambda
backed Custom Resources, because the SNS variants are not widely used.

## Custom Resource Protocol
The implementation follows the CloudFormation Custom Resource protocol.
I derived the necessary parts by combining information from the
[docs](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/crpg-ref.html),
[CDK's CustomResource
Framework](https://github.com/aws/aws-cdk/tree/main/packages/%40aws-cdk/custom-resource-handlers/lib/custom-resources-framework)
and trial and error.

Notable aspects of that protocol are:
- Primitive properties need to be string-encoded when they are sent to
Custom Resource handlers. This includes deeply nested properties:
aws-cloudformation/cloudformation-coverage-roadmap#1037
- The Lambda Function is invoked asynchronously. Lambda will retry the
execution if the function fails unexpectedly (e.g. unhandled exception).
- Due to the async invocation, the response is not returned from the
Lambda Function, instead it's sent to a `ResponseURL` that needs to be
included in the request payload.
- Similarly to CloudFormation, we decided to implement this using S3
Buckets and presigned URLs.
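
The first and third points can be sketched in Python as follows. This is an illustration under assumptions, not the emulator's code: `stringify_primitives` and `build_create_event` are hypothetical helper names, the exact string form used for numbers and booleans is assumed, and the event omits several fields of the real request (for example `StackId` and `LogicalResourceId`).

```python
import uuid
from typing import Any, Dict


def stringify_primitives(value: Any) -> Any:
    """Recursively convert primitive leaves to strings, keeping structure."""
    if isinstance(value, dict):
        return {key: stringify_primitives(val) for key, val in value.items()}
    if isinstance(value, list):
        return [stringify_primitives(item) for item in value]
    if isinstance(value, bool):
        # Checked before int because bool is a subclass of int in Python.
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return value  # strings and None pass through unchanged


def build_create_event(props: Dict[str, Any], response_url: str) -> Dict[str, Any]:
    """Build a Create request whose handler replies to `response_url`."""
    return {
        "RequestType": "Create",
        "RequestId": str(uuid.uuid4()),
        "ResponseURL": response_url,  # e.g. a presigned S3 PUT URL
        "ResourceProperties": stringify_primitives(props),
    }
```

For example, `{"Port": 8080, "Nested": {"Sizes": [1, 2]}}` becomes `{"Port": "8080", "Nested": {"Sizes": ["1", "2"]}}` before it is sent to the handler.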

### Custom Resource Lifecycle
```mermaid
sequenceDiagram
    participant A as aws-native
    participant S3 as S3 Bucket
    participant L as Lambda
    
    %% Create Flow
    Note over A,L: Create Operation
    A->>S3: Generate presigned URL
    A->>L: Invoke with CREATE event
    activate L
    loop Until response found or timeout
        A->>S3: Poll for response
        L-->>S3: Upload response
    end
    deactivate L
    A->>S3: Fetch response
    alt Success
        A->>A: Store PhysicalId & outputs
    else Failure
        A->>A: Return error
    end

    %% Update Flow
    Note over A,L: Update Operation
    A->>S3: Generate presigned URL
    A->>L: Invoke with UPDATE event
    activate L
    loop Until response found or timeout
        A->>S3: Poll for response
        L-->>S3: Upload response
    end
    deactivate L
    A->>S3: Fetch response
    alt Success
        A->>A: Check PhysicalId
        alt ID Changed
            A->>S3: Generate presigned URL for cleanup
            A->>L: Invoke with DELETE event for old resource
            activate L
            loop Until cleanup response found or timeout
                A->>S3: Poll for cleanup response
                L-->>S3: Upload cleanup response
            end
            deactivate L
            A->>S3: Fetch cleanup response
        end
    else Failure
        A->>A: Return error
    end

    %% Delete Flow
    Note over A,L: Delete Operation
    A->>S3: Generate presigned URL
    A->>L: Invoke with DELETE event
    activate L
    loop Until response found or timeout
        A->>S3: Poll for response
        L-->>S3: Upload response
    end
    deactivate L
    A->>S3: Fetch response
    alt Success
        A->>A: Return success
    else Failure
        A->>A: Return error
    end
```
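
The polling loop and the update branch of the diagram can be sketched in Python like this. It is a simplified model, not the aws-native implementation: `fetch` and `invoke` are hypothetical callables standing in for "read the response object from S3" and "presign, invoke the Lambda, and wait for its response", and the `Status`, `Reason` and `PhysicalResourceId` keys follow the CloudFormation response format.

```python
import time
from typing import Any, Callable, Dict, Optional

Response = Dict[str, Any]


def wait_for_response(fetch: Callable[[], Optional[Response]],
                      timeout_s: float, interval_s: float = 1.0) -> Response:
    """Poll `fetch` until it returns a response or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        response = fetch()
        if response is not None:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError("no custom resource response before timeout")
        time.sleep(interval_s)


def run_update(invoke: Callable[[Dict[str, Any]], Response],
               old_physical_id: str,
               old_props: Dict[str, Any],
               new_props: Dict[str, Any]) -> Response:
    """Send Update; if the physical ID changed, send Delete for the old one."""
    response = invoke({
        "RequestType": "Update",
        "PhysicalResourceId": old_physical_id,
        "OldResourceProperties": old_props,
        "ResourceProperties": new_props,
    })
    if response["Status"] != "SUCCESS":
        raise RuntimeError(response.get("Reason", "update failed"))
    if response["PhysicalResourceId"] != old_physical_id:
        # A new physical ID means a replacement, so clean up the old resource.
        cleanup = invoke({
            "RequestType": "Delete",
            "PhysicalResourceId": old_physical_id,
            "ResourceProperties": old_props,
        })
        if cleanup["Status"] != "SUCCESS":
            raise RuntimeError(cleanup.get("Reason", "cleanup failed"))
    return response
```

Because an update may trigger a second invocation for cleanup, the time budget for `Update` has to cover both round trips, which is one reason timeout handling for that lifecycle deserves extra care.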

## Reviewer Notes

Key areas to review:
1. Error handling in the response collection mechanism
2. Timeout management, especially for the `Update` lifecycle
3. Documentation completeness and accuracy

Exposing and schematizing this resource is part of PR
#1807.
Automatically cleaning up the response objects is not included in this
PR in order to keep its size manageable. That work is tracked
here: #1813.

Please pay special attention to:
- S3 response collection mechanism security
- State management during updates
- Cleanup handling when physical resource IDs change

## Testing
- Unit tests including error handling tests for various failure
scenarios
- Integration tests with actual Lambda functions are added in this
stacked PR: #1807

## Related Issues
- pulumi/pulumi-cdk#109
- #1812
- #1813
@mjeffryes mjeffryes modified the milestones: 0.112, 0.113 Nov 13, 2024
flostadler added a commit that referenced this issue Nov 15, 2024
This PR adds support for [CloudFormation Custom
Resource](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources.html)
to pulumi-cdk. It does so by using the
`CustomResourceEmulator` resource from
[aws-native](https://www.pulumi.com/registry/packages/aws-native/api-docs/cloudformation/customresourceemulator/).

For the first implementation we decided to limit the scope to Lambda
backed Custom Resources, because the SNS variants are not widely used.

I'd recommend reviewing in this order:
- `src/graph.ts` and `src/converters/app-converter.ts`. The changes in
these files ensure that Custom Resources get correctly parsed and other
resources can reference their attributes with the `GetAtt` intrinsic
- `src/cfn-resource-mappings.ts`: This constructs the
`CustomResourceEmulator` based on the CDK inputs while re-using the
staging bucket to store the CustomResource responses.
- unit & integration tests
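
As a simplified picture of what referencing a Custom Resource attribute involves, the Python sketch below resolves a `Fn::GetAtt` expression against the data returned by a custom resource. The function `resolve_get_att` and the `outputs` layout are hypothetical; this is not how `src/converters/app-converter.ts` is written, only an illustration of the lookup.

```python
from typing import Any, Dict


def resolve_get_att(expr: Dict[str, Any],
                    outputs: Dict[str, Dict[str, Any]]) -> Any:
    """Resolve {"Fn::GetAtt": [logicalId, attribute]} against stored outputs."""
    logical_id, attribute = expr["Fn::GetAtt"]
    try:
        return outputs[logical_id][attribute]
    except KeyError as err:
        raise KeyError(f"{logical_id}.{attribute} is not available") from err
```

For instance, with `outputs = {"MyCustomResource": {"Endpoint": "https://example.com"}}`, the expression `{"Fn::GetAtt": ["MyCustomResource", "Endpoint"]}` resolves to that endpoint string.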

**Noteworthy**:
I added a temporary workaround that shortens resource names until
pulumi/pulumi-aws-native#1816 is resolved. It
can be toggled on by setting the
`PULUMI_CDK_EXPERIMENTAL_MAX_NAME_LENGTH` env variable. Without it,
none of the Custom Resources worked because they contain deeply nested
Lambdas and IAM roles, and those resources have a maximum name length of 64 characters.
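
One common way to shorten names while keeping them distinct is to truncate and append a short hash of the full name. The Python sketch below shows that general technique only; `shorten_name` is a hypothetical helper, and the actual workaround in pulumi-cdk may use a different scheme.

```python
import hashlib


def shorten_name(name: str, max_length: int) -> str:
    """Truncate to max_length, appending a short hash to keep names distinct."""
    if len(name) <= max_length:
        return name
    suffix = hashlib.sha256(name.encode("utf-8")).hexdigest()[:8]
    if max_length <= len(suffix):
        # Not enough room for a readable prefix, so use part of the hash only.
        return suffix[:max_length]
    return name[: max_length - len(suffix) - 1] + "-" + suffix
```

Calling `shorten_name(long_name, 64)` leaves names that already fit untouched and returns a name of exactly 64 characters for anything longer.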

Closes #109
Closes #60
@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label Nov 15, 2024