Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ecs): potential race condition on TaskRole default policy update with CfnService #24999

Closed
wants to merge 5 commits into from

Conversation

lpizzinidev
Copy link
Contributor

Prevents potential race conditions on TaskRole default policy update in EC2 and Fargate services by adding a dependency on the TaskRole.
This will update the TaskRole and its children first and the service after.

Closes #24880.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added bug This issue is a bug. effort/small Small work item – less than a day of effort p1 labels Apr 8, 2023
@aws-cdk-automation aws-cdk-automation requested a review from a team April 8, 2023 07:27
@github-actions github-actions bot added the valued-contributor [Pilot] contributed between 6-12 PRs to the CDK label Apr 8, 2023
@@ -561,6 +561,8 @@ export abstract class BaseService extends Resource
...additionalProps,
});

this.node.addDependency(this.taskDefinition.taskRole);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the linked issue it looks like we should be adding a dependency on the Policy not the role. There should already be an implicit dependency on the role and in cases where the policy is updated the role itself will probably not be updated so will not be in the dependency chain.

Copy link
Contributor Author

@lpizzinidev lpizzinidev Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback.

The default policy is a private property in the role.
Should I create a public getter to retrieve it and add a dependency with:

this.node.addDependency(this.taskDefinition.taskRole.getDefaultPolicy());

?

Copy link

@iancaffey iancaffey Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the L2 Lambda Function construct does a similar thing (adding the dependency on the Role).

Seems pretty harmless to add the dependency on the task role to catch all its children too (the Role will get an implicit dependency for itself anyways from being in the task definition passed to the service).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think dependencies work that way. If I declare an explicit dependency on resource A, CloudFormation does not view that as adding an explicit dependency on every resource that resource A depends on. It only depends on resource A, so if resource A is not updated then it doesn't have to wait.

But we can always run tests to confirm :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be able to accomplish this globally though if we can assume that anytime someone adds a dependency on a role that they also want to add a dependency on the default policy (I think that is a safe assumption)

I think by adding something like this to Role

Dependable.implement(this, {
  dependencyRoots: [this, this.defaultPolicy],
});

Whenever you do addDependency(role) it will also add a dependsOn for the policy.

Copy link

@iancaffey iancaffey Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think dependencies work that way. If I declare an explicit dependency on resource A, CloudFormation does not view that as adding an explicit dependency on every resource that resource A depends on. It only depends on resource A, so if resource A is not updated then it doesn't have to wait.

For sure, but I'm only talking about child constructs of the Role. The default policy is scoped to the role, so addDependency(role) takes care of it.

FWIW, this fix worked for us across all our services. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah you are right. @lpizzinidev can you add a test that asserts the dependency is added on the policy as well?

@corymhall corymhall self-assigned this Apr 11, 2023
@aws-cdk-automation
Copy link
Collaborator

This PR cannot be merged because it has conflicts. Please resolve them. The PR will be considered stale and closed if it remains in an unmergeable state.

1 similar comment
@aws-cdk-automation
Copy link
Collaborator

This PR cannot be merged because it has conflicts. Please resolve them. The PR will be considered stale and closed if it remains in an unmergeable state.

@gitpod-io
Copy link

gitpod-io bot commented Apr 27, 2023

@mergify mergify bot dismissed corymhall’s stale review April 27, 2023 06:32

Pull request has been modified.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 6a8e4c7
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository


// THEN
Template.fromStack(stack).hasResource('AWS::ECS::Service', {
DependsOn: ['FargateTaskDefTaskRole0B257552'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DependsOn should also contain the policy resource.

@aws-cdk-automation
Copy link
Collaborator

This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@aws-cdk-automation
Copy link
Collaborator

This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error.

@aws-cdk-automation aws-cdk-automation added the closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. label Jun 21, 2023
@iancaffey
Copy link

@lpizzinidev @corymhall any chance we can get this PR across the finish line? :)

@lpizzinidev
Copy link
Contributor Author

@iancaffey @corymhall
I was looking into this, the problem is that for FargateService the default policy is not added to the dependencies array.
Could it be that the construct is not creating a default policy?
I see that defaultPolicy is initialized in the addToPrincipalPolicy but I can't find a call to the method in FargateService.
I tried adding the dependency manually in the Role as suggested here without luck (also, this step should be unnecessary as reported here).

@iancaffey
Copy link

You'll see a lot of taskDefinition.addToTaskRolePolicy(...). Or callers might directly reference taskDefinition.taskRole when adding grants.

But yeah, the default props don't end up triggering adding any permissions that would end up initializing the policy.

Could you manually add task role permissions in the unit test to trigger initializing the policy?

@lpizzinidev
Copy link
Contributor Author

@iancaffey
Thanks for clarifying.
I've opened another PR with the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. effort/small Small work item – less than a day of effort p1 valued-contributor [Pilot] contributed between 6-12 PRs to the CDK
Projects
None yet
4 participants