Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws-cdk-lib/aws-stepfunctions: incorrect resource in ECS ecs:RunTask for State Machines #30751

Closed
Assignees
Labels
@aws-cdk/aws-ecs Related to Amazon Elastic Container bug This issue is a bug. effort/small Small work item – less than a day of effort p2

Comments

@nicor88
Copy link

nicor88 commented Jul 4, 2024

Describe the bug

The policy generated in case of ECS tasks trigger in a state machine are of this type:

{
    "Action": "ecs:RunTask",
	"Resource": [
		"arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task:*",
		"arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task"
	],
	"Effect": "Allow"
},

the Resource "arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task" is not a valid one, the policy validator fail in the UI (even if I'm able to deploy) and there is an AWS notification about my state machine role.

Expected Behavior

The resource used for ecs:RunTask is simply:
"arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task:*",

Current Behavior

the resources for ecs:RunTask are:

  • "arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task:*",
  • "arn:aws:ecs:eu-central-1:xxxx:task-definition/my-task"

Reproduction Steps

Create a state machine invoking an ecs task

Possible Solution

Simply remove the not necessary resource from then policy attach to the IAM role used by the statemachine.
Creating a role, and passing to sfn.StateMachine doesn't help, because an inline policy with the wrong inline policy is attach to the custom role.

Additional Information/Context

No response

CDK CLI Version

2.147.3

Framework Version

No response

Node.js Version

v18.0.0

OS

MacOs

Language

TypeScript

Language Version

No response

Other information

No response

@nicor88 nicor88 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 4, 2024
@github-actions github-actions bot added the @aws-cdk/aws-ecs Related to Amazon Elastic Container label Jul 4, 2024
@ashishdhingra ashishdhingra self-assigned this Jul 5, 2024
@ashishdhingra ashishdhingra added needs-reproduction This issue needs reproduction. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 5, 2024
@ashishdhingra
Copy link
Contributor

ashishdhingra commented Jul 5, 2024

Reproducible using below CDK code:

import * as cdk from 'aws-cdk-lib';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';

export class StepStack extends cdk.Stack {
  constructor(app: cdk.App, id: string, props: cdk.StackProps) {
    super(app, id, props);

    const vpc = ec2.Vpc.fromLookup(this, 'Vpc', {
      isDefault: true,
    });

    const cluster = new ecs.Cluster(this, 'Ec2Cluster', { vpc });

    const taskDefinition = new ecs.FargateTaskDefinition(this, 'TD', {
      cpu: 256,
      memoryLimitMiB: 512,
    });

    taskDefinition.addContainer('TheContainer', {
      image: ecs.ContainerImage.fromRegistry('public.ecr.aws/docker/library/busybox:unstable-uclibc'),
      memoryLimitMiB: 256,
      command: ['sh', '-c', 'ping google.com -c 2'],
      logging: new ecs.AwsLogDriver({
        streamPrefix: 'demo',
      }),
    });

    const runTask = new tasks.EcsRunTask(this, 'Run', {
      integrationPattern: sfn.IntegrationPattern.RUN_JOB,
      cluster,
      taskDefinition,
      assignPublicIp: true,
      launchTarget: new tasks.EcsFargateLaunchTarget(),
    });

    const startState = new sfn.Pass(this, 'StartState');
    const definition = startState
      .next(runTask);

    new sfn.StateMachine(this, 'StateMachine', {
      definition,
      timeout: cdk.Duration.minutes(5),
    });
  }
}

It generates State Machine role with default policy as shown below (clearly flagging Invalid ARN Resource error):
Screenshot 2024-07-05 at 2 30 33 PM

Most likely an issue here.

@ashishdhingra ashishdhingra added p2 effort/small Small work item – less than a day of effort and removed needs-reproduction This issue needs reproduction. labels Jul 5, 2024
@ashishdhingra ashishdhingra removed their assignment Jul 5, 2024
@msambol
Copy link
Contributor

msambol commented Jul 7, 2024

I believe I fixed this in #30389, but we decided to leave the original permission and add a FF. @GavinZZ, shall I remove the FF and default it to only be ${taskDefinitionFamilyArn}:*?

@nicor88
Copy link
Author

nicor88 commented Jul 8, 2024

@ashishdhingra thanks for the snippet.

@msambol default to ${taskDefinitionFamilyArn}:*should do the job. But pretty much the recap of this issue, is that ${taskDefinitionFamilyArn} resource should go away from the policy to avoid:

  • validation issues in the UI
  • AWS to raise some notification errors in the AWS notification center.

@GavinZZ
Copy link
Contributor

GavinZZ commented Jul 23, 2024

Sorry for the delayed response. That sounds like the right thing to do. Before approving the PR, I want to quickly double check the following:

  1. This incorrect policy seems to be deployable but once deployed the policy validator would show an error?
  2. Does the incorrect policy stop the state machine from execution or what's the error you see?

@GavinZZ GavinZZ self-assigned this Jul 23, 2024
@nicor88
Copy link
Author

nicor88 commented Jul 23, 2024

  1. Correct. The policy is deployable but the validator highlights the error in the aws iam ui.
  2. There are no execution issues. The container can be started properly because the policy contains ${taskDefinitionFamilyArn}:*

@GavinZZ
Copy link
Contributor

GavinZZ commented Jul 23, 2024

Thanks for the clarification. I approved the PR by @msambol.

@mergify mergify bot closed this as completed in #30788 Jul 23, 2024
@mergify mergify bot closed this as completed in 82b163d Jul 23, 2024
Copy link

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

1 similar comment
Copy link

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

obraafo pushed a commit to obraafo/aws-cdk that referenced this issue Jul 25, 2024
### Issue # (if applicable)

Closes aws#30751.

### Reason for this change

`runTask` on `${taskDefinitionFamilyArn}` is no longer relevant (see validation errors in the linked issue.
This was currently disabled with a FF. This PR removes the permission entirely, and removes the FF.

### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@aws-cdk-automation
Copy link
Collaborator

Comments on closed issues and PRs are hard for our team to see. If you need help, please open a new issue that references this one.

@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.