-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(codepipeline): large cross-region pipelines exceed IAM policy size limit #16350
Conversation
0416e68
to
d12193d
Compare
Hey @rix0rrr @njlynch! This is a different approach to the "too large IAM policies" problem that we're aware of. I understand we're hesitant to explore this area by doing too much of statement deduplication (see issue #14713) because of previous bad experiences, but I hope this PR presents an "80/20" solution: solving 80% of the cases we have of too large policies (that we don't de-duplicate based on the This new Let me know what you think - I'm eager to get your opinion on this solution! |
…size limit When we generate CodePipelines, we need to add an `sts:AssumeRole` statement for each Action in the pipeline, and a `Bucket.grantReadWrite()` statement for each region the pipeline is in, to the policy statement of the pipeline's Role. For pipelines with many Actions and/or regions, this makes the policy exceed IAM limit of 10240 bytes. Extract a new class from the CodePipeline CloudFormation Actions that caches the statements added to a given Principal by the 'Action' field, and groups the statements with the same 'Actions' by adding elements to the 'Resource' field. This dramatically reduces the duplication in the statement, and increases the chances of it being smaller than the limit. Use this new class in the `Pipeline` construct. Fixes aws#16244
d12193d
to
f6b47a5
Compare
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
I've labeled this as For what it's worth, I did some archeology, as the taboo around policy optimization predates my time on the team and I wanted to know more. Working from most recent to the original issue:
So it appears the Sun to our folklore Icarus was collapsing actions, and potentially also combining non-identical (but overlapping) statements. This more constrained approach -- only combining 100% identical statements that only differ by resources -- seems much less likely to cause unintentional bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, largely just missing tests.
In general, tests for the GroupingByActionsPrincipal
seems necessary. Additionally, I'm a bit surprised to see zero updates to any integ or unit tests for the updates to the Bucket permissions. Given that we're largely de-duping two sets of actions (sts:AssumeRole
and the bucket.grantReadWrite
calls), I expect to see some test somewhere showing that behavior. I would think (just based on filename) the integ.pipeline-cfn-cross-region.expected.json
test would have seen this update. Even a unit tests verifying the permissions would be useful.
"Arn" | ||
] | ||
} | ||
"Resource": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little surprised by this change. What about the implementation is overriding the _norm
behavior of PolicyStatement.toStatementJson
?
aws-cdk/packages/@aws-cdk/aws-iam/lib/policy-statement.ts
Lines 326 to 356 in 2023004
public toStatementJson(): any { | |
return noUndef({ | |
Action: _norm(this.action, { unique: true }), | |
NotAction: _norm(this.notAction, { unique: true }), | |
Condition: _norm(this.condition), | |
Effect: _norm(this.effect), | |
Principal: _normPrincipal(this.principal), | |
NotPrincipal: _normPrincipal(this.notPrincipal), | |
Resource: _norm(this.resource, { unique: true }), | |
NotResource: _norm(this.notResource, { unique: true }), | |
Sid: _norm(this.sid), | |
}); | |
function _norm(values: any, { unique }: { unique: boolean } = { unique: false }) { | |
if (typeof(values) === 'undefined') { | |
return undefined; | |
} | |
if (cdk.Token.isUnresolved(values)) { | |
return values; | |
} | |
if (Array.isArray(values)) { | |
if (!values || values.length === 0) { | |
return undefined; | |
} | |
if (values.length === 1) { | |
return values[0]; | |
} |
My understanding is that any time we have a single resource, we remove the array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! Yeah, this is surprising behavior.
Let me dig in a little bit more to see what happens here.
* thus compressing the overall size of the Policy attached the given Principal, | ||
* and making it less likely it hits the 10240 bytes IAM limit. | ||
*/ | ||
export class GroupingByActionsPrincipal extends Construct implements IPrincipal, IGrantable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!
Where are the tests for this new class?
const key = this.keyFor(statement); | ||
const statementCache = this.statements[key]; | ||
if (!statementCache) { | ||
const grantResult = this.wrappedIdentity.addToPrincipalPolicy(statement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible that there are existing statements on the wrappedIdentity
that would be duplicated here, correct? If so, we simply wouldn't de-dupe those statements, but would any future statements that matched. If so, I think that's fair for now, but could be a future improvement.
} | ||
|
||
private keyFor(statement: PolicyStatement): string { | ||
const hashBuilder = crypto.createHash('sha256'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why even create a hash in the first place? We can just use the stringified JSON as the key
private readonly statements: { [key: string]: { statement: PolicyStatement, grantResult: AddToPrincipalPolicyResult } }; | ||
|
||
constructor(grantPrincipal: IIdentity, id: string) { | ||
super(grantPrincipal as unknown as Construct, id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cast looks icky. Why this?
Does GroupingByActionsPrincipal
need to be a Construct
in the first place?
this.statements[key] = { statement, grantResult }; | ||
return grantResult; | ||
} else { | ||
statementCache.statement.addResources(...statement.resources); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!
Since statementCache.statement
is the same object as the original PolicyStatement
given to you by the consumer, mutation will have unintended side effects:
const stmt = new PolicyStatement({ actions: ['s3:GetBanana'], resources: ['mybucket'] });
groupingByActionsPrincipal.addToPrincipalPolicy(stmt);
groupingByActionsPrincipal.addToPrincipalPolicy({ actions: ['s3:GetBanana'], resources: ['*'] });
stmt.addAction('aws:StealCredentials'); // Oopsepoops! End result is 'aws:StealCredentials' on '*' !
Only safe way around this is by making a copy of the incoming PolicyStatement
to prevent the mutable aliasing.
That may break the legitimate use cases of mutating the original policy after the fact...
So probably the only way to do this correctly is to do the merging at render time.
924c117
to
ebfd5f2
Compare
This is impacting us at NVIDIA too. Besides refactoring the pipeline role policy, will you be supporting importing an existing role or allowing us to pass an immutable policy? |
Yes, we already support that - you can simply pass your own immutable |
Hey, what is the workaround for CodePipeline (construct) How to override the role property in this case as it does not expose any role property? // Modern API |
The policies we generate sometimes have a lot of duplication between statements. This duplication can lead to the policy going over the size limit an IAM policy (either 2k, 6k or 10k bytes, depending on the resource type). This change combines multiple statements together, as long as it doesn't change the meaning of the final policy. Because doing so for all existing stacks will probably provoke minor heart attacks in operators everywhere, the new behavior is gated behind a feature flag. It can be retroactively switched on by people currently being bit by the size issues: ``` @aws-cdk/aws-iam:minimizePolicies ``` Fixes #18774, fixes #16350, fixes #18457.
It seems like #19114 might supersede this PR. It's being actively worked on. Just in case others are looking through the various issues related to this. |
The policies we generate sometimes have a lot of duplication between statements. This duplication can lead to the policy going over the size limit an IAM policy (either 2k, 6k or 10k bytes, depending on the resource type). This change combines multiple statements together, as long as it doesn't change the meaning of the final policy. Because doing so for all existing stacks will probably provoke minor heart attacks in operators everywhere, the new behavior is gated behind a feature flag. It can be retroactively switched on by people currently being bit by the size issues: ``` @aws-cdk/aws-iam:minimizePolicies ``` We will merge 2 statements if their effects are the same, and they are otherwise exactly the same apart from their `Action`, `Resource` or `Principal` declarations. We will not merge `NotXxx` statements, because doing so will change the meaning of the statement (`not A or not B ≠ not (A or B)`). There may be multiple possible merges that apply and we are not guaranteed to find the smallest merging, nor do we take effort to find all possible merges and do simplifications like `*`-subsumption. This is a starting point that should help out in the common case. Fixes #18774, fixes #16350, fixes #18457, fixes #18564, fixes #19276. ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
The policies we generate sometimes have a lot of duplication between statements. This duplication can lead to the policy going over the size limit an IAM policy (either 2k, 6k or 10k bytes, depending on the resource type). This change combines multiple statements together, as long as it doesn't change the meaning of the final policy. Because doing so for all existing stacks will probably provoke minor heart attacks in operators everywhere, the new behavior is gated behind a feature flag. It can be retroactively switched on by people currently being bit by the size issues: ``` @aws-cdk/aws-iam:minimizePolicies ``` We will merge 2 statements if their effects are the same, and they are otherwise exactly the same apart from their `Action`, `Resource` or `Principal` declarations. We will not merge `NotXxx` statements, because doing so will change the meaning of the statement (`not A or not B ≠ not (A or B)`). There may be multiple possible merges that apply and we are not guaranteed to find the smallest merging, nor do we take effort to find all possible merges and do simplifications like `*`-subsumption. This is a starting point that should help out in the common case. Fixes #18774, fixes #16350, fixes #18457, fixes #18564, fixes #19276. ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Apologies guys, it seems I confused this PR with #20189 which is still open. |
When we generate CodePipelines, we need to add an
sts:AssumeRole
statement for each Action in the pipeline,and a
Bucket.grantReadWrite()
statement for each region the pipeline is in,to the policy statement of the pipeline's Role.
For pipelines with many Actions and/or regions,
this makes the policy exceed the IAM limit of 10240 bytes.
Extract a new class from the CodePipeline CloudFormation Actions that caches the statements added to a given Principal by the 'Action' field,
and groups the statements with the same 'Actions' by adding elements to the 'Resource' field.
This dramatically reduces the duplication in the statement,
and increases the chances of it being smaller than the limit.
Use this new class in the
Pipeline
construct.Fixes #16244
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license