Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple permissions and roles created during deployment #285

Closed
pradeepd12 opened this issue Feb 2, 2018 · 20 comments
Closed

Multiple permissions and roles created during deployment #285

pradeepd12 opened this issue Feb 2, 2018 · 20 comments

Comments

@pradeepd12
Copy link

I deployed my infrastructure using SAM. After deployment, this is the list/count of resources:

 47 AWS::Lambda::Permission
 25 AWS::Lambda::Function
 24 AWS::IAM::Role

The deployment model (yaml) policies are currently set to 'AdministratorAccess' for each lambda function.

Why are all these permissions being created and how can I reduce the number? Has anyone experienced this? While browsing the cloudformation stack, I noticed that the logical ID of these permissions has a naming convention; ProxyApiRootPermissionStage and ProxyApiRootPermissionTest (roughly 2 permissions per function)

PS: Instead of the 'policies' property (in the YAML), I added the 'role' (predefined role ARN) to each lambda function. That brought down the number of roles to 1.

@quarryman
Copy link

You can either set Role or Permissions, but not both, as I understood from my own issue #254 (comment)

@jfuss
Copy link
Contributor

jfuss commented Feb 5, 2018

@pradeepd12 What is the concern or issue here? Do you just want to understand why we are creating so many or are there deeper concerns?

If you do not define a role for your function, SAM will create one per function for you. While this seems extraneous, it actually scopes each function individually. So now you can mutate the roles for each function without affected the collection. I do not recommend created one role for every function, due to blast radius concerns and to keep each function least privileged.

For each integration to the Lambda Function, we create a Lambda Permission. While this is probably over the top and we can possible reduce this, we are following this model until we have a use-case to merge them into one.

Can you explain why you want these reduced?

@pradeepd12
Copy link
Author

pradeepd12 commented Feb 5, 2018

I am trying to add versioning as part of the model definition. These are 3 lines added to the 'globals' section in the template.yaml (not formatted correctly here):

AutoPublishAlias: PROD
DeploymentPreference:
Type: AllAtOnce

When I add these 3 lines to the yaml and trigger deployment, I run into the following error (I dont see this error without those lines. If this is not the correct way to version/alias lambdas kindly let me know):

Failed to create the changeset: Waiter ChangeSetCreateComplete failed: Waiter encountered a terminal failure state Status: FAILED. Reason: Template format error: Number of resources, 229, is greater than maximum allowed, 200
Failed deploying CloudFormation package

This is real error i am trying to fix. I need to have versioning capability.

I have about 25 lambda functions in the yaml sharing an API gateway. Here is the sample definition for one of the lambdas (the rest look very similar too)

LambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: hello.js
      Policies: AdministratorAccess
      Runtime: nodejs6.10
      Timeout: 30
      FunctionName: LambdaFunction
      Events:
        ProxyApiRoot:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGatewayApi
            Path: /hello
            Method: post

So I am trying to understand why are such a large number of resources being created for 25 functions and an api gateway. The biggest number i see are Permissions (about 2 permissions per function)

@sudeepd
Copy link

sudeepd commented Feb 18, 2018

Anything on this ?
@jfuss : You mentioned creating 1 permission for each integration. What we are seeing is actually 2 permissions per function. With about 20+ lambdas in our application, I really dont want to go nested CF or terraform.
Can you confirm that you do one permission for each integration , and not two ?

@jabalsad
Copy link

jabalsad commented Feb 19, 2018

+1

From what I can tell, I get 2 AWS::Lambda::Permission resources for every Lambda function. One suffixed with the stage name, and one suffixed with Test-XXXXXXXX. Is there any reason two permission resources need to be created?

@jjmschofield
Copy link

+1

@jfuss
Copy link
Contributor

jfuss commented Feb 27, 2018

So did some digging, sorry it took me a while to get back to you guys (@jabalsad and @sudeepd).

You guys are correct. We do create 2 AWS::Lambda::Permission for API Gateway related resources (Implicit APIs, through the Functions' Event structure and explicit APIs, AWS::Serverless:API). The reason for this is to all the Test Button to work on the console. If we omit the PermissionTest, customers will not be able to use the Test button on the Console (to my understanding). For all other cases, we only create one permission.

This brought up a discussion within the team. This could be one of the places we reduce the number of permissions (having something that would exclude these permissions but would mean the test button on the console would not work). What do you guys think about that? It does not solve the root of the problem though. Limits will always be reached. I think nested stack support (#90) will also help here but I think we need a better solution for big templates. I do not know what that exactly is at the moment but would love to here your guys thoughts.

Side note: @jjmschofield, Please do not just leave comments with +1. It distracts the conversation. If this impacts you as well, please use the reaction feature on the posts. It will help us and the community figure out what issues are important without scrolling through the whole issue. Thanks.

@sudeepd
Copy link

sudeepd commented Feb 28, 2018

@jfuss The test button is toy/learning aid for all practical purposes. A real application coming out of a CI/CD pipeline most likely doesn't need it. If I do need a mechanism to trigger a lambda directly, I can roll my own very easily without the baggage of a test button eating up a resource for every single lambda.

  • At the very least, being able to disable the test permission declaratively in the template would be useful, and should be easy to implement. If I have explicit control of that then its my decision whether to use it or not.
  • Nested stack support would need more thought, but would help greatly. At a minimum, I would love to see the api gateway separated out from the logical grouping of functions. For example, if I have CRUD API, and I want 4 lambdas for it, then having a function group and APIG in the master , and the actual lambdas in the child would be very useful.

We are having to club multiple functions together in one lambda, and then do a routing from inside the lambda to work around this. So , now I need to manage my api definitions in the cloudformation, swagger AND code. Makes the solution brittle.

@jabalsad
Copy link

jabalsad commented Mar 1, 2018

@jfuss my 2 cents is that we don't use the Test button at all. The button is great for users new and unfamiliar with API Gateway. In our environment there's a lot of tooling around testing invocations so it serves no real purpose. We also tend to avoid using the console where we can, since pointing and clicking generally doesn't scale well for minimizing operational overhead.

I tend to agree that opting in or opting out of the Test button permission seems preferable. In our case, we will almost always opt-out, so I'd prefer it to be opt-out by default. Maybe new users will not agree :-)

Nested stacks are unfortunately not a good solution to this problem for us since we often bind resources such as S3 buckets to the Lambda functions (as an event source), and SAM doesn't support binding S3 and Lambda when they don't exist in the same template. We generally end-up having a mini-monolith of a serverless template (maybe not entirely applicable to API gateway, but worth mentioning nonetheless).

Another reason we're also creating a lot of permissions is due to this bug, so in our case we currently have 3 permissions for every lambda function!

@jfuss
Copy link
Contributor

jfuss commented Mar 5, 2018

@sudeepd I would agree that most production applications do not need the to use the Test Button on the console but seems to be used fairly regularly (I have seen a fair number of issues here about the Test Button here).

@jabalsad Sadly, I don't think we are in a position to opt out by default. Since we are already adding these permissions to every template, making the permissions opt-out by default would mean we are changing existing templates (which will cause an unexpected deployment and removal of a resource), which we avoid. We could probably fix this by a version bump and break this contract but for now, I think it has to be a customer driven opt-out. Using the Global section for this property will help in only needing to worry about it once for all functions or override in ones you would need it for.

I will see if I have some time in the coming days to take a peak at that issue. I will respond with findings there.

Also, sorry for the slow response. I was out of the office for a couple days last week.

@sudeepd
Copy link

sudeepd commented Mar 5, 2018

@jfuss : Thanks for the response. I am more than happy for test button users to continue clicking away. At the same time, my primary focus is getting my job done, and get in done in reasonable time. Even if I can opt out with a SAM parameter for the lambda, like DisableTestPermission : True , I am happy to take it. It makes it harder to justify SAM to my engineering team if they cannot get a production application to take off. It would be great if we can get something going for this. I am reasonably sure this is not too hard to implement.

@jfuss
Copy link
Contributor

jfuss commented Mar 7, 2018

@sudeepd Completely understandable.

Let's go a littler deeper into how we should be going about this. If we can land on a schema design here, I would be more than happy to get it prioritized as soon as I can.

I see more or less two options. If you can think of others that would be useful please let me know.

Option 1:
Add a flag key to each relevant resource. This would need to be added to both the Event sources for API and the AWS::Serverless::Api type. There may be other resources or events that need something similar as well. So this option would mean we would need to duplicate all the work across all of them (to whatever they would be).

LambdaFunction:
  Type: AWS::Serverless::Function
  Properties:
    ...
    Event:
      Type: API
      Properties:
        ...
        DisableTestPermissions: True

Option 2:
We can add some kind of 'Global' Flagging system. The idea behind this would to reduce the duplication across all Event Types and Resources and have a one stop shop for 'system' flags (something like compile flags might be a good way to think about this). This one isn't really fleshed out to me but I like the option more. Something like the below I think could be really powerful for not only this issue but possibly others as well.

Globals:
  SAMFlags:
    IsProduction: True
...
Resources:
  ...

I think it makes sense to put something like this into the globals section, but would love to here thoughts.

@sudeepd
Copy link

sudeepd commented Mar 7, 2018

@jfuss : I would lean towards a combination of option 1 and option 2, where option 1 wins contentions.
Global settings are good for the happy-path-no-errors thing, and so option 2 is preferred there. However, if there are instances ( for troubleshooting etc) where someone needs to selectively enable the button for just 1 lambda in a group of 10, then option 1 would be great.

So, have both if possible, but prioritize option 1 over option 2. Having option 2 without option 1 would be a problem, then you have a all or none situation. Just having option 1 leads to some duplication, but its lesser of the two evils

Just my $0.02, and sincerely appreciate the active involvement. Hope to see a resolution soon

@ashwgupt
Copy link

ashwgupt commented Apr 1, 2018

Looks like quite an interesting discussion and something many people would look upon given it will help people at least delay from CF resources limitation.

We have been facing the same resource limit issue where we tried a few solutions to split our SAM file while still using single API Gateway as I explained in #349 . One of those was to define API Gateway event permissions explicitly in APi Gateway's SAM file and not letting SAM do it under the hood. But there we faced the issue of loosing the capability of Test button on console, which in my view is a great tool at many times during development or even in Production to debug issues when gateway is not able to reach Function at all (like, lack of IPs etc).

Sorry for taking the discussion little off-track but can you @jfuss please advise what extra permission/role should we be giving to our API Gateway (given we define our roles ourselves and not by SAM) such that Test functionality work? Below is the only one we have given so far.

APIFunctionInvokeRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - "apigateway.amazonaws.com"
            Action:
              - "sts:AssumeRole"
  APIFunctionInvokeRolePolicies:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: APIFunctionInvokeRolePolicies
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Action:
            - lambda:InvokeFunction
            Resource: "*"
      Roles:
        - !Ref APIFunctionInvokeRole

@marafee1243
Copy link

@jfuss Those options doenst work. Is there any change in version of template?

AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::Serverless-2016-10-31

Below is the error I got

Failed to create the changeset: Waiter ChangeSetCreateComplete failed: Waiter encountered a terminal failure state Status: FAILED. Reason: Transform AWS::Serverless-2016-10-31 failed with: Invalid Serverless Application Specification document. Number of errors found: 1. Resource with id [AbcEvent] is invalid. property DisableTestPermissions not defined for resource of type Api

@keetonian
Copy link
Contributor

@marafee1243 that was a proposed feature, and not something officially supported in SAM yet.

@ghost
Copy link

ghost commented Aug 9, 2019

Has there been any progress on the "DisableTestPermissions: True" idea?

With so much about the hard 200 resource limit, this would go a long way to helping...

@ashishusoni
Copy link

Is there a fix for this ..two un necessary permission for one function doesn't help

@jlhood
Copy link
Contributor

jlhood commented Aug 27, 2019

Hey all, we have a proposed SAM syntax for disabling creation of the extra test permission to address this issue. Please comment on #1102 and give us your feedback!

@ShreyaGangishetty
Copy link

ShreyaGangishetty commented Sep 6, 2019

Closing this issue as we opened another issue #1102 for fixing two permissions being created and also as this issue is similar to #337

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests