Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hook for uploading AWS Lambda functions #252

Merged
merged 9 commits into from
Nov 28, 2016

Conversation

danielkza
Copy link
Contributor

Add a hook to upload AWS Lambda payloads to S3, so that complex code can be easily
used in templates.

The payload ZIP is built from a base folder with a list of
include/exclude patterns (using the formic library), then uploaded to S3
to either the default stacker bucket, or a custom bucket. The uploads
contain the MD5 of the payload in their key, to allow idempotent
deployments - unchanged payloads are not uploaded multiple times.

Usage examples for the YAML config and blueprint code have been added.

@danielkza danielkza force-pushed the lambda-hooks branch 2 times, most recently from ca10a75 to 2de190a Compare October 20, 2016 09:02
Copy link
Member

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have more to review, but keep getting pulled of it by work. Going to submit this, then I'll dig in more. I'm a little nervous about allowing the hooks themselves to chose top-level keys for the hook_data dictionary (though I like the concept of the hook_data dict itself) - but I don't have a good idea how we could make that better.

Also, can you add some docs about the new hook? This is great stuff - thanks!

http://www.aviser.asia/formic/doc/index.html
"""

root = os.path.abspath(root)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to use https://github.com/remind101/stacker/blob/6a141b6d9c429d89158ad7ea274cdfcfbdc6060d/stacker/hooks/utils.py#L4 here instead, just to get the expanduser functionality as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I didn't end up using full_path because I don't want to call get_config_directory unless the path is relative, as that would force me to mock it in all the tests. I just call expanduser right when I retrieve the config key.

@@ -122,3 +128,13 @@ def get_fqn(self, name=None):

"""
return get_fqn(self._base_fqn, self.namespace_delimiter, name)

def set_hook_data(self, key, value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you add the google style args/return/raises values for each of the new methods/functions you add/change? See http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html for an example or here for a stacker specific use of it:

https://github.com/remind101/stacker/blob/6a141b6d9c429d89158ad7ea274cdfcfbdc6060d/stacker/blueprints/base.py#L135

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

The file will be stored with it's MD5 appended to the key, to allow
avoiding repeated uploads in subsequent runs with unchanged content.

Returns a troposphere.aws_lambda.Code object containing the bucket and key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aws_lambda -> awslambda

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@danielkza
Copy link
Contributor Author

@phobologic

I'm a little nervous about allowing the hooks themselves to chose top-level keys for the hook_data dictionary (though I like the concept of the hook_data dict itself) - but I don't have a good idea how we could make that better.

I initially namespaced by the last component of the path name, but it didn't seem good to me: it would mean there is no way to easily have two hooks implement the same functionality in a different manner, or to replace a default hook with a customized version. It was also quite cumbersome to do:

self.context.hook_data['upload_lambda_functions']['MyFunction']

Also, can you add some docs about the new hook? This is great stuff - thanks!

Anything particular you are looking for? Most of the documentation is the upload_lambda_functions docstrings, and I added it to the doc. index together with the other hooks.

Copy link
Contributor

@ejholmes ejholmes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't comment much on the code, but I really like the general direction of this. This easily lays down the foundation for #168. Awesome stuff.


def upload_lambda_functions(region, namespace, mappings, parameters,
context=None, **kwargs):
"""Builds Lambda payloads from user configuration and uploads then to S3.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little typo here: s/then/them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielkza

I initially namespaced by the last component of the path name, but it didn't seem good to me: it would mean there is no way to easily have two hooks implement the same functionality in a different manner, or to replace a default hook with a customized version. It was also quite cumbersome to do:

Yeah, it almost makes me think that we should have given hooks a name or namespace property that would have allowed us to namespace this stuff naturally. I suppose we could add something like that and make it optional, and not do anything with the return value of the hook unless it comes from a hook with the name. What do you think about that? @mhahn any thoughts?

Anything particular you are looking for? Most of the documentation is the upload_lambda_functions docstrings, and I added it to the doc. index together with the other hooks.

Made me realize we have basically no documentation on the hooks, other than the API documentation, and your docstring may be the most complete & helpful we have so far :) Don't worry about it for now, I think this is good. I'll probably look at adding a link to the Config docs section about hooks to the API docs for all the existing hooks, if I can work that out.

Thanks again, especially for all the docstrings, that was a lot of work :)

"""Builds a Lambda payload from user configuration and uploads it to S3.

Args:
s3_conn (.S3.Client): S3 connection to use for operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be boto.S3.Client?

"""Create an S3 bucket if it does not already exist.

Args:
s3_conn (.S3.Client): S3 connection to use for operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here - botocore.S3.Client? not sure what the actual path should look like

Copy link
Contributor Author

@danielkza danielkza Oct 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually experimenting with cross-referencing to the boto docs, but they seem to be quite broken - they don't prefix any of the classes with the correct package in the indexes, and don't even mention many classes.

I think the correct type is actually botocore.client.S3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now, sorry for the delay.

@phobologic
Copy link
Member

phobologic commented Oct 30, 2016

@mhahn @danielkza Any comment on my previous comment?

Yeah, it almost makes me think that we should have given hooks a name or namespace 
property that would have allowed us to namespace this stuff naturally. I suppose we could 
add something like that and make it optional, and not do anything with the return value of 
the hook unless it comes from a hook with the name. What do you think about that?

I'm even considering just breaking the old API with this, since we're moving into 1.0. It seems like the right time to make a change like this.

@danielkza
Copy link
Contributor Author

Adding a key to the hook configuration could make sense. Maybe call it data_key or something of the sort? That makes it clear what the purpose is, since a name would seem redundant when the path is already specified. Then it can be used to namespace the hook data.

I personally would prefer that the API didn't break, as am I selfishly and lazily delaying migrating my stacks to the new 1.0 parameter. It also doesn't seem strictly necessary, as ignoring the return values of hooks that are not properly tagged (or even just storing True/False) should not break anything.

Having the hook data be a flat dictionary has some advantages though: there's no need to merge data that comes from different hooks that happen to have the same key (as the hook itself can decide whether to override something or not), and it's easy to have a hook replace/complement another one.

It might be wise to try to think of a second use case other than the Lambda one, to figure out which solution works best overall.

@danielkza
Copy link
Contributor Author

I rebased to the latest master and fixed some typos. I'm not really sure what's the best way to proceed about the hook API dilemma, but please feel free to make/suggest any changes from my first attempt.

@phobologic
Copy link
Member

Ok, I've thought about it some more, and I think we should go with the idea I had before:

Yeah, it almost makes me think that we should have given hooks a name or namespace property that would have allowed us to namespace this stuff naturally. I suppose we could add something like that and make it optional, and not do anything with the return value of the hook unless it comes from a hook with the name. What do you think about that? @mhahn any thoughts?

I'm fine with using data_key in place of name or namespace, but I think that we should ignore any returned data on a hook that doesn't have an identifier. We can still use the idea that returning anything that evaluates to True means a positive run, and anything that returns a value that is False (or raises an exception) is broken.

Sorry for the delay again, @danielkza, but could you take a stab at this? If not, I can go ahead and write a separate PR that creates the basics of the new system.

One thing we should consider is what to do if two hooks have the same data_key - my guess is it would just overwrite at this point. Or we could throw an error - assuming that it's a mistake more often than not.

@danielkza
Copy link
Contributor Author

@phobologic I think it would be better if you implemented the hook data system first as you find best, then I can fix my code to adapt to it afterwards.

@phobologic
Copy link
Member

@danielkza sounds good, I'll work on that now. Thanks for your patience!

phobologic added a commit that referenced this pull request Nov 26, 2016
This is a spike into the ideas discussed in:

#252
@phobologic phobologic mentioned this pull request Nov 26, 2016
@phobologic
Copy link
Member

@danielkza @mhahn check out #270 - one design decision I made was to disallow duplicate data_keys.

@phobologic
Copy link
Member

Thanks @danielkza - I've gone ahead and merged #270.

Building on the recently added hook_data in Context, add a hook to
upload AWS Lambda payloads to S3, so that complex code can be easily
used in templates.

The payload ZIP is built from a base folder with a list of
include/exclude patterns (using the formic library), then uploaded to S3
to either the default stacker bucket, or a custom bucket. The uploads
contain the MD5 of the payload in their key, to allow idempotent
deployments - unchanged payloads are not uploaded multiple times.

Usage examples for the YAML config and blueprint code have been added.
@danielkza
Copy link
Contributor Author

@phobologic Rebased to the latest master with the update to match the final API.

@phobologic
Copy link
Member

This looks good @danielkza - thanks! Can you do one more pass and take a look at all the logger calls? It looks like they're all at info, which may be a bit noisy. I'm going to leave it up to your judgement though if that's the right thing to do or not. Thanks again!

@danielkza
Copy link
Contributor Author

danielkza commented Nov 27, 2016

@phobologic I changed printing the function name and list of files to the debug level, see if it helps.

@phobologic
Copy link
Member

Thanks @danielkza !

@phobologic phobologic merged commit 3e46d51 into cloudtools:master Nov 28, 2016
phrohdoh pushed a commit to phrohdoh/stacker that referenced this pull request Dec 18, 2018
This is a spike into the ideas discussed in:

cloudtools#252
phrohdoh pushed a commit to phrohdoh/stacker that referenced this pull request Dec 18, 2018
Add hook for uploading AWS Lambda functions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants