Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(unit-naming): examples and improvements for naming checks #257

Merged

Conversation

dhutchison
Copy link
Collaborator

@dhutchison dhutchison commented Aug 5, 2023

This PR adds another directory under examples/unit for showing the different ways that resource names can be checked as part of unit tests. This same approach can be used for other properties, but naming in the angle from the top level README that brought my interest in this library :)

This covers exact value checks and adds functionality for comparing against regex patterns, including at the Stack level for all resources of a type for ensuring compliance against naming conventions.

The stack level piece is used by the examples/unit/naming/test_naming_advanced.py, the object structure that gets passed to assert_resource_type_property_value_conventions is something that I'm not entirely sure about. In tests I've been writing against this we have a common test util class that assumes positions in a tuple for a bunch of this information instead of named fields, but this might be more maintainable.

I'm marking this as ready for review just now, but expecting it'll need a bit of revision based on the above point.

BTW it looks like you have added me to the DSTY organisation, but I can't push feature branches to this repository still (hence why the PR is from my fork). That might be why I thought I couldn't edit labels on PRs.

@dhutchison dhutchison marked this pull request as ready for review August 5, 2023 22:09
@shadycuz
Copy link
Member

shadycuz commented Aug 6, 2023

BTW it looks like you have added me to the DSTY organisation, but I can't push feature branches to this repository still (hence why the PR is from my fork). That might be why I thought I couldn't edit labels on PRs.

I will look into this

Copy link
Member

@shadycuz shadycuz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting feature and I want to get it a more though review in the next day or two.


This library does not aim to keep a record internally of what types support custom names or where that information may be stored.

When writing this type of test I will commonly set the region that Cloud Radar is using to one that does not exist in reality, or certainly is not one that we use - this helps catch where a region may be hard coded in a template leading to incorrect naming when the stack is deployed to a different region. The region used when the stack is rendered can be set like this:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set the region that Cloud Radar is using to one that does not exist in reality

Bug or feature ¯_(ツ)_/¯

@shadycuz shadycuz added enhancement New feature or request minor A new feature has been added. labels Aug 6, 2023
@shadycuz
Copy link
Member

shadycuz commented Aug 6, 2023

Well the GitHub org access controls are somewhat of a mess, but you have the correct permissions now, at least for this repo.

shadycuz
shadycuz previously approved these changes Aug 12, 2023
Copy link
Member

@shadycuz shadycuz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of this feature. The ability to enforce a name scheme like ${CompanyName}-${EnvironmentName}-${ResourceName} but to be honest, I'm not super in love with the current implementation. It seems to only work for individual templates and wouldn't work very well when trying to enforce naming across many teams and templates.

I do think merging this MR is better than not merging it. At least we will have some kinda mechanism for managing resource names. I will create an issue and we can discuss some ideas for improvements. Maybe a v2 of this feature?

Feel free to merge when you are ready. Once this PR is merged I will prepare a new release.

Comment on lines 79 to 100
type_patterns = {
"AWS::EFS::FileSystem": {
"Tag": "Name",
# This TagProperty is optional. The default is 'Tags',
# but some resources use a different property name for
# their tags.
"TagProperty": "FileSystemTags",
"Pattern": r"^[a-z0-9-]*-vol$",
},
"AWS::S3::Bucket": {
"Property": "BucketName",
"Pattern": r"^[a-z0-9-]*-xx-west-3[a-z0-9-]*$",
},
"AWS::S3::BucketPolicy": {
# The BucketPolicy type does not support custom names. If we do not
# want to set fail_on_missing_type=False when we call the assertion
# below then we need to include this type in the dict, and set it
# not to be checked.
# This approach ensures that types do not slip through unintentionally.
"Check": False
},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this will be common that users only want to check the names of certain resources? What if they want to check all resources that allow you to set a name or just all resources where a name was set.

Copy link
Collaborator Author

@dhutchison dhutchison Aug 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly might be the case where flexibility has been added in where it isn't required. At my company we are very much in the camp that anything that can have a name will have a name and a defined convention (we use CloudFormation Guard for ensuring the properties / tags are set - but it can't do this parameter substitution to do the convention enforcement).

I can see the case where at least checking certain resources (like S3 buckets) contain the region name they exist in, but the resource level checks could be used for them without the stack level validation.

From the general point of your review, it's this stack level validation which has the pain points - not the resource level changes? If so I think I'd rather split this up instead of putting something in them making breaking changes to how it is configured later on.

E AssertionError: Resource 'rFileSystem' tag 'Name' value 'my-test-xx-west-3-vol' did not match expected pattern '^[a-z0-9-]*-FAIL$'.
```

The default behaviour is for `fail_on_missing_type` to be True, so if a resource is encountered with a type not defined in `type_patterns`, an assertion error like the following will be raised.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you have to define every type of resource in your template into your test as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, mainly to ensure that a named resource does not slip through without a convention check due it being missing from the type_patterns configuration.

Comment on lines 70 to 80
type_patterns = {
"AWS::EFS::FileSystem": {
"Tag": "Name",
"TagProperty": "FileSystemTags",
"Pattern": r"^[a-z0-9-]*-FAIL$",
},
"AWS::S3::Bucket": {
"Property": "BucketName",
"Pattern": r"^[a-z0-9-]*-xx-west-3[a-z0-9-]*$",
},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it will not scale very well for a large organization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our current configuration (which isn't this exact format) has 50+ entries in it.

I was trying to avoid making this library need to be aware of where a name for a resource type was held, or if it was having some ability to override it (for where this didn't know where a name was stored). It doesn't help that the AWS documentation for the name type doesn't include any resources where the name comes from a Tag. Although, just handling the name property side of the puzzle is better than needing the end user to work this out themselves. Then the configuration could be simplified to:

    type_patterns = {
        "AWS::S3::Bucket": r"^[a-z0-9-]*-xx-west-3[a-z0-9-]*$",
    }

Do you have any thoughts on an alternative configuration format (/ sharing approach)?

The approach to sharing this configuration between tests & repositories is the bit I've not really worked out yet - we've got a fair bit of copy/paste between repositories just now but that is not going to be maintainable for long. In a repository that may have a half dozen templates in it we typically have a file common_test_functions.py which has a method get_naming_dict(<some params>) then each template specific test calls this common method. The initial thinking had been that this test functions file would end up in an internal library that we would share as a dependency between our repositories, but equally could be a configuration file that is shared with submodules or some other approach.

@shadycuz
Copy link
Member

@dhutchison Between November and February, I have the most free time to work on my side projects. So I should be able to put more attention on this.

We have two different ideas. Mine is around enforcing a common prefix to resource names. Something like ${CompanyName}-${EnvironmentName}-${ResourceName}. Where ResourceName could be anything.

Your idea is more around checking that ResourceName is a specific value/format. Like your example the region name be included if the resource is a bucket.

Another thing that is different in my approach is that I want to enforce a single schema on every resource, but your approach is for enforcing different rules for different resource types?

I can always build my approach in the future. Since its more about company-wide name prefixes and less about resource names.

While reviewing your examples above, I realize that your exact idea could be slightly altered to be extremely powerful.

Imagine a "hooks" system where you define a single test function that you want to run for N resource of X type. Example Code:

from cloud_radar.cf.unit import Template

def my_s3_hook(resource_data, stack_info) -> None:

    name: str = resource_data['BucketName']

    assert stack_info.Region in name

    assert resource_data["BucketEncryption"] is True


Template.hooks = {
    "AWS::S3::Bucket": [my_s3_hook]
}

In the example above, the my_s3_hook function would be called on every AWS::S3::Bucket resource that is encountered when we render the template. This would allow you to check a resource name and other things about the resource and it would remove the somewhat complicated type_patterns object.

These hooks could also be configured Globaly for every Template, or locally for every render of a template or for a specific render of a template. For example

# Global example of hooks for every 'Template' object
import my_company_cloud_radar_hooks # You wouldnt have to import but you would have to pip install it.
# similar to how flake8 and octoprint plugins work.

# Local Hooks for every 'Template' object, global hooks are already present and cant be removed
Template.hooks = {
    "AWS::S3::Bucket": [my_s3_hook]
}

template = Template.from_yaml(template_path.resolve())

# Hooks for a single template, wouldnt be applied to other templates
template.hooks = {}

Let me know what you think =)

@dhutchison
Copy link
Collaborator Author

Hi @shadycuz, I too have been swamped & not had a chance to get back to any of this. I've been meaning to for a few weeks even just to do a merge & revisit like you had suggested.

Oh I like that hooks idea, that sounds very powerful & I can see it's potential for doing a lot more common checks. I'll have a think about it & take a closer look when I get a chance (and go learn about those plugin systems - that's the kind of hint I needed).

@dhutchison
Copy link
Collaborator Author

I think in the next few weeks I'm going to have a bit more brain space to work on this idea. I'm going to move a chunk of this conversation into a Discussion so we can iterate on the design a bit. I think this PR as-is will get closed but I might be able to reuse some of the code for utility methods for retrieving tags and such.

I've had a dig through how flake8 plugins are registered and it looks pretty straight forward (still to try PoC'ing an implementation though).

The design of the hooks dict and what the functions take in is going to be a key bit. I think it should be able to cover more than just resources, so you can enforce things like all parameters have descriptions. Also there is a CloudFormation standard somewhere (I can't find the link right now) around items in the cloud formation template being prefixed with a letter for their type (so parameters are always pParameterName etc) - so they are not resource type specific. Just putting these down just now so I remember what additional constraints could apply to the design.

@dhutchison
Copy link
Collaborator Author

I should stop predicting when I'll have brain space - been wanting to get back to this but just not had a chance.

I've removed the more compliance level checks from this PR while still leaving most of the example & the additional helper functions to check a property against a regex. Hopefully this is enough to get this merged then can proceed with the hooks approach.

Copy link

codecov bot commented Nov 16, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (46c3786) 95.99% compared to head (4c51da0) 95.97%.

Files Patch % Lines
src/cloud_radar/cf/unit/_resource.py 95.45% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #257      +/-   ##
===========================================
- Coverage    95.99%   95.97%   -0.03%     
===========================================
  Files           10       10              
  Lines          674      695      +21     
  Branches       143      145       +2     
===========================================
+ Hits           647      667      +20     
  Misses          16       16              
- Partials        11       12       +1     
Flag Coverage Δ
unit 95.97% <96.15%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@shadycuz
Copy link
Member

@dhutchison I see conflicts, likely because of the other PR I just merged.

I haven't had as much free time as I was expecting either and the time I did get I spent on my other project cf2tf.

I will be spending more time on this project this week and weekend.

@dhutchison
Copy link
Collaborator Author

No problem. I too have had much less time than I'd hoped.

I can get the conflicts straightened out tonight if that helps?

@dhutchison
Copy link
Collaborator Author

dhutchison commented Dec 12, 2023

@shadycuz I've sorted out the merge conflicts and added a little bit of coverage which was missing. The GitHub action failure seems to be an issue relating to uploading code coverage - is that something you have seen before? I can't see how any code change I made could cause that failure

@shadycuz shadycuz merged commit 53b3680 into DontShaveTheYak:develop Dec 12, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request minor A new feature has been added.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants