Merge pull request #6 from brennerm/implement-prometheus-exporter
implement Prometheus exporter
brennerm authored Mar 9, 2021
2 parents 76f9f8d + fe23543 commit f6a34d2
Showing 14 changed files with 472 additions and 33 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [Unreleased]
+
+### Added
+
+- implement Prometheus exporter that provides all quota results
+
+### Changed
+
+- display AWS account ID instead of profile name in check scope
+
 ## [1.1.0] - 2021-02-27

 ### Added
60 changes: 59 additions & 1 deletion README.md
@@ -6,7 +6,9 @@ A tool that helps keeping track of your AWS quota utilization. It'll determine t

 This is especially useful because today, cloud resources are being created from all kinds of sources, e.g. IaC and Kubernetes operators. This tool will give you a head start for requesting quota increases before you hit a quota limit to prevent being stuck with a production system not being able to scale anymore.

-A usual use case is to add it to your CI pipeline right after applying your IaC or run it on a regular basis. Feel free to leave a vote on [this issue](https://github.com/brennerm/aws-quota-checker/issues/1) if you'd like to see a Prometheus exporter.
+A usual use case is to add it to your CI pipeline right after applying your IaC or run it on a regular basis. It also comes with a Prometheus exporter mode that allows you to visualize the data with your tool of choice, e.g. Grafana.
+
+![Example Grafana dashboard that uses metrics of the Prometheus exporter](https://raw.githubusercontent.com/brennerm/aws-quota-checker/master/img/example-grafana-dashboard.png)

 ## Installation

@@ -63,6 +65,62 @@ $ aws-quota-checker check-instance vpc_acls_per_vpc vpc-0123456789
 Network ACLs per VPC [default/eu-central-1/vpc-0123456789]: 0/200
 ```
+### Prometheus exporter
+The Prometheus exporter requires additional dependencies that you need to install with `pip install aws-quota-checker[prometheus]`.
+```bash
+$ aws-quota-checker prometheus-exporter all
+AWS profile: default | AWS region: us-east-1 | Active checks: am_mesh_count,asg_count,cf_stack_count,cw_alarm_count,dyndb_table_count,ebs_snapshot_count,ec2_eip_count,ec2_on_demand_f_count,ec2_on_demand_g_count,ec2_on_demand_inf_count,ec2_on_demand_p_count,ec2_on_demand_standard_count,ec2_on_demand_x_count,ec2_spot_f_count,ec2_spot_g_count,ec2_spot_inf_count,ec2_spot_p_count,ec2_spot_standard_count,ec2_spot_x_count,ec2_tgw_count,ec2_vpn_connection_count,ecs_count,eks_count,elasticbeanstalk_application_count,elasticbeanstalk_environment_count,elb_alb_count,elb_clb_count,elb_listeners_per_alb,elb_listeners_per_clb,elb_listeners_per_nlb,elb_nlb_count,elb_target_group_count,iam_attached_policy_per_group,iam_attached_policy_per_role,iam_attached_policy_per_user,iam_group_count,iam_policy_count,iam_policy_version_count,iam_server_certificate_count,iam_user_count,ig_count,lc_count,ni_count,route53_health_check_count,route53_hosted_zone_count,route53_records_per_hosted_zone,route53_reusable_delegation_set_count,route53_traffic_policy_count,route53_traffic_policy_instance_count,route53_vpcs_per_hosted_zone,route53resolver_endpoint_count,route53resolver_rule_association_count,route53resolver_rule_count,s3_bucket_count,secretsmanager_secrets_count,sg_count,sns_pending_subscriptions_count,sns_subscriptions_per_topic,sns_topics_count,vpc_acls_per_vpc,vpc_count,vpc_subnets_per_vpc
+09-Mar-21 20:15:11 [INFO] botocore.credentials - Found credentials in shared credentials file: ~/.aws/credentials
+09-Mar-21 20:15:11 [INFO] aws_quota.prometheus - starting /metrics endpoint on port 8080
+09-Mar-21 20:15:11 [INFO] aws_quota.prometheus - collecting checks
+09-Mar-21 20:15:19 [INFO] aws_quota.prometheus - collected 110 checks
+09-Mar-21 20:15:19 [INFO] aws_quota.prometheus - refreshing limits
+09-Mar-21 20:16:34 [INFO] aws_quota.prometheus - limits refreshed
+09-Mar-21 20:16:34 [INFO] aws_quota.prometheus - refreshing current values
+09-Mar-21 20:18:15 [INFO] aws_quota.prometheus - current values refreshed
+```
+The exporter will return the following metrics:
+- awsquota_$checkkey: the current value of each quota check
+- awsquota_$checkkey_limit: the limit value of each quota check
+- awsquota_check_count: the number of quota checks that are being executed
+- awsquota_check_limits_duration_seconds: the number of seconds it took to query all quota limits
+- awsquota_check_currents_duration_seconds: the number of seconds it took to query all current quota values
+- awsquota_info: info gauge that exposes the current AWS account and region as labels
+Depending on the check type, labels for the AWS account, the AWS region and the instance ID will be attached to the metric.
+Below you can find a few example metrics:
+```
+# HELP awsquota_info AWS quota checker info
+# TYPE awsquota_info gauge
+awsquota_info{account="123456789",region="us-east-1"} 1.0
+# HELP awsquota_check_count Number of AWS Quota Checks
+# TYPE awsquota_check_count gauge
+awsquota_check_count 110.0
+# HELP awsquota_collect_checks_duration_seconds Time to collect all quota checks
+# TYPE awsquota_collect_checks_duration_seconds gauge
+awsquota_collect_checks_duration_seconds{account="123456789",region="us-east-1"} 7.885610818862915
+# HELP awsquota_asg_count_limit Auto Scaling groups per region Limit
+# TYPE awsquota_asg_count_limit gauge
+awsquota_asg_count_limit{account="123456789",region="us-east-1"} 200.0
+# HELP awsquota_ec2_on_demand_standard_count Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) EC2 instances
+# TYPE awsquota_ec2_on_demand_standard_count gauge
+awsquota_ec2_on_demand_standard_count{account="123456789"} 22.0
+# HELP awsquota_elb_listeners_per_clb Listeners per Classic Load Balancer
+# TYPE awsquota_elb_listeners_per_clb gauge
+awsquota_elb_listeners_per_clb{account="123456789",instance="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",region="us-east-1"} 10.0
+awsquota_elb_listeners_per_clb{account="123456789",instance="bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",region="us-east-1"} 2.0
+```
+Querying all quotas may take some time, depending on the number of resources to check, so the exporter works asynchronously. Requesting the /metrics endpoint returns cached results and does not trigger a recheck of all quotas; instead, all checks are executed and refreshed in the background. That is why no metrics are available directly after starting the exporter.
+Hence it doesn't make much sense to scrape the /metrics endpoint every few seconds, because the values only refresh once in a while. The check intervals of the background jobs can be adjusted to your needs using command line arguments.
 ## Missing a quota check?
 Feel free to create a new issue with the _New Check_ label including a description of the quota check you are missing.
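The README's description of the exporter (cached results on /metrics, refreshed by background jobs) can be illustrated with a small standard-library sketch. This is not the exporter's actual implementation: `CachedMetrics`, `refresh_once` and `snapshot` are hypothetical names chosen for illustration, and the real code in `aws_quota.prometheus` may be structured quite differently.

```python
import threading
from typing import Callable, Dict


class CachedMetrics:
    """Serve the most recent results while a background thread refreshes them."""

    def __init__(self, collect: Callable[[], Dict[str, float]], interval_seconds: float) -> None:
        self._collect = collect              # potentially slow, e.g. many AWS API calls
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._values: Dict[str, float] = {}  # stays empty until the first refresh completes
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def refresh_once(self) -> None:
        fresh = self._collect()              # do the slow work outside the lock
        with self._lock:
            self._values = fresh

    def _run(self) -> None:
        while not self._stop.is_set():
            self.refresh_once()
            self._stop.wait(self._interval)  # sleep, but wake up early on stop()

    def snapshot(self) -> Dict[str, float]:
        """What a /metrics handler would return: cached values, no recheck."""
        with self._lock:
            return dict(self._values)
```

Under these assumptions, a scrape that arrives before the first refresh sees an empty result, which matches the README's note that no metrics are available directly after startup. Scraping more often than the refresh interval only re-reads the same snapshot.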
24 changes: 17 additions & 7 deletions aws_quota/check/elb.py
@@ -1,3 +1,4 @@
+from aws_quota.exceptions import InstanceWithIdentifierNotFound
 import typing
 import boto3
 from .quota_check import QuotaCheck, InstanceQuotaCheck, QuotaScope
@@ -36,9 +37,12 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def current(self):
-        return len(self.boto_session.client('elb').describe_load_balancers(
-            LoadBalancerNames=[self.instance_id]
-        )['LoadBalancerDescriptions'][0]['ListenerDescriptions'])
+        try:
+            return len(self.boto_session.client('elb').describe_load_balancers(
+                LoadBalancerNames=[self.instance_id]
+            )['LoadBalancerDescriptions'][0]['ListenerDescriptions'])
+        except self.boto_session.client('elb').exceptions.AccessPointNotFoundException as e:
+            raise InstanceWithIdentifierNotFound(self) from e


 class NetworkLoadBalancerCountCheck(QuotaCheck):
@@ -67,8 +71,11 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def current(self):
-        return len(self.boto_session.client('elbv2').describe_listeners(
-            LoadBalancerArn=self.instance_id)['Listeners'])
+        try:
+            return len(self.boto_session.client('elbv2').describe_listeners(
+                LoadBalancerArn=self.instance_id)['Listeners'])
+        except self.boto_session.client('elbv2').exceptions.LoadBalancerNotFoundException as e:
+            raise InstanceWithIdentifierNotFound(self) from e


 class ApplicationLoadBalancerCountCheck(QuotaCheck):
@@ -97,8 +104,11 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def current(self) -> int:
-        return len(self.boto_session.client('elbv2').describe_listeners(
-            LoadBalancerArn=self.instance_id)['Listeners'])
+        try:
+            return len(self.boto_session.client('elbv2').describe_listeners(
+                LoadBalancerArn=self.instance_id)['Listeners'])
+        except self.boto_session.client('elbv2').exceptions.LoadBalancerNotFoundException as e:
+            raise InstanceWithIdentifierNotFound(self) from e


 class TargetGroupCountCheck(QuotaCheck):
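The pattern these hunks introduce is translating a service-specific "not found" error into one project-level exception while keeping the original error attached via `raise ... from e`. The following standalone sketch shows that pattern without AWS access. The `InstanceWithIdentifierNotFound` class here is only a stand-in for the one in `aws_quota.exceptions` (its real constructor and message are not shown in this diff), and `FakeNotFound` and `FakeListenerCheck` are hypothetical.

```python
class InstanceWithIdentifierNotFound(Exception):
    """Stand-in for aws_quota.exceptions.InstanceWithIdentifierNotFound (assumed shape)."""

    def __init__(self, check) -> None:
        super().__init__(f"no instance found for {check.key} with identifier {check.instance_id}")
        self.check = check


class FakeNotFound(Exception):
    """Hypothetical service error, playing the role of LoadBalancerNotFoundException."""


class FakeListenerCheck:
    key = "elb_listeners_per_alb"

    def __init__(self, instance_id, listeners_by_arn):
        self.instance_id = instance_id
        self._listeners_by_arn = listeners_by_arn

    def _describe_listeners(self):
        # Simulates the remote call: unknown identifiers raise a service-specific error.
        if self.instance_id not in self._listeners_by_arn:
            raise FakeNotFound(self.instance_id)
        return self._listeners_by_arn[self.instance_id]

    @property
    def current(self) -> int:
        try:
            return len(self._describe_listeners())
        except FakeNotFound as e:
            # "from e" keeps the original error reachable via __cause__.
            raise InstanceWithIdentifierNotFound(self) from e
```

Callers can then handle a single exception type for every check, and the underlying service error remains available on `__cause__` for debugging.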
16 changes: 13 additions & 3 deletions aws_quota/check/iam.py
@@ -1,3 +1,4 @@
+from aws_quota.exceptions import InstanceWithIdentifierNotFound
 import typing

 import boto3
@@ -89,7 +90,10 @@ def maximum(self):

     @property
     def current(self):
-        return len(self.boto_session.client('iam').list_user_policies(UserName=self.instance_id)['PolicyNames'])
+        try:
+            return len(self.boto_session.client('iam').list_user_policies(UserName=self.instance_id)['PolicyNames'])
+        except self.boto_session.client('iam').exceptions.NoSuchEntityException as e:
+            raise InstanceWithIdentifierNotFound(self) from e

 class AttachedPolicyPerGroupCheck(InstanceQuotaCheck):
     key = "iam_attached_policy_per_group"
@@ -106,7 +110,10 @@ def maximum(self):

     @property
     def current(self):
-        return len(self.boto_session.client('iam').list_group_policies(GroupName=self.instance_id)['PolicyNames'])
+        try:
+            return len(self.boto_session.client('iam').list_group_policies(GroupName=self.instance_id)['PolicyNames'])
+        except self.boto_session.client('iam').exceptions.NoSuchEntityException as e:
+            raise InstanceWithIdentifierNotFound(self) from e

 class AttachedPolicyPerRoleCheck(InstanceQuotaCheck):
     key = "iam_attached_policy_per_role"
@@ -123,4 +130,7 @@ def maximum(self):

     @property
     def current(self):
-        return len(self.boto_session.client('iam').list_role_policies(RoleName=self.instance_id)['PolicyNames'])
+        try:
+            return len(self.boto_session.client('iam').list_role_policies(RoleName=self.instance_id)['PolicyNames'])
+        except self.boto_session.client('iam').exceptions.NoSuchEntityException as e:
+            raise InstanceWithIdentifierNotFound(self) from e
17 changes: 17 additions & 0 deletions aws_quota/check/quota_check.py
@@ -1,3 +1,4 @@
+from aws_quota.utils import get_account_id
 import enum
 import typing

@@ -23,6 +24,22 @@ def __init__(self, boto_session: boto3.Session) -> None:
         self.boto_session = boto_session
         self.sq_client = boto_session.client('service-quotas')

+    def __str__(self) -> str:
+        return f'{self.key}{self.label_values}'
+
+    @property
+    def label_values(self):
+        if self.scope == QuotaScope.ACCOUNT:
+            return {'account': get_account_id(self.boto_session)}
+        elif self.scope == QuotaScope.REGION:
+            return {'account': get_account_id(self.boto_session), 'region': self.boto_session.region_name}
+        elif self.scope == QuotaScope.INSTANCE:
+            return {
+                'account': get_account_id(self.boto_session),
+                'region': self.boto_session.region_name,
+                'instance': self.instance_id
+            }
+
     @property
     def maximum(self) -> int:
         try:
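To show how the scope-dependent labels from `label_values` line up with the example metrics in the README, here is a self-contained sketch. It re-declares a stand-in `QuotaScope` enum (the member values are assumed, not taken from the project), and the helpers `label_values`, `metric_names` and `render_sample` are hypothetical illustrations rather than code from this commit.

```python
import enum
from typing import Dict, Optional, Tuple


class QuotaScope(enum.Enum):
    # Stand-in for the project's QuotaScope; the numeric values are assumptions.
    ACCOUNT = 1
    REGION = 2
    INSTANCE = 3


def label_values(scope: QuotaScope, account: str, region: str,
                 instance_id: Optional[str] = None) -> Dict[str, str]:
    """Account-scoped checks get one label, region-scoped two, instance-scoped three."""
    labels = {"account": account}
    if scope in (QuotaScope.REGION, QuotaScope.INSTANCE):
        labels["region"] = region
    if scope is QuotaScope.INSTANCE:
        labels["instance"] = str(instance_id)
    return labels


def metric_names(check_key: str) -> Tuple[str, str]:
    """Names for the current value and the limit, following the README's scheme."""
    return f"awsquota_{check_key}", f"awsquota_{check_key}_limit"


def render_sample(name: str, labels: Dict[str, str], value: float) -> str:
    """Format one sample line; label values are not escaped in this sketch."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"


current_name, limit_name = metric_names("asg_count")
print(render_sample(limit_name, label_values(QuotaScope.REGION, "123456789", "us-east-1"), 200.0))
# prints: awsquota_asg_count_limit{account="123456789",region="us-east-1"} 200.0
```

Sorting the label names alphabetically reproduces the label order seen in the README's example output.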
21 changes: 17 additions & 4 deletions aws_quota/check/route53.py
@@ -1,3 +1,4 @@
+from aws_quota.exceptions import InstanceWithIdentifierNotFound
 import typing
 import boto3
 from .quota_check import InstanceQuotaCheck, QuotaCheck, QuotaScope
@@ -84,11 +85,17 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def maximum(self):
-        return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_RRSETS_BY_ZONE', HostedZoneId=self.instance_id)['Limit']['Value']
+        try:
+            return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_RRSETS_BY_ZONE', HostedZoneId=self.instance_id)['Limit']['Value']
+        except self.boto_session.client('route53').exceptions.NoSuchHostedZone as e:
+            raise InstanceWithIdentifierNotFound(self) from e

     @property
     def current(self):
-        return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_RRSETS_BY_ZONE', HostedZoneId=self.instance_id)['Count']
+        try:
+            return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_RRSETS_BY_ZONE', HostedZoneId=self.instance_id)['Count']
+        except self.boto_session.client('route53').exceptions.NoSuchHostedZone as e:
+            raise InstanceWithIdentifierNotFound(self) from e


 class AssociatedVpcHostedZoneCheck(InstanceQuotaCheck):
@@ -102,8 +109,14 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def maximum(self):
-        return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_VPCS_ASSOCIATED_BY_ZONE', HostedZoneId=self.instance_id)['Limit']['Value']
+        try:
+            return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_VPCS_ASSOCIATED_BY_ZONE', HostedZoneId=self.instance_id)['Limit']['Value']
+        except self.boto_session.client('route53').exceptions.NoSuchHostedZone as e:
+            raise InstanceWithIdentifierNotFound(self) from e

     @property
     def current(self):
-        return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_VPCS_ASSOCIATED_BY_ZONE', HostedZoneId=self.instance_id)['Count']
+        try:
+            return self.boto_session.client('route53').get_hosted_zone_limit(Type='MAX_VPCS_ASSOCIATED_BY_ZONE', HostedZoneId=self.instance_id)['Count']
+        except self.boto_session.client('route53').exceptions.NoSuchHostedZone as e:
+            raise InstanceWithIdentifierNotFound(self) from e
8 changes: 7 additions & 1 deletion aws_quota/check/sns.py
@@ -1,3 +1,4 @@
+from aws_quota.exceptions import InstanceWithIdentifierNotFound
 import typing

 import boto3
@@ -37,12 +38,17 @@ class SubscriptionsPerTopicCheck(InstanceQuotaCheck):
     description = "SNS subscriptions per topics"
     service_code = 'sns'
     quota_code = 'L-A4340BCD'
+    instance_id = 'Topic ARN'

     @staticmethod
     def get_all_identifiers(session: boto3.Session) -> typing.List[str]:
         return [topic['TopicArn'] for topic in session.client('sns').list_topics()['Topics']]

     @property
     def current(self):
-        topic_attrs = self.boto_session.client('sns').get_topic_attributes(TopicArn=self.instance_id)['Attributes']
+        try:
+            topic_attrs = self.boto_session.client('sns').get_topic_attributes(TopicArn=self.instance_id)['Attributes']
+        except self.boto_session.client('sns').exceptions.NotFoundException as e:
+            raise InstanceWithIdentifierNotFound(self) from e
+
         return int(topic_attrs['SubscriptionsConfirmed']) + int(topic_attrs['SubscriptionsPending'])
36 changes: 26 additions & 10 deletions aws_quota/check/vpc.py
@@ -1,8 +1,18 @@
+from aws_quota.exceptions import InstanceWithIdentifierNotFound
 import typing

 import boto3
+import botocore.exceptions
 from .quota_check import QuotaCheck, InstanceQuotaCheck, QuotaScope
+
+def check_if_vpc_exists(session: boto3.Session, vpc_id: str) -> bool:
+    client = session.client('ec2')
+    try:
+        client.describe_vpcs(VpcIds=[vpc_id])
+    except botocore.exceptions.ClientError as e:
+        return False
+    return True


 class VpcCountCheck(QuotaCheck):
     key = "vpc_count"
@@ -66,11 +76,14 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def current(self):
-        return len(self.boto_session.client('ec2').describe_subnets(Filters=[
-            {
-                'Name': 'vpc-id',
-                'Values': [self.instance_id]
-            }])['Subnets'])
+        if check_if_vpc_exists(self.boto_session, self.instance_id):
+            return len(self.boto_session.client('ec2').describe_subnets(Filters=[
+                {
+                    'Name': 'vpc-id',
+                    'Values': [self.instance_id]
+                }])['Subnets'])
+        else:
+            raise InstanceWithIdentifierNotFound(self)


 class AclsPerVpcCountCheck(InstanceQuotaCheck):
@@ -86,8 +99,11 @@ def get_all_identifiers(session: boto3.Session) -> typing.List[str]:

     @property
     def current(self) -> int:
-        return len(self.boto_session.client('ec2').describe_network_acls(Filters=[
-            {
-                'Name': 'vpc-id',
-                'Values': [self.instance_id]
-            }])['NetworkAcls'])
+        if check_if_vpc_exists(self.boto_session, self.instance_id):
+            return len(self.boto_session.client('ec2').describe_network_acls(Filters=[
+                {
+                    'Name': 'vpc-id',
+                    'Values': [self.instance_id]
+                }])['NetworkAcls'])
+        else:
+            raise InstanceWithIdentifierNotFound(self)
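One thing to keep in mind with `check_if_vpc_exists` as committed: it treats every `ClientError` as "VPC does not exist", which would also cover unrelated failures such as missing permissions or throttling. A possible refinement, sketched below under assumptions, is to inspect the error code and re-raise anything else. The `ClientError` class here is a stand-in that mirrors the `response['Error']['Code']` shape of `botocore.exceptions.ClientError`, the code `InvalidVpcID.NotFound` is assumed to be what EC2 returns for an unknown VPC ID, and `FakeEc2Client` is hypothetical and exists only so the sketch runs without AWS access.

```python
class ClientError(Exception):
    """Stand-in mirroring the response shape of botocore.exceptions.ClientError."""

    def __init__(self, error_response: dict, operation_name: str) -> None:
        code = error_response.get("Error", {}).get("Code", "Unknown")
        super().__init__(f"{operation_name} failed with {code}")
        self.response = error_response
        self.operation_name = operation_name


# Assumption: EC2 reports an unknown VPC ID with this error code.
VPC_NOT_FOUND_CODE = "InvalidVpcID.NotFound"


def vpc_exists(client, vpc_id: str) -> bool:
    """Return False only for a 'not found' error; let every other error propagate."""
    try:
        client.describe_vpcs(VpcIds=[vpc_id])
    except ClientError as e:
        if e.response["Error"]["Code"] == VPC_NOT_FOUND_CODE:
            return False
        raise
    return True


class FakeEc2Client:
    """Hypothetical client that knows a fixed set of VPC IDs."""

    def __init__(self, known_vpcs, fail_with=None):
        self._known = set(known_vpcs)
        self._fail_with = fail_with

    def describe_vpcs(self, VpcIds):
        if self._fail_with:
            raise ClientError({"Error": {"Code": self._fail_with}}, "DescribeVpcs")
        for vpc_id in VpcIds:
            if vpc_id not in self._known:
                raise ClientError({"Error": {"Code": VPC_NOT_FOUND_CODE}}, "DescribeVpcs")
        return {"Vpcs": [{"VpcId": v} for v in VpcIds]}
```

With this variant, a permissions problem would surface as an error instead of being reported as a missing VPC.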