Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect cloudwatch stats in a more efficient and cost effective manner #5544

Merged
merged 24 commits into from
Apr 23, 2019
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 51 additions & 34 deletions plugins/inputs/cloudwatch/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ API endpoint. In the following order the plugin will attempt to authenticate.

```toml
[[inputs.cloudwatch]]
## Amazon Region (required)
## Amazon Region
region = "us-east-1"

## Amazon Credentials
Expand All @@ -28,12 +28,12 @@ API endpoint. In the following order the plugin will attempt to authenticate.
## 4) environment variables
## 5) shared credentials file
## 6) EC2 Instance Profile
#access_key = ""
#secret_key = ""
#token = ""
#role_arn = ""
#profile = ""
#shared_credential_file = ""
# access_key = ""
# secret_key = ""
# token = ""
# role_arn = ""
# profile = ""
# shared_credential_file = ""

## Endpoint to make request against, the correct endpoint is automatically
## determined and this option should only be set if you wish to override the
Expand All @@ -54,32 +54,44 @@ API endpoint. In the following order the plugin will attempt to authenticate.
## Collection Delay (required - must account for metrics availability via CloudWatch API)
delay = "5m"

## Override global run interval (optional - defaults to global interval)
## Recomended: use metric 'interval' that is a multiple of 'period' to avoid
## Recommended: use metric 'interval' that is a multiple of 'period' to avoid
## gaps or overlap in pulled data
interval = "5m"

## Configure the TTL for the internal cache of metrics.
## Defaults to 1 hr if not specified
# cache_ttl = "1h"

## Metric Statistic Namespace (required)
namespace = "AWS/ELB"

## Maximum requests per second. Note that the global default AWS rate limit is
## 400 reqs/sec, so if you define multiple namespaces, these should add up to a
## maximum of 400. Optional - default value is 200.
## 50 reqs/sec, so if you define multiple namespaces, these should add up to a
## maximum of 50. Default value is 25.
## See http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_limits.html
ratelimit = 200
# ratelimit = 25

## Namespace-wide statistic filters. These allow fewer queries to be made to
## cloudwatch.
# statistic_exclude = [ "average", "sum", minimum", "maximum", sample_count" ]
# statistic_include = []

## Metrics to Pull (optional)
## Metrics to Pull
## Defaults to all Metrics in Namespace if nothing is provided
## Refreshes Namespace available metrics every 1h
[[inputs.cloudwatch.metrics]]
names = ["Latency", "RequestCount"]

## Dimension filters for Metric. These are optional however all dimensions
## defined for the metric names must be specified in order to retrieve
## the metric statistics.
[[inputs.cloudwatch.metrics.dimensions]]
name = "LoadBalancerName"
value = "p-example"
#[[inputs.cloudwatch.metrics]]
# names = ["Latency", "RequestCount"]
#
# ## Statistic filters for Metric. These allow for retrieving specific
# ## statistics for an individual metric.
# # statistic_exclude = [ "average", "sum", minimum", "maximum", sample_count" ]
# # statistic_include = []
#
# ## Dimension filters for Metric. All dimensions defined for the metric names
# ## must be specified in order to retrieve the metric statistics.
# [[inputs.cloudwatch.metrics.dimensions]]
# name = "LoadBalancerName"
# value = "p-example"
```
#### Requirements and Terminology

Expand All @@ -97,17 +109,21 @@ wildcard dimension is ignored.

Example:
```
[[inputs.cloudwatch.metrics]]
names = ["Latency"]
[[inputs.cloudwatch]]
period = "1m"
interval = "5m"

## Dimension filters for Metric (optional)
[[inputs.cloudwatch.metrics.dimensions]]
name = "LoadBalancerName"
value = "p-example"
[[inputs.cloudwatch.metrics]]
names = ["Latency"]

[[inputs.cloudwatch.metrics.dimensions]]
name = "AvailabilityZone"
value = "*"
## Dimension filters for Metric (optional)
[[inputs.cloudwatch.metrics.dimensions]]
name = "LoadBalancerName"
value = "p-example"

[[inputs.cloudwatch.metrics.dimensions]]
name = "AvailabilityZone"
value = "*"
```

If the following ELBs are available:
Expand All @@ -124,9 +140,11 @@ Then 2 metrics will be output:
If the `AvailabilityZone` wildcard dimension was omitted, then a single metric (name: `p-example`)
would be exported containing the aggregate values of the ELB across availability zones.

To maximize efficiency and savings, consider making fewer requests by increasing `interval` but keeping `period` at the duration you would like metrics to be reported. The above example will request metrics from Cloudwatch every 5 minutes but will output five metrics timestamped one minute apart.

#### Restrictions and Limitations
- CloudWatch metrics are not available instantly via the CloudWatch API. You should adjust your collection `delay` to account for this lag in metrics availability based on your [monitoring subscription level](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html)
- CloudWatch API usage incurs cost - see [GetMetricStatistics Pricing](https://aws.amazon.com/cloudwatch/pricing/)
- CloudWatch API usage incurs cost - see [GetMetricData Pricing](https://aws.amazon.com/cloudwatch/pricing/)

### Measurements & Fields:

Expand All @@ -147,7 +165,6 @@ Tag Dimension names are represented in [snake case](https://en.wikipedia.org/wik

- All measurements have the following tags:
- region (CloudWatch Region)
- unit (CloudWatch Metric Unit)
- {dimension-name} (Cloudwatch Dimension value - one for each metric dimension)

### Troubleshooting:
Expand All @@ -168,5 +185,5 @@ aws cloudwatch get-metric-statistics --namespace AWS/EC2 --region us-east-1 --pe

```
$ ./telegraf --config telegraf.conf --input-filter cloudwatch --test
> cloudwatch_aws_elb,load_balancer_name=p-example,region=us-east-1,unit=seconds latency_average=0.004810798017284538,latency_maximum=0.1100282669067383,latency_minimum=0.0006084442138671875,latency_sample_count=4029,latency_sum=19.382705211639404 1459542420000000000
> cloudwatch_aws_elb,load_balancer_name=p-example,region=us-east-1 latency_average=0.004810798017284538,latency_maximum=0.1100282669067383,latency_minimum=0.0006084442138671875,latency_sample_count=4029,latency_sum=19.382705211639404 1459542420000000000
glinton marked this conversation as resolved.
Show resolved Hide resolved
```
Loading