Push events even when there's no cloudwatch data (#11023) (#11136)
* Push events even when there's no cloudwatch data
* Change cloud.provider from ec2 to aws
(cherry picked from commit 6eab0bd)
kaiyan-sheng authored Mar 11, 2019
1 parent 4191881 commit 715ab16
Showing 8 changed files with 126 additions and 51 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.next.asciidoc
@@ -77,6 +77,8 @@ https://github.com/elastic/beats/compare/v7.0.0-beta1...master[Check the HEAD di
- Migrate docker autodiscover to ECS. {issue}10757[10757] {pull}10862[10862]
- Fix issue in kubernetes module preventing usage percentages to be properly calculated. {pull}10946[10946]
- Fix parsing error using GET in Jolokia module. {pull}11075[11075] {issue}11071[11071]
- Collect metrics when EC2 instances are not in running state. {issue}11008[11008] {pull}11023[11023]
- Change ECS field cloud.provider to aws. {pull}11023[11023]

*Packetbeat*

Binary file modified metricbeat/docs/images/metricbeat-aws-ec2-overview.png
41 changes: 31 additions & 10 deletions metricbeat/docs/modules/aws.asciidoc
@@ -6,8 +6,8 @@ This file is generated! See scripts/docs_collector.py
== aws module

This module periodically fetches monitoring metrics from AWS Cloudwatch using
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html[GetMetricData API] for running
EC2 instances. Note: extra AWS charges on GetMetricData API requests will be generated by this module.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html[GetMetricData API] for AWS services.
Note: extra AWS charges on GetMetricData API requests will be generated by this module.

The default metricset is `ec2`.

@@ -18,11 +18,10 @@ This module environment variable `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `
references in the config file to set values that need to be configurable during deployment.

Two different kinds of AWS credentials can be used here: `access keys` and `temporary security credentials`.
`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the two parts of `access keys` for AWS to authenticate AWS API requests.
`access keys` are long-term credentials for an IAM user or the AWS account root user. Please see
`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the two parts of `access keys`. They are long-term credentials for
an IAM user or the AWS account root user. Please see
https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys[AWS Access Keys
and Secret Access Keys] for more details. A more AWS recommended way is to use
`temporary security credentials` instead of `access keys`. `temporary security credentials` consist of an access key ID,
and Secret Access Keys] for more details. `temporary security credentials` have a limited lifetime and consist of an access key ID,
a secret access key, and a security token which is typically returned from `GetSessionToken`. MFA-enabled IAM users
need to submit an MFA code while calling `GetSessionToken`. `aws_default_region` sets the region for the SDK to use. Please
see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html[Temporary Security Credentials] for more details.
@@ -32,8 +31,14 @@ see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html[Te
aws> sts get-session-token --serial-number arn:aws:iam::1234:mfa/your-email@example.com --token-code 456789 --duration-seconds 129600
----

Specific permissions needs to be added into the IAM user's policy to authorize Metricbeat to collect AWS monitoring metrics. Please
see documentation under each metricset for required permissions.
Since temporary security credentials are short term, after they expire, the user needs to generate new ones and update
the aws.yml config file with the new credentials. Data will be lost if the config file is not updated with new
credentials before the old ones expire. For Metricbeat, we recommend using access keys in the config file so that the
aws module can make AWS API calls without having to generate new temporary credentials and update the config frequently.

An IAM policy is an entity that defines permissions to an object within your AWS environment. Specific permissions need
to be added to the IAM user's policy to authorize Metricbeat to collect AWS monitoring metrics. Please see the documentation
under each metricset for the required permissions.

By default, Amazon EC2 sends metric data to CloudWatch every 5 minutes. With this basic monitoring, `period` in aws module
configuration should be larger or equal than `300s`. If `period` is set to be less than `300s`, the same cloudwatch metrics
@@ -63,8 +68,24 @@ metricbeat.modules:
default_region: '${AWS_REGION:us-west-1}'
----

This module only collects metrics for EC2 instances that are in `running` state and exist more than 10 minutes to make sure
there are monitoring metrics exist in Cloudwatch already.
[float]
== Metricsets

The following Metricsets are already included:

[float]
=== `ec2`
By default, Amazon EC2 sends metric data to CloudWatch every 5 minutes. With this basic monitoring, `period` in the aws module
configuration should be greater than or equal to `300s`. If `period` is set to less than `300s`, the same CloudWatch metrics
will be collected more than once, which causes extra fees without providing more granular metrics. For example, in the `US East (N. Virginia)` region, it costs
$0.01/1000 metrics requested using GetMetricData. Please see https://aws.amazon.com/cloudwatch/pricing/[AWS Cloudwatch Pricing]
for more details. To avoid unnecessary charges, set `period` to `300s` or a multiple of `300s`, such as
`600s` or `900s`. For more granular monitoring data, you can enable detailed monitoring on the instance to get metrics every 1 minute. Please see
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html[Enabling Detailed Monitoring] for instructions
on how to enable detailed monitoring. With detailed monitoring enabled, `period` in the aws module configuration can be any value
greater than or equal to `60s`. Since AWS sends metric data to CloudWatch in 1-minute periods, setting the Metricbeat module `period` to less
than `60s` will cause extra API requests, which means extra charges on AWS. To avoid unnecessary charges, set `period`
to `60s` or a multiple of `60s`, such as `120s` or `180s`.

The AWS module comes with a predefined dashboard. For example:

41 changes: 31 additions & 10 deletions x-pack/metricbeat/module/aws/_meta/docs.asciidoc
@@ -1,6 +1,6 @@
This module periodically fetches monitoring metrics from AWS Cloudwatch using
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html[GetMetricData API] for running
EC2 instances. Note: extra AWS charges on GetMetricData API requests will be generated by this module.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html[GetMetricData API] for AWS services.
Note: extra AWS charges on GetMetricData API requests will be generated by this module.

The default metricset is `ec2`.

@@ -11,11 +11,10 @@ This module environment variable `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `
references in the config file to set values that need to be configurable during deployment.

Two different kinds of AWS credentials can be used here: `access keys` and `temporary security credentials`.
`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the two parts of `access keys` for AWS to authenticate AWS API requests.
`access keys` are long-term credentials for an IAM user or the AWS account root user. Please see
`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the two parts of `access keys`. They are long-term credentials for
an IAM user or the AWS account root user. Please see
https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys[AWS Access Keys
and Secret Access Keys] for more details. A more AWS recommended way is to use
`temporary security credentials` instead of `access keys`. `temporary security credentials` consist of an access key ID,
and Secret Access Keys] for more details. `temporary security credentials` have a limited lifetime and consist of an access key ID,
a secret access key, and a security token which is typically returned from `GetSessionToken`. MFA-enabled IAM users
need to submit an MFA code while calling `GetSessionToken`. `aws_default_region` sets the region for the SDK to use. Please
see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html[Temporary Security Credentials] for more details.
@@ -25,8 +24,14 @@ see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html[Te
aws> sts get-session-token --serial-number arn:aws:iam::1234:mfa/your-email@example.com --token-code 456789 --duration-seconds 129600
----

Specific permissions needs to be added into the IAM user's policy to authorize Metricbeat to collect AWS monitoring metrics. Please
see documentation under each metricset for required permissions.
Since temporary security credentials are short term, after they expire, the user needs to generate new ones and update
the aws.yml config file with the new credentials. Data will be lost if the config file is not updated with new
credentials before the old ones expire. For Metricbeat, we recommend using access keys in the config file so that the
aws module can make AWS API calls without having to generate new temporary credentials and update the config frequently.

An IAM policy is an entity that defines permissions to an object within your AWS environment. Specific permissions need
to be added to the IAM user's policy to authorize Metricbeat to collect AWS monitoring metrics. Please see the documentation
under each metricset for the required permissions.

By default, Amazon EC2 sends metric data to CloudWatch every 5 minutes. With this basic monitoring, `period` in aws module
configuration should be larger or equal than `300s`. If `period` is set to be less than `300s`, the same cloudwatch metrics
@@ -56,8 +61,24 @@ metricbeat.modules:
default_region: '${AWS_REGION:us-west-1}'
----

This module only collects metrics for EC2 instances that are in `running` state and exist more than 10 minutes to make sure
there are monitoring metrics exist in Cloudwatch already.
[float]
== Metricsets

The following Metricsets are already included:

[float]
=== `ec2`
By default, Amazon EC2 sends metric data to CloudWatch every 5 minutes. With this basic monitoring, `period` in the aws module
configuration should be greater than or equal to `300s`. If `period` is set to less than `300s`, the same CloudWatch metrics
will be collected more than once, which causes extra fees without providing more granular metrics. For example, in the `US East (N. Virginia)` region, it costs
$0.01/1000 metrics requested using GetMetricData. Please see https://aws.amazon.com/cloudwatch/pricing/[AWS Cloudwatch Pricing]
for more details. To avoid unnecessary charges, set `period` to `300s` or a multiple of `300s`, such as
`600s` or `900s`. For more granular monitoring data, you can enable detailed monitoring on the instance to get metrics every 1 minute. Please see
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html[Enabling Detailed Monitoring] for instructions
on how to enable detailed monitoring. With detailed monitoring enabled, `period` in the aws module configuration can be any value
greater than or equal to `60s`. Since AWS sends metric data to CloudWatch in 1-minute periods, setting the Metricbeat module `period` to less
than `60s` will cause extra API requests, which means extra charges on AWS. To avoid unnecessary charges, set `period`
to `60s` or a multiple of `60s`, such as `120s` or `180s`.

The AWS module comes with a predefined dashboard. For example:

36 changes: 18 additions & 18 deletions x-pack/metricbeat/module/aws/ec2/_meta/data.json
@@ -7,12 +7,12 @@
"aws": {
"ec2": {
"cpu": {
"credit_balance": 576,
"credit_usage": 0.144723,
"credit_balance": 144,
"credit_usage": 0.001823,
"surplus_credit_balance": 0,
"surplus_credits_charged": 0,
"total": {
"pct": 1.366194313233248
"pct": 0.033333333333303
}
},
"diskio": {
@@ -27,21 +27,21 @@
},
"instance": {
"core": {
"count": 2
"count": 1
},
"image": {
"id": "ami-f920cd94"
"id": "ami-05b3bcf7f311194b3"
},
"monitoring": {
"state": "disabled"
},
"private": {
"dns_name": "ip-10-0-0-148.ec2.internal",
"ip": "10.0.0.148"
"dns_name": "ip-172-31-10-23.ap-southeast-1.compute.internal",
"ip": "172.31.10.23"
},
"public": {
"dns_name": "ec2-54-226-109-162.compute-1.amazonaws.com",
"ip": "54.226.109.162"
"dns_name": "ec2-18-136-198-93.ap-southeast-1.compute.amazonaws.com",
"ip": "18.136.198.93"
},
"state": {
"code": 16,
@@ -51,12 +51,12 @@
},
"network": {
"in": {
"bytes": 737000.4,
"packets": 1361.2
"bytes": 56,
"packets": 1
},
"out": {
"bytes": 227871.2,
"packets": 1411.2
"bytes": 88,
"packets": 1.6
}
},
"status": {
@@ -67,15 +67,15 @@
}
},
"cloud": {
"availability_zone": "us-east-1b",
"availability_zone": "ap-southeast-1b",
"instance": {
"id": "i-77f84332"
"id": "i-0c68eeb552231a8d0"
},
"machine": {
"type": "t2.medium"
"type": "t2.micro"
},
"provider": "ec2",
"region": "us-east-1"
"provider": "aws",
"region": "ap-southeast-1"
},
"event": {
"dataset": "aws.ec2",
30 changes: 30 additions & 0 deletions x-pack/metricbeat/module/aws/ec2/_meta/docs.asciidoc
@@ -2,6 +2,35 @@ The ec2 metricset of aws module allows you to monitor your AWS EC2 instances,
including `cpu`, `network`, `disk` and `status`. `ec2` metricset fetches a set of values from
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html#ec2-cloudwatch-metrics[Cloudwatch AWS EC2 Metrics].

We fetch the following data:

* *cpu.total.pct*: The percentage of allocated EC2 compute units that are currently in use on the instance.
* *cpu.credit_usage*: The number of CPU credits spent by the instance for CPU utilization.
* *cpu.credit_balance*: The number of earned CPU credits that an instance has accrued since it was launched or started.
* *cpu.surplus_credit_balance*: The number of surplus credits that have been spent by an unlimited instance when its CPUCreditBalance value is zero.
* *cpu.surplus_credits_charged*: The number of spent surplus credits that are not paid down by earned CPU credits, and which thus incur an additional charge.
* *network.in.packets*: The number of packets received on all network interfaces by the instance.
* *network.out.packets*: The number of packets sent out on all network interfaces by the instance.
* *network.in.bytes*: The number of bytes received on all network interfaces by the instance.
* *network.out.bytes*: The number of bytes sent out on all network interfaces by the instance.
* *diskio.read.bytes*: Bytes read from all instance store volumes available to the instance.
* *diskio.write.bytes*: Bytes written to all instance store volumes available to the instance.
* *diskio.read.ops*: Completed read operations from all instance store volumes available to the instance in a specified period of time.
* *diskio.write.ops*: Completed write operations to all instance store volumes available to the instance in a specified period of time.
* *status.check_failed*: Reports whether the instance has passed both the instance status check and the system status check in the last minute.
* *status.check_failed_system*: Reports whether the instance has passed the system status check in the last minute.
* *status.check_failed_instance*: Reports whether the instance has passed the instance status check in the last minute.
* *instance.core.count*: The number of CPU cores for the instance.
* *instance.image.id*: The ID of the image used to launch the instance.
* *instance.monitoring.state*: Indicates whether detailed monitoring is enabled.
* *instance.private.dns_name*: The private DNS name of the network interface.
* *instance.private.ip*: The private IPv4 address associated with the network interface.
* *instance.public.dns_name*: The public DNS name of the instance.
* *instance.public.ip*: The address of the Elastic IP address (IPv4) bound to the network interface.
* *instance.state.code*: The state of the instance, as a 16-bit unsigned integer.
* *instance.state.name*: The state of the instance (pending | running | shutting-down | terminated | stopping | stopped).
* *instance.threads_per_core*: The number of threads per CPU core.
[float]
=== AWS Permissions
Some specific AWS permissions are required for IAM user to collect AWS EC2 metrics.
----
@@ -10,6 +39,7 @@ cloudwatch:GetMetricData
ec2:DescribeRegions
----

[float]
=== Dashboard

The aws ec2 metricset comes with a predefined dashboard. For example:
25 changes: 13 additions & 12 deletions x-pack/metricbeat/module/aws/ec2/ec2.go
@@ -104,17 +104,18 @@ func (m *MetricSet) Fetch(report mb.ReporterV2) {

for _, instanceID := range instanceIDs {
metricDataQueries := constructMetricQueries(listMetricsOutput, instanceID, m.PeriodInSec)
if len(metricDataQueries) == 0 {
continue
}

// Use metricDataQueries to make GetMetricData API calls
metricDataOutput, err := aws.GetMetricDataResults(metricDataQueries, svcCloudwatch, startTime, endTime)
if err != nil {
err = errors.Wrap(err, "GetMetricDataResults failed, skipping region "+regionName+" for instance "+instanceID)
m.logger.Error(err.Error())
report.Error(err)
continue
// If metricDataQueries is empty, createCloudWatchEvents still needs to be called.
metricDataOutput := []cloudwatch.MetricDataResult{}
if len(metricDataQueries) != 0 {
// Use metricDataQueries to make GetMetricData API calls
metricDataOutput, err = aws.GetMetricDataResults(metricDataQueries, svcCloudwatch, startTime, endTime)
if err != nil {
err = errors.Wrap(err, "GetMetricDataResults failed, skipping region "+regionName+" for instance "+instanceID)
m.logger.Error(err.Error())
report.Error(err)
continue
}
}

// Create Cloudwatch Events for EC2
@@ -157,7 +158,7 @@ func createCloudWatchEvents(getMetricDataResults []cloudwatch.MetricDataResult,
}

event.RootFields.Put("service.name", metricsetName)
event.RootFields.Put("cloud.provider", metricsetName)
event.RootFields.Put("cloud.provider", "aws")
event.RootFields.Put("cloud.availability_zone", *instanceOutput.Placement.AvailabilityZone)
event.RootFields.Put("cloud.region", regionName)
event.RootFields.Put("cloud.instance.id", instanceID)
@@ -244,7 +245,7 @@ func getInstancesPerRegion(svc ec2iface.EC2API) (instanceIDs []string, instances
func createMetricDataQuery(metric cloudwatch.Metric, instanceID string, index int, periodInSec int) (metricDataQuery cloudwatch.MetricDataQuery) {
statistic := "Average"
period := int64(periodInSec)
id := "e" + strconv.Itoa(index)
id := "ec2" + strconv.Itoa(index)
metricDims := metric.Dimensions

for _, dim := range metricDims {
2 changes: 1 addition & 1 deletion x-pack/metricbeat/module/aws/ec2/ec2_test.go
@@ -138,7 +138,7 @@ func TestCreateCloudWatchEvents(t *testing.T) {
"service": common.MapStr{"name": "ec2"},
"cloud": common.MapStr{
"region": regionName,
"provider": "ec2",
"provider": "aws",
"instance": common.MapStr{"id": "i-123"},
"machine": common.MapStr{"type": "t2.medium"},
"availability_zone": "us-west-1a",