Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug: too many cloudwatch metrics #1885

Merged
merged 5 commits into from
Dec 13, 2016

Conversation

leon-barrett
Copy link
Contributor

Cloudwatch metrics were being added incorrectly. The most obvious
symptom of this was that too many metrics were being added. A simple
check against the name of the metric proved to be a sufficient fix. In
order to test the fix, a metric selection function was factored out.

@leon-barrett
Copy link
Contributor Author

Hi, maintainers. Do you have any feedback about this pull request? We have a workaround for ourselves, but it would be nice to get this fix merged.

@sparrc
Copy link
Contributor

sparrc commented Nov 3, 2016

from what I can tell there appears to be overlap between #1885 and #1903. I would appreciate if one of you could open an issue with a full writeup on what exactly is wrong the cloudwatch plugin.

As this plugin is particularly difficult to reproduce problems on, please be sure to include examples of requests and responses, and links to documentation where necessary.

Please also provide concrete examples that illustrate what is wrong with the current behavior of the plugin.

If I am wrong and there isn't overlap, then feel free to open two separate issues. In either case I think they need to be seen and discussed before I can accept a PR changing the behavior of this plugin.

@johnrengelman
Copy link
Contributor

@sparrc I looked at this change and it looks correct. I verified using the provided spec. What is happening currently is that when using wildcard dimensions, the metric is selected multiple times because it has an inner loop that ranges over all available metrics from the Cloudwatch API but it doesn't compare that it's looking at the same metric that's configured.

So if I have this:

selectedMetrics = ["Connections"]

and Cloudwatch is returning:

availableMetrics = ["Connections", "Disk", "CPU"]

then the code is currently doing:

for _, sm := range selectedMetrics {
  for _, am := range availableMetrics {
    //if the available metric has the same possible dimensions as the metric being looked at, 
    //then it's appended.
    if (isSelected(am, dimension)) { 
      append(sm)
    }
  }
}

So the change here is correct in that it first checks that we are comparing our selected metric against only those metrics in cloudwatch of the same name.

}
for _, name := range m.MetricNames {
for _, metric := range allMetrics {
if isSelected(metric, m.Dimensions) {
if name == *metric.MetricName && isSelected(metric, m.Dimensions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name logic check should just be folded into the isSelected method.
So, isSelected(name, metric, m.Dimensions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sparrc
Copy link
Contributor

sparrc commented Nov 5, 2016

OK, fair enough, thanks for the review @johnrengelman

@sparrc sparrc reopened this Nov 5, 2016
@leon-barrett
Copy link
Contributor Author

Thanks for taking a close look at my fix. I'll try to be more complete in explanations in future pull requests.

@johnrengelman johnrengelman mentioned this pull request Nov 8, 2016
2 tasks
@sparrc sparrc added this to the 1.2.0 milestone Nov 25, 2016
Copy link
Contributor

@sparrc sparrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just update the changelog and I'll get it merged

@@ -55,6 +55,7 @@ continue sending logs to /var/log/telegraf/telegraf.log.

### Bugfixes

- [#1885](https://github.com/influxdata/telegraf/pull/1885): Fix over-querying of cloudwatch metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this into "features" under the 1.2 section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Cloudwatch metrics were being added incorrectly. The most obvious
symptom of this was that too many metrics were being added. A simple
check against the name of the metric proved to be a sufficient fix. In
order to test the fix, a metric selection function was factored out.
@sparrc sparrc merged commit 6e24161 into influxdata:master Dec 13, 2016
njwhite pushed a commit to njwhite/telegraf that referenced this pull request Jan 31, 2017
* Fix bug: too many cloudwatch metrics

Cloudwatch metrics were being added incorrectly. The most obvious
symptom of this was that too many metrics were being added. A simple
check against the name of the metric proved to be a sufficient fix. In
order to test the fix, a metric selection function was factored out.

* Go fmt cloudwatch

* Cloudwatch isSelected checks metric name

* Move cloudwatch line in changelog to 1.2 features
maxunt pushed a commit that referenced this pull request Jun 26, 2018
* Fix bug: too many cloudwatch metrics

Cloudwatch metrics were being added incorrectly. The most obvious
symptom of this was that too many metrics were being added. A simple
check against the name of the metric proved to be a sufficient fix. In
order to test the fix, a metric selection function was factored out.

* Go fmt cloudwatch

* Cloudwatch isSelected checks metric name

* Move cloudwatch line in changelog to 1.2 features
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants