Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/cloudwatch #1032

Closed
wants to merge 22 commits into from
Closed

Feature/cloudwatch #1032

wants to merge 22 commits into from

Conversation

j-velasco
Copy link

This PR aims to add CloudWatch support to report metrics.

I'm aware that there is user documentation missing and I'm planning to work on it on the following days. I'm opening this PR early to get feedback on the code and make any adjustment needed.

I added some tests, but those not cover the interaction with CloudWatch. I couldn't find similar tests for the other integrations, and I thought that maybe these are not required. One alternative that I considered to test it is use something like localstack, but I didn't as seem to be a significant dependency to add to docker-compose. Another option could be to intercept requests in the AWS client (I'm pretty sure I have done this on the past with S3). Any guide or preference on which test to add? In the project that I'm currently working at my job, we're using this for a couple of days, and it is working fine, so I have a decent level of confidence on the right functionality.

Documentation-wise, I'm thinking of using this PR as reference. Do you have any specific thing that I should cover? Any other guidance.

Cheers.

@CLAassistant
Copy link

CLAassistant commented May 29, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ Jorge Velasco
❌ j-velasco


Jorge Velasco seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

cmd/collectors.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/cloudwatch_test.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector_test.go Outdated Show resolved Hide resolved
stats/awscloudwatch/cloudwatch.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/collector.go Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented May 29, 2019

Codecov Report

Merging #1032 into master will decrease coverage by 0.07%.
The diff coverage is 64.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1032      +/-   ##
==========================================
- Coverage   72.67%   72.59%   -0.08%     
==========================================
  Files         133      135       +2     
  Lines        9846     9945      +99     
==========================================
+ Hits         7156     7220      +64     
- Misses       2274     2305      +31     
- Partials      416      420       +4
Impacted Files Coverage Δ
cmd/collectors.go 0% <0%> (ø) ⬆️
stats/awscloudwatch/cloudwatch.go 60% <60%> (ø)
stats/awscloudwatch/collector.go 73.8% <73.8%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 607b63c...0e46b5a. Read the comment docs.

@na--
Copy link
Member

na-- commented Jun 10, 2019

Many thanks for implementing this pull request and a big apology for the huge, huge delay in responding! I though that I'd posted a comment here last week, but apparently I didn't properly click on the big green button or something 🤦‍♂️ 🤕...

I'll take a look at the actual code tomorrow or the day after, but I'll try to give a preliminary answer to the questions your raised in the PR description.

Regarding testing the collector - pulling localstack is not a very good option at this time, since it's a python project. Eventually, we'd probably want to have end-to-end tests (in a docker-compose environment or similar), but for now we don't have anything like that set up, so we're restricted to Go unit tests.

That said, the Go AWS SDK appears to have some mocking functionality (blog post, code), so maybe testing the code could happen through that? Though I may very well be wrong, since I haven't delved in the PR code very deeply and I haven't used the Go AWS SDK before...

Regarding the documentation, I see that you've added the new output to the release notes, so that seems sufficient for now. We have a page in the documentation about all of the outputs k6 supports, but since it's already overcrowded and we want to improve and split it up into multiple pages soon (#743 (comment)), we'll probably just copy-paste the release notes there for now.

@j-velasco
Copy link
Author

Hi @na--, thanks for your input.
I agree with you on the way to tests, I'll follow that path.

I've been busy and haven't done any progress on this but I hope to manage to get the integration with AWS covered over the next few days.

Copy link
Member

@na-- na-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and sorry again for the long time it took to review... I've requested some changes inline, but other things in the code point me to the fact that the k6 lib.Collector interface is a somewhat poor fit for a lot of outputs like this one. That probably deserves a separate issue on its own though...

stats/awscloudwatch/cloudwatch.go Outdated Show resolved Hide resolved

// newCloudWatchClient creates a new CloudWatch client
func newCloudWatchClient() (*cloudwatch.CloudWatch, error) {
sess, err := session.NewSession()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the docs for creating an AWS session (https://godoc.org/github.com/aws/aws-sdk-go/aws/session#hdr-Creating_Sessions), this would load the AWS credentials from ~/.aws/credentials or use AWS_* environment variables? I've only skimmed the SDK docs, but that seems a bit messy to me... 😕

Ideally, this output shouldn't rely on the presence of files on the filesystem or read environment variables directly from the OS. For one, that makes it harder to test, but my main concern is that this could get in the way of the future native distributed execution support we want to introduce in k6... Also, the environment variables here aren't prefixed with K6_, like the other config options are. But from skimming the AWS SDK docs, I'm not sure we can forbid the SDK code from actually doing all of that magic 😕

If the k6 output configuration situation was otherwise perfect (and it's unfortunately not - #587, #883 😞 ), I would force the issue, but I'm not sure if we shouldn't accept this as-is and deprecate the direct options in the future? @mstoykov, thoughts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way that I see to avoid this issue is duplicate the private newSession. I know this is not an ideal solution, but at the same time I don't see a big problem neither. Looking at what NewSessionWithOptions does, for instance, seems that most of the functionality is around more flexible configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of loading things automatically without prefixes ... Not certain if copying the newSession will be enough, but if it is ... maybe that is good idea.

On a related note (of copying code) ... I am under the impression that implementing the whole PutMetric call in k6 instead of using the aws-sdk will not be ... all that much code given that it is just http request.

stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/cloudwatch.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved
stats/awscloudwatch/collector.go Outdated Show resolved Hide resolved

if value != "" {
dims = append(dims, &cloudwatch.Dimension{
Name: aws.String(name),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very strange... 😕 This just returns a pointer to the value and I have no idea why AWS needs it...

@j-velasco
Copy link
Author

Hi again @na--,
I just added a test for the integration with CloudWatch (unit mocking real client as you suggested). For some reason, tests are failing in CircleCI, but I don't see why, could you advise me or try to re-run the job?
Let me know if more changes are needed, I'll be happy to do them.

@na--
Copy link
Member

na-- commented Jun 27, 2019

I'm sorry for the delay... again... 😞 Thanks for the changes! The CircleCI checks seem fine now, but for some reason I can't re-run the GolangCI one... 😕 It seems like there was some network issue in the previous run, so probably a new commit or a force push would resolve it - I don't see a way to manually do it...

Also, I think that I will need to further familiarize myself with the CloudWatch API and the SDK configuration, because I'm starting to get hesitant about this PR. We'd be merging almost 35000 (roughly equivalent to the current non-vendor k6 code!) lines of code in the k6 repo, in order to basically make a single HTTP request to the AWS API with a somewhat fancy (but not complicated) authentication.

That, combined with the mismatch between the SDK and k6 configuration I mentioned previously, makes me wonder if this can't be implemented in a much more simplified (i.e. a few hundred lines of code) and more easily-maintainable and testable format 😕

Something else that's a bit concerning to me... k6 generates a lot of metrics and CloudWatch seems to have very low restrictions of its PutMetricData API. You have const maxMetricsPerCall = 20 in your code and the API docs I linked above mention:

Each PutMetricData request is limited to 40 KB in size for HTTP POST requests. You can send a payload compressed by gzip. Each request is also limited to no more than 20 different metrics.

So, given that k6 basically emits 9 metrics for a single HTTP request (and others for each iteration, group(), check(), etc.), it doesn't have metric aggregation for all outputs (#1060), and you can't filter out metrics yet (#570), how usable is the CloudWatch output? What's your use case?

@na--
Copy link
Member

na-- commented Jun 27, 2019

As an addendum, I raised the last point about the CloudWatch API limits because it seems to me that we'd need to make roughly 1 HTTP request to the CloudWatch API for every 2 k6 HTTP requests that we make as part of the actual load test script... This would almost surely greatly skew the test results, and is probably not how the CloudWatch API is meant to be used, if they have such low restrictions 😕

Since we don't yet have universal aggregation or filtering of the k6 of metrics, I wonder if something like telegraf can't be used to bridge the gap? They seem to have metric aggregation and maybe filtering and they already have a CloudWatch output plugin...

@rossmckelvie
Copy link

Since this PR last left off, there's a new feature that might help get around the rate limitations of put metric. Any thought put into using Embedded Metric Format?

@j-velasco j-velasco closed this Mar 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants