
k6 v0.42.0 updates - script fixes (1/5) #21

Merged 10 commits from update/k6-v0.42.0-fixes into master on Feb 10, 2023

Conversation

@imiric imiric (Contributor) commented Feb 6, 2023

This is PR 1/5, part of updating the "Running large tests" article and running benchmarks on k6 v0.42.0.

These changes make some script fixes to get the scripts into a usable state for the tests. It's best to review by commit, since the indentation change is noisy.

PR order: #21 -> #22 -> #23 -> #24 -> #25 (review and merge in reverse)

Ivan Mirić added 10 commits February 6, 2023 18:44:

- For some reason the server is returning 403s even if the CSRF token is sent as a cookie...
- We only have a single IP serving the app, so with http.batch() we would quickly exhaust all ports to the same IP.
- It's not needed anymore, since this is effectively the RPS-optimized script.
Diff context for the review comment below:

]);
}
sleep(0.1); // 100ms sleep between iterations.
http.get('http://test.staging.k6.io/static/logo.svg');

Out of curiosity, why are we switching from the set of batches to a single request per iteration without sleep? 🤔

@imiric imiric (Author) commented Feb 9, 2023

Good question 🙂 I should've made that clearer in the commit message.

But basically, we don't have a set of IPs to make requests to as before, and test.staging.k6.io always resolves to 1 IP. So if we use the same hostname in an http.batch() call, we would quickly exhaust the ephemeral port range for that single IP. Remember that a network connection is identified by a 5-tuple (protocol, source IP, source port, destination IP, destination port), and since the source port is the only variable in this case, we quickly reach the 65,536 limit (actually a bit less than that) and get "cannot assign requested address" errors.
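
To make that concrete, the pattern looks roughly like this (the request list is illustrative, not the exact one in the script under review): every entry in the batch targets the same hostname, so every connection goes to the same destination IP and only the source port varies.

```javascript
import http from 'k6/http';

// Illustrative sketch: all batch entries use the same hostname, which resolves
// to a single IP, so each new connection can only vary the source port of the
// 5-tuple and the ephemeral port range runs out quickly at high VU counts.
export default function () {
  http.batch([
    ['GET', 'http://test.staging.k6.io/'],
    ['GET', 'http://test.staging.k6.io/static/logo.svg'],
    ['GET', 'http://test.staging.k6.io/static/favicon.ico'], // hypothetical extra request
  ]);
}
```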

One way to avoid this is to make requests from different local IPs, using the --local-ips option. Unfortunately, I wasn't able to establish connections from EC2 after adding more IPs, probably due to some missing configuration or routing (I was able to make it work on my local machine, though, so I'm not sure what I was missing on EC2). In any case, doing this would add to the amount of local setup we have to do before this test, and since we want to automate this from GitHub Actions, it's likely we wouldn't be able to make it work there anyway.
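
For reference, the k6 side of it is just a flag; the IP range below is made up, and the addresses would also have to be assigned to the machine's network interface first, which is the part that didn't work on EC2:

```bash
# Hypothetical source-IP range; k6 binds outgoing connections to these addresses.
k6 run --local-ips=10.0.0.10-10.0.0.20 scripts/website.js
```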

The drawback is that with http.batch() k6 could likely exceed the ~161K RPS we currently get with just http.get(). Pawel was able to reach 188K with v0.26.2, and that was with half as many VUs (30K vs. 60K) and a 0.1s sleep per iteration. So we increase the VU count and remove sleep() to compensate for not using http.batch().
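
So the current approach boils down to something like this (a sketch only: the VU count comes from the numbers above, and the duration is a placeholder rather than the actual test configuration):

```javascript
import http from 'k6/http';

export const options = {
  vus: 60000,      // double the VUs of the old batch-based run
  duration: '5m',  // placeholder, not the real test duration
};

// One plain GET per iteration, with no sleep(), to compensate for dropping http.batch().
export default function () {
  http.get('http://test.staging.k6.io/static/logo.svg');
}
```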

I think the solution should be to make the SUT available on several IPs, either by having test.staging.k6.io resolve to more than 1 IP, or by letting us know the specific IPs we should use if we want to bypass the load balancer, like we did before. Having a set of fixed IPs is not maintainable though, so resolving to more than 1 IP would be the way to go (even though we'd likely have to use noConnectionReuse to avoid sticking to a single IP, which would have its own performance drawbacks, though that's a separate discussion). Unfortunately, I didn't get much help with this from Slava, and they have their own priorities to focus on, so I didn't keep pushing.
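
For completeness, that option is a one-liner in the script; whether k6 actually spreads new connections across several returned IPs also depends on its DNS selection settings, so treat this as a sketch of the idea rather than a tested setup:

```javascript
export const options = {
  // Open a new connection for each request instead of reusing keep-alive
  // connections; this costs extra TCP handshakes per request.
  noConnectionReuse: true,
};
```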

For the purposes of these benchmarks, though, the point is not so much to reach the maximum possible number, but to set a performance baseline that we can use to compare with future results. There will undoubtedly be some variation, depending on the SUT response time, etc., but we should be able to get reliable results to determine if there are any performance changes in k6.

Hopefully that explains it, and let me know if you have any suggestions.

Thanks for the great explanation! 🙇

> Pawel was able to reach 188K with v0.26.2, and that was with half as many VUs (30K vs 60K), and a 0.1s sleep per iteration.

Interesting, what are the results for v0.26.2 on the same (current) version of the test script?

> For the purposes of these benchmarks, though, the point is not so much to reach the maximum possible number, but to set a performance baseline that we can use to compare with future results.

Yeah, I agree that we need the baseline first of all, and to run the tests regularly. The small concern I have is that if the bottleneck is currently the SUT, we can potentially catch only degradations, but not improvements. But as you said, we could address this in the future by tweaking the SUT as you described in the previous paragraph.

@imiric imiric (Author) commented Feb 9, 2023

Good point, but the SUT doesn't seem to be the bottleneck in this case. It's likely that if we were able to make requests from multiple IPs, we could achieve higher RPS.

The thing is that we shouldn't tweak the SUT in the future, otherwise it would invalidate some of these results. Ideally, the SUT performance and deployment would remain static, so that the only variation comes from different versions of k6. In practice this is difficult to achieve, since latencies and response times will fluctuate even against the same SUT and with the same hardware, which will affect the results. But hopefully the variability stays within a range that still lets us determine whether a change comes from k6 or not.

what are the results for the v0.26.2 on the same (current version) of test script.

Right, that would've been a good test. 👍 I'll run it the next time the SUT is up, and let you know.

In the meantime, please comment on #27 (comment). We need to decide on the best strategy to automate this: one that works for us, is simple for the DevOps team to maintain, and is low cost for the company. I'll invite Slava to comment there as well.


I do think it might just be better to more or less rerun for some set of versions every time.

This removes the case where the SUT changed or we deployed something else slightly differently.

It might be enough to run the previous version to see whether it gets roughly the same result, and if it does, decide the SUT is the same, and not have to rerun for all versions 🤔

scripts/website.js: 2 additional review threads (resolved)

@olegbespalov olegbespalov left a comment

🚀


@codebien codebien left a comment

LGTM

@imiric imiric merged commit b822eba into master Feb 10, 2023
@imiric imiric deleted the update/k6-v0.42.0-fixes branch February 10, 2023 11:18