k6 v0.42.0 updates - script fixes (1/5) #21
Conversation
For some reason the server is returning 403s even if the CSRF token is sent as a cookie...
We only have a single IP serving the app, so with http.batch() we would quickly exhaust all ports to the same IP.
It's not needed anymore, since this is effectively the RPS-optimized script.
```diff
-]);
-}
-sleep(0.1); // 100ms sleep between iterations.
+http.get('http://test.staging.k6.io/static/logo.svg');
```
Out of curiosity, why are we switching from the set of batches to a single request per iteration without sleep? 🤔
Good question 🙂 I should've made that clearer in the commit message.
But basically, we don't have a set of IPs to make requests to as before, and `test.staging.k6.io` always resolves to 1 IP. So if we use the same hostname in an `http.batch()` call, we would quickly exhaust the ephemeral port range for that single IP. Remember that there's a 5-element tuple for establishing a network socket, and since the source port is the only variable in that case, we quickly reach the 65,536 limit (actually, a bit less than that) and get `cannot assign requested address` errors.
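To illustrate, here's a rough sketch of the kind of batched iteration being replaced (not the exact old script; the batch size is made up):

```js
import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  // Every request in the batch goes to the same hostname, which resolves to a
  // single IP, so only the source port differs between connections.
  http.batch([
    ['GET', 'http://test.staging.k6.io/static/logo.svg'],
    ['GET', 'http://test.staging.k6.io/static/logo.svg'],
    ['GET', 'http://test.staging.k6.io/static/logo.svg'],
  ]);
  sleep(0.1); // 100ms sleep between iterations.
}
```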
One way to avoid this is to make requests from different local IPs, using the `--local-ips` option. Unfortunately, I wasn't able to establish connections from EC2 after adding more IPs, probably due to some missing configuration or routing (I was able to make it work on my local machine, though, so I'm not sure what I was missing on EC2). In any case, doing this would complicate the local setup we have to do before this test, and since we want to automate this from GitHub Actions, it's likely we wouldn't be able to make it work there anyway.
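For reference, the invocation would look something like this (the IP range is made up, and assumes those addresses are already assigned to the load generator's network interface):

```sh
k6 run --local-ips=10.0.0.10-10.0.0.20 script.js
```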
The drawback is that with `http.batch()` k6 could likely achieve more than the current ~161K RPS it gets with just `http.get()`. Pawel was able to reach 188K with v0.26.2, and that was with half as many VUs (30K vs 60K) and a 0.1s sleep per iteration. So we increase the VU count and remove `sleep()` to compensate for not using `http.batch()`.
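So the current script looks roughly like this (a sketch only; the VU count comes from the 60K figure above, but the duration here is a placeholder, not the actual benchmark config):

```js
import http from 'k6/http';

export const options = {
  vus: 60000,      // double the VUs of the old batched script (30K -> 60K)
  duration: '5m',  // placeholder duration
};

export default function () {
  // Single request per iteration, no sleep(), to compensate for dropping http.batch().
  http.get('http://test.staging.k6.io/static/logo.svg');
}
```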
I think the solution should be to make the SUT available on several IPs, by either having `test.staging.k6.io` resolve to more than 1 IP, or letting us know the specific IPs we should use if we want to bypass the load balancer, like we did before. Having a set of fixed IPs is not maintainable though, so resolving to more than 1 IP would be the way to go (even though we'd likely have to use `noConnectionReuse` to avoid sticking to a single IP, which would have its own performance drawbacks, though that's a separate discussion). Unfortunately, I didn't get much help with this from Slava, and they have their own priorities to focus on, so I didn't keep pushing.
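If we get there, it would just be a matter of something like the following (a sketch; whether this actually spreads connections across the resolved IPs would need verifying):

```js
export const options = {
  noConnectionReuse: true, // open a new connection per request so DNS can rotate between IPs
};
```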
For the purposes of these benchmarks, though, the point is not so much to reach the maximum possible number, but to set a performance baseline that we can use to compare with future results. There will undoubtedly be some variation, depending on the SUT response time, etc., but we should be able to get reliable results to determine if there are any performance changes in k6.
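As a sketch of how that baseline could eventually be encoded in the script itself (the numbers here are made up, not the actual baseline):

```js
export const options = {
  thresholds: {
    http_reqs: ['rate>150000'],     // overall RPS shouldn't drop far below the baseline
    http_req_failed: ['rate<0.01'], // and the error rate should stay negligible
  },
};
```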
Hopefully that explains it, and let me know if you have any suggestions.
Thanks for the great explanation! 🙇
> Pawel was able to reach 188K with v0.26.2, and that was with half as many VUs (30K vs 60K) and a 0.1s sleep per iteration.

Interesting, what are the results for `v0.26.2` on the same (current) version of the test script?
> For the purposes of these benchmarks, though, the point is not so much to reach the maximum possible number, but to set a performance baseline that we can use to compare with future results.

Yeah, I agree with you that we need the baseline first of all, and to run the tests regularly. The small concern that I have in mind is that if the bottleneck is currently the SUT, that means we can potentially catch only degradations, but not improvements. But as you said, we could address this in the future by tweaking the SUT as you described in the previous paragraph.
Good point, but the SUT doesn't seem to be the bottleneck in this case. It's likely that if we were able to make requests from multiple IPs, we could achieve higher RPS.
The thing is that we shouldn't do any tweaks to the SUT in the future, otherwise it would invalidate some of these results. Ideally, the SUT performance and deployment would remain static, so that the only variation comes from different versions of k6. In practice, this is difficult to achieve, since latencies and response times will fluctuate, even against the same SUT and with the same hardware, which will affect the results. But hopefully the variability is within a certain range that we find useful to determine whether the change comes from k6 or not.
> what are the results for `v0.26.2` on the same (current) version of the test script?
Right, that would've been a good test. 👍 I'll run it the next time the SUT is up, and let you know.
In the meantime, please comment on #27 (comment). We need to decide on the best strategy to automate this: one that works for us, is simple for the DevOps team to maintain, and is low cost for the company. I'll invite Slava to comment there as well.
I do think it might just be better to more or less rerun for some set of versions every time.
This removes the case where the SUT changed or we deployed something else slightly differently.
It might be enough to run the previous version to see that it gets roughly the same result, and if it does, decide the SUT is the same, and not have to rerun for all versions 🤔
🚀
LGTM
This is PR 1/5, part of updating the "Running large tests" article and running benchmarks on k6 v0.42.0.
These changes make some script fixes to get the scripts into a usable state for the tests. It's best to review commit by commit, since the indentation change is noisy.
PR order: #21 -> #22 -> #23 -> #24 -> #25 (review and merge in reverse)