External Threshold #3198
Comments
External Thresholds

Agreed, something like this probably makes sense in the long term, unless you want to embed a fully-featured metrics DB like Prometheus inside of k6 😅

Though, given the plethora of time series and metrics databases out there, maybe this can be handled as some sort of an xk6 extension? 🤔 Maybe even an optional feature that some of the already existing built-in outputs and output extensions can expose by implementing an optional interface? After all, k6 will be sending the test run metrics through these outputs to some remote DB, so they can also be used to query that same DB back, right? They'd already be configured and authenticated, after all.

Though, even that might not be necessary in the short- and mid-term... 🤔 As you've shown with your example, you can already sort of do this with existing k6 functionality. So, with a few JS wrappers and helpers, as well as minor improvements to existing k6 features (e.g. custom exit codes for

Another note that I'd bring up here is that thresholds and the end-of-test summary might need to be considered together. They are pretty closely tied together in the current k6 implementation, after all. And while that's not a good reason to continue doing so, it results in some pretty good UX, and it might even be easier to consider them together when it comes to implementing this feature too.

Also, it'd be tricky how this is handled in the cloud, which doesn't currently support external outputs. That probably needs to be supported before this feature.

Distributed Execution

In any case, I started replying here to explain why I chose to add support for metrics (both for the end-of-test summary and for thresholds) in my distributed execution PoC. It was simply because it makes for the best user experience to have the same consistent k6 behavior in both local and distributed tests, as much as that is practically possible. And, in this case, it was fairly easy to add support for metrics, so I did... 😅

I will probably publish the full design doc on Monday, but over the last few days I worked hard on splitting #2816 into even smaller self-sufficient and atomic commits. #2816 (comment) contains the details, but now the actual distributed execution (including

And the

Moreover, the way it's implemented should allow any future improvements to the built-in k6 thresholds to seamlessly apply to distributed test runs too, without any extra complexity! 🎉 🤞 That's because we are literally reusing most of the same code that powers thresholds and the end-of-test summary for local single-instance

And, as you say, it's unlikely that the current thresholds will disappear. They might even see improvements even if external thresholds are also adopted. So it makes sense to me to support them in both local and distributed test runs, given the low effort required.
+1 for this feature request - it would be great to have the ability to query a Prometheus-formatted metrics endpoint for threshold evaluation
I was looking for this feature and got pointed here. I thought I'd mention https://kube-burner.github.io/kube-burner/latest/observability/alerting/ as the sort of thing I was hoping to find in k6.
Preface:
This is more of a discussion-type issue and a reference that I can point users and contributors to.
I also didn't link every issue or PR related to this, but I will probably do so over time.
For example, #2816 implements distributed execution and working thresholds, but that covers only part of what is discussed below.
What:
External thresholds are thresholds that are evaluated outside of (external to) a k6 instance or even a k6 distributed system.
Think PromQL-compatible queries, InfluxDB queries, or just a normal HTTP request to something that tells you the status of a threshold.
For example, whether the moon is in the correct/wrong phase, if our test is affected by that.
Why:
Outside data:
A common occurrence is that people want to abort a test based not on things k6 can observe, but on things observed by a different system.
Examples:
Expressiveness:
k6 thresholds are not exactly expressive/powerful compared to most other systems dealing with metrics.
Issues:
Without distributed execution baked into k6 (and even with it), one of the main parts that currently doesn't work is thresholds.
The primary reason for that is that we have to have the metrics in one place to evaluate thresholds on them, which gets harder once you have many instances... except that a user likely wants all those metrics anyway, so they are already putting them somewhere else to visualize later.
In a lot of cases that will be a time series database of some kind, and those have APIs to query them. After all, that is more or less the whole point of collecting metrics.
Additionally, such a database uses a syntax and feature set that a user is (likely) already familiar with. If a user is using Prometheus and Cortex (for example), they likely have a good understanding of PromQL. Making them learn another way to define thresholds, instead of letting them just query the database directly, seems like bad UX. We could probably integrate PromQL, but then users of other systems would need to learn it instead of something else. And as history has shown, the most beloved system and query language changes over time.
Additionally, k6 would need to grow all the features needed to do all kinds of queries, and then the optimizations to make them work, and then we would have to teach them to users instead of letting them use what they already know with their own system. A system that they likely still need to get working anyway, in order to save and graph all their metrics, not just the ones with thresholds on them.
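To make the expressiveness gap concrete, here is a minimal sketch. The built-in threshold below uses real k6 syntax; the PromQL line is a hypothetical external equivalent, and the k6_http_req_duration_seconds_bucket metric name is purely illustrative, since the actual name depends on how the metrics are exported:

```javascript
// Built-in k6 threshold: limited to the fixed set of aggregations k6 knows.
export const options = {
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95th percentile below 500ms
  },
};

// A roughly equivalent check expressed in PromQL (metric name is illustrative),
// with the whole query language available: rates, joins, offsets, etc.
//
//   histogram_quantile(0.95,
//     sum(rate(k6_http_req_duration_seconds_bucket[1m])) by (le)) < 0.5
```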
Caveats:
This definitely isn't a feature targeting small users - especially the ones that don't want any kind of outputs and just run k6 locally and look at the summary.
The current thresholds will still need to be supported, probably forever, even if this takes off and we get really good integrations/libraries.
This also still doesn't remove the need for at least some improvements to the internal thresholds' functionality.
Alternatives:
Arguably, all the above needs is some way to execute requests and abort the test.
And k6 is all about making requests, and there is an API to abort the test.
The only problem is that it won't tell the test that it failed due to a threshold - unless we make a metric, put a threshold on it, and then use that.
If we also had a way to disable metrics emission, these requests would not "pollute" the test's own metrics. (Doing it on a per-scenario or per-VU basis might be easier and useful in other cases, and would make the API a bit easier to think about.)
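A minimal sketch of this abort-based alternative, using only existing k6 APIs (http, exec.test.abort() from k6/execution, and scenarios). The polled URL and the checked condition are placeholders, not a real endpoint:

```javascript
import http from 'k6/http';
import exec from 'k6/execution';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    load: { executor: 'constant-vus', vus: 10, duration: '2m', exec: 'load' },
    // A single watcher VU polls the external system alongside the test.
    watcher: { executor: 'constant-vus', vus: 1, duration: '2m', exec: 'watch' },
  },
};

export function load() {
  http.get('https://test.k6.io/'); // the actual load-generating code
}

export function watch() {
  sleep(5);
  // Hypothetical endpoint on some other system that knows whether we are OK.
  const res = http.get('http://example.com/are-we-ok'); // placeholder URL
  if (res.status !== 200) {
    // Aborts the whole test run, but the run is not reported as a threshold
    // failure - which is exactly the problem described above.
    exec.test.abort('external check failed');
  }
}
```

Note the caveat from above: the watcher's own requests still show up in the regular HTTP metrics, and the abort produces a different exit status than a threshold failure, which is why the metric-plus-threshold workaround in the example below is needed.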
Example
The example below uses the docker-compose setup currently in the k6 repo, which uses Grafana + InfluxDB to store and visualize the metrics.
Running it with
k6 run -o influxdb script.js
will run it while outputting metrics to InfluxDB. In this case, for simplicity of the setup, I get the metrics that k6 outputs back from InfluxDB through the Grafana API. I could do this directly with InfluxDB, I guess, but I also wanted to show that you can go through Grafana - so, for example, if you needed to query multiple backends, you could do it in one go through Grafana.
This obviously skips authentication, as I didn't want to deal with that, and it has a lot of hardcoded values. Also, while I could've aborted the script, I decided to use an internal threshold on a custom metric and use that to fail the run.
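The script itself didn't survive the copy here, so below is a minimal sketch of what it could look like, reconstructed from the description above. The Grafana datasource proxy ID (1), the database name (k6), the p(95) query, and the 200ms limit are all hardcoded assumptions, and authentication is skipped, as described:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';
import { Counter } from 'k6/metrics';

// Counter that only records violations reported by the external query.
const externalFailures = new Counter('external_threshold_failures');

export const options = {
  scenarios: {
    load: { executor: 'constant-vus', vus: 10, duration: '2m', exec: 'load' },
    // One watcher VU periodically queries the metrics back from InfluxDB.
    watcher: { executor: 'constant-vus', vus: 1, duration: '2m', exec: 'watch' },
  },
  thresholds: {
    // The internal threshold that actually fails the run.
    external_threshold_failures: ['count<1'],
  },
};

export function load() {
  http.get('https://test.k6.io/');
}

export function watch() {
  sleep(10); // give k6 time to flush some metrics to InfluxDB first

  // Query InfluxDB through Grafana's datasource proxy API; the datasource ID
  // (1) and database name (k6) are hardcoded assumptions, and there is no auth.
  const q = 'SELECT PERCENTILE("value", 95) FROM "http_req_duration" WHERE time > now() - 1m';
  const res = http.get(
    `http://localhost:3000/api/datasources/proxy/1/query?db=k6&q=${encodeURIComponent(q)}`
  );

  // InfluxDB returns {"results":[{"series":[{"values":[[<time>, <p95>]]}]}]}.
  const p95 = res.json('results.0.series.0.values.0.1');
  if (typeof p95 === 'number' && p95 > 200) {
    externalFailures.add(1); // hardcoded 200ms limit; trips the threshold above
  }
}
```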
Additional notes:
The above example would look way better with some more high-level API on top of it.
Also, any k6 API for working with thresholds would help.