Use HDR histograms for calculating percentiles in thresholds and summary stats #763
Comments
Just noticed that the go-metrics library proposed in #429 is a Go port of this Java metrics library, the original author of which was Coda Hale, the same person who authored the dead Go library for HdrHistogram linked above. Some further investigation of that metrics library (and the topic in general) may offer other benefits, so it may be worth dealing with both this issue and #429 at the same time. |
@mstoykov found an active fork of the original HDR histogram repo: https://github.com/kanosaki/hdrhistogram |
I've looked at the HDR histogram and go-metrics libraries, but it seems that their implementations of a histogram can only store int64 values, which isn't appropriate for Trend metrics. |
Hm that's a good point 😕 I think all of the internal k6 It seems to me like the choices are:
And in all these scenarios, but especially in the first 2 ones, it's not clear what we should do with custom user-defined |
What if we implement HDR histogram only for time based trend metrics? |
Yeah, that would probably work in most cases, though we don't have a guarantee that the times users choose would be close to 0. The alternative would probably be to expose the implementation a bit and add a separate parameter if the sink for the |
Or we can use the histogram with exponentially decaying samples from the go-metrics package, provided all the values are int64. This type of histogram doesn't depend on what values are stored in it, and it's much easier to rewrite if we need to support float64. |
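On the int64 limitation raised in the comments above: one conceivable workaround, which nobody in this thread has proposed as the chosen approach, is to store float64 samples as scaled integers. The sketch below is only an illustration. The `scale` constant and the `encode`/`decode` helpers are hypothetical names and are not part of k6, go-metrics, or any HDR histogram library.

```go
package main

import (
	"fmt"
	"math"
)

// scale is a hypothetical precision choice: 1000 keeps three decimal places.
const scale = 1000

// encode converts a float64 sample into a scaled int64 that an
// integer-only histogram could store.
func encode(v float64) int64 {
	return int64(math.Round(v * scale))
}

// decode converts a stored int64 back into the original unit.
func decode(n int64) float64 {
	return float64(n) / scale
}

func main() {
	sample := 12.3456 // e.g. a duration in milliseconds
	stored := encode(sample)
	fmt.Println(stored, decode(stored))
}
```

The trade-off is that anything smaller than 1/scale collapses to 0, and very large values can overflow int64 once scaled, so the scale would have to be chosen per metric.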
#1064 (comment) pointed to another approach that deserves some investigation, before we start implementing things: https://github.com/tdunning/t-digest Go versions: https://github.com/spenczar/tdigest, https://github.com/influxdata/tdigest |
I'm nobody in particular, but this would be really cool. I've been using HDR Histograms to provide a consistent interface over the stats/metrics output of different load-testing tools, to be able to chart them together coherently (e.g. You're able to get a really high degree of information density, and HDR Histogram has Currently to do this, I have to write the stdout JSONL logs to a file, create a line-reader, parse them, and then post-run build up the histogram from the logs: Gist to avoid spamming thread with unnecessary code: |
Any movement on this? I'm sad that we don't get a complete histogram for timings. k6 produces some random percentiles (0, 50, 90, 95, 100; why not 99?), but that's not sufficient to draw a complete latency chart. We've been using hdrhistogram-go for a project, and it seems mature enough to use in k6. |
@atombender you can specify whatever percentile you want.
|
@Sirozha1337 That option causes k6 to print percentiles, but they don't end up in the file specified with
|
@atombender That issue is being tracked in #1611, and a fix will likely land in v0.30.0, planned for mid-January. No updates yet for this issue as others have taken up higher priority. Most of the team is on vacation right now, but I'll discuss making this a priority for the upcoming releases. |
The Go HDR histogram repo seems to have been moved and somewhat revived at https://github.com/HdrHistogram/hdrhistogram-go However, it seems like it might be better to potentially go with another library, https://github.com/openhistogram/circonusllhist I haven't read the paper that compares it with other histogram implementations (incl. HDR histograms) yet, just watched this YouTube presentation from the authors, it definitely deserves some investigation. |
Currently the Trend-based threshold checks and end-of-test summary stats rely on saving all of the relevant metric values in-memory. This is basically a large memory leak that can negatively affect long-running and/or HTTP-heavy load tests, as reported by this user on Slack.
For the moment, the best solution for solving those issues without any loss of functionality and with only a very tiny loss of precision appears to be HdrHistogram. Here's an excerpt from the description in the above website:
This is the original Java library by Gil Tene. He also has some great talks (this and its earlier version for example) that explain some of the common pitfalls when people measure latency and why he built HdrHistogram. They're also a very strong argument why we should prioritize the arrival-rate based VU executor... 😄
This is an MIT-licensed Go implementation of HdrHistogram, though it seems to be dead: the repo is archived, with no recent commits and with unresolved issues and PRs. So we may need to fork that repo and maintain it, or re-implement the algorithm ourselves.
Another thing that HdrHistogram may help with is exposing summary stats in the teardown() function or outputting them in a JSON file at the end. This is something that a lot of users have requested - #647 and #351 and somewhat #355.
Most of the difficulty there lies in exposing the raw data to the JS runtime (and with HdrHistogram we can expose its API), and especially with implementing the stats calculation in the distributed execution environment (the current Load Impact cloud or the future native k6 cluster execution). Having trend metrics backed by HdrHistogram should allow us to avoid the need to schlep all of the raw metrics data between k6 instances (or require an external DB) at the end of a distributed test...
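The distributed case can be sketched as follows, assuming every instance uses the same bucket layout. The `Hist` type, its 1 ms linear buckets, and the `Merge` method are hypothetical simplifications and not the HdrHistogram API. The point is only that merging is an element-wise sum of counters, so each instance would need to ship a fixed-size array rather than its raw samples.

```go
package main

import (
	"fmt"
	"math"
)

// numBuckets is a hypothetical shared layout: 1 ms wide buckets, 0..9999 ms.
const numBuckets = 10000

// Hist is a fixed-size histogram that each instance fills locally.
type Hist struct {
	Counts [numBuckets]uint64
	Total  uint64
}

// Record counts a duration in milliseconds, clamping it to the layout.
func (h *Hist) Record(ms float64) {
	i := 0
	switch {
	case ms >= numBuckets-1:
		i = numBuckets - 1
	case ms > 0:
		i = int(ms)
	}
	h.Counts[i]++
	h.Total++
}

// Merge adds another instance's counters into h.
func (h *Hist) Merge(other *Hist) {
	for i := range other.Counts {
		h.Counts[i] += other.Counts[i]
	}
	h.Total += other.Total
}

// Percentile returns the lower bound (in ms) of the bucket holding
// the p-th percentile.
func (h *Hist) Percentile(p float64) int {
	target := uint64(math.Ceil(p / 100 * float64(h.Total)))
	var seen uint64
	for i := range h.Counts {
		seen += h.Counts[i]
		if seen > 0 && seen >= target {
			return i
		}
	}
	return numBuckets - 1
}

func main() {
	// Two pretend instances, each seeing half of the traffic.
	a, b := &Hist{}, &Hist{}
	for i := 1; i <= 500; i++ {
		a.Record(float64(i))
	}
	for i := 501; i <= 1000; i++ {
		b.Record(float64(i))
	}
	a.Merge(b) // aggregate on a coordinator
	fmt.Println("samples:", a.Total, "p50:", a.Percentile(50), "p99:", a.Percentile(99))
}
```

Because the merged counters are identical to what a single instance would have recorded for all samples, percentiles computed after merging match the single-instance result exactly, within the bucket resolution.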