
Rate of increase for monotonic counter #60619

Closed
wylieconlon opened this issue Aug 3, 2020 · 14 comments
Assignees
Labels
:Analytics/Aggregations Aggregations >feature stalled Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@wylieconlon

Elasticsearch should provide a new metric aggregation for use only in date histograms, which is able to calculate the increase in a monotonic counter. Because the value of a counter is always increasing, it occasionally resets from the maximum value to 0. These resets should be handled automatically by the aggregation. This aggregation requires documents to be sorted in increasing time order.

This aggregation should throw an error if values aren't monotonically increasing. The most common reason for this will be multiple sources of documents, such as multiple servers with separate counters. The error message should suggest adding another bucket aggregation, such as a terms aggregation on host.name.

The aggregation should also allow scaling to a time unit like the derivative pipeline aggregation.

Use cases for this already exist in most beats modules. For example, system.network.in.bytes is a counter-type field that will generally be converted into a "rate per second."
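A minimal sketch of the reset handling being requested, in plain Python (the function name and the convention of counting the post-reset value as the increase since the reset are illustrative assumptions, not the eventual implementation):

```python
def counter_increase(samples):
    """Total increase of a monotonic counter over time-ordered samples.
    A decrease is treated as a counter reset, and the post-reset value
    is counted as the increase since the reset."""
    total = 0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:  # reset: counter dropped back toward 0
            total += curr
    return total

# 600 bytes of increase over the window, despite the reset from 1100 to 100
print(counter_increase([1000, 1100, 100, 200, 300, 400, 500]))  # 600
```

Scaling to a time unit would then just divide this total by the bucket's time span, as the derivative pipeline aggregation does.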

@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@wylieconlon
Author

I've created a mockup to show that the rate should be calculated by looking sequentially at individual documents. If the counter resets to 0 in the middle of a bucket, the rate shouldn't be affected.

[mockup image: positive rate]

@dgieselaar
Member

Adding our use case: in APM we use a combination of max, derivative, and bucket_script aggregations to display rates from a monotonically increasing counter in our garbage collection charts.

[screenshot: garbage collection chart]

{
  aggs: {
    over_time: {
      date_histogram: getMetricsDateHistogramParams(start, end),
      aggs: {
        // get the max value
        max: {
          max: {
            field: fieldName,
          },
        },
        // get the derivative, which is the delta y
        derivative: {
          derivative: {
            buckets_path: 'max',
          },
        },
        // if a gc counter is reset, the delta will be < 0 and
        // needs to be excluded
        value: {
          bucket_script: {
            buckets_path: { value: 'derivative' },
            script: 'params.value > 0.0 ? params.value : 0.0',
          },
        },
      },
    },
  }
}

@wylieconlon
Author

wylieconlon commented Aug 12, 2020

@dgieselaar The main reason to ask for a new metric aggregation in Elasticsearch is to avoid the edge cases with the approach you've described:

  • In the diagram I drew above, the Max of field is highest right around the transition from Day 1 -> Day 2. Therefore the rate would actually be wrong when calculated by using max (the rate could ignore 23 hours of the day)
  • In the case of a metric that is tracked from multiple sources, each counter would have different values. So by using Max of field you'd only see the value of one of the counters, not all of the counters. Imagine that you have 10 servers and calculate the rate based on Max of network.in.bytes: this number would be off by 10x.
  • Finally, because you're using Max as the metric, the bucket after the counter reset will always be 0. This is because the counter could reset on Day 2, and then the Max of Day 2 is higher than the Max of Day 3.
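The third bullet can be reproduced with a few lines of Python simulating the max/derivative/bucket_script chain (the numbers are illustrative):

```python
# Daily max of a counter that resets during Day 2: Day 2's max still
# includes the pre-reset peak, so Day 3's max is lower than Day 2's.
daily_max = [1000, 1500, 600]

# derivative: delta of the max between consecutive buckets
deltas = [b - a for a, b in zip(daily_max, daily_max[1:])]

# bucket_script: clamp negative deltas to 0
rates = [d if d > 0 else 0 for d in deltas]
print(rates)  # [500, 0] -> Day 3 reports 0 even though the counter grew
```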

Here's an example of three separate counters that might have different resets:

[diagram: three counters with different resets, and their positive rates]

Based on this example, I would expect that:

  • If the rate of increase were requested for all three counters together, I would expect an error because the values aren't always increasing
  • If the rate of increase were requested for all three counters separately, it should return the results on the right instead of zeroes

@imotov imotov self-assigned this Aug 31, 2020
@not-napoleon
Member

Hey, I just want to check in on the requirements here. I spoke a bit with @wylieconlon, and he suggested I tag @exekias for input as well. Here are a couple of scenarios I'm looking at, and I would like your feedback on them. In all examples, I'm showing data as pairs of numbers, with the first representing a time and the second representing the counter value (I'm assuming it's in bytes of network data, just to have some unit to talk about). For ease of typing, I'm writing time in seconds from some nominal T=0, which will be the start of our observations. Obviously in a real application these would be milliseconds-since-epoch timestamps.

Simple case: (0, 1000), (10, 1100), (20, 1200), (30, 1300), (40, 1400), (50, 1500), (60, 1600)

In this case, we have a total of 600 bytes over 60 seconds, for a rate of 10 bytes / second, assuming a 1 minute bucket.

Spike case: (0, 1000), (10, 1000), (20, 1000), (30, 1000), (40, 1600), (50, 1600), (60, 1600)
In this case, there's a lull in traffic followed by a spike at 40 seconds, but for the whole minute bucket, we still transferred 600 bytes in 60 seconds, for a rate of 10 bytes / second.

Reset case 1: (0, 1000), (10, 1100), (20, 100), (30, 200), (40, 300), (50, 400), (60, 500)
This gets a little trickier, but I still think it's describing a 10 bytes / second rate. From 0 to 10, we observe 10 bytes / second. We don't know what happened between 10 and 20, because there's a reset discontinuity, so we ignore that block. Then from 20 seconds to 60 seconds, we observe 400 bytes in 40 seconds, still a 10 bytes / second rate.

Reset case 2: (0, 2^32 - 1000200), (10, 2^32 - 1000100), (20, 100), (30, 200), (40, 300), (50, 400), (60, 500)
I'm including this case because I've heard from a couple of folks "2^32 - 1000000 followed by 100, the rate is 1000100", but to me this is the same as the above example. You can interpret that sequence of data as "there was a spike where we shipped a megabyte in 10 seconds and rolled over the counter" or you can interpret it as "the monitoring agent got reset in that window". We don't know which happened, and there isn't anything in the data to tell us. To my mind, the only not wrong thing we can do is ignore that interval.
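This "ignore the reset interval" reading can be sketched in plain Python (function name hypothetical; not the eventual implementation):

```python
def rate_ignoring_resets(samples):
    """samples: (time, value) pairs in time order. Sums the increase over
    intervals where the counter did not decrease, divided by the time
    actually observed; reset intervals contribute neither bytes nor time."""
    total_bytes = 0
    total_time = 0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v1 >= v0:  # no reset detected in this interval
            total_bytes += v1 - v0
            total_time += t1 - t0
    return total_bytes / total_time

# Reset case 1: 100 bytes in the first 10s, then 400 bytes over 40s
samples = [(0, 1000), (10, 1100), (20, 100), (30, 200),
           (40, 300), (50, 400), (60, 500)]
print(rate_ignoring_resets(samples))  # 10.0 bytes / second
```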

Thanks in advance for your input.

@exekias

exekias commented Sep 18, 2020

I see the point about not assuming the counter reached MAX_INT when we have a reset, but we still have some information after the reset:

For case 1: (0, 1000), (10, 1100), (20, 100), (30, 200), (40, 300), (50, 400), (60, 500)

At time 20 you detect the counter reset; ignoring it would mean reporting a rate of 0 here (?). Still, we know that it increased by at least 100 since the previous sample. So you could "interpret" the data as follows, making no assumptions about the max number that was reached before the reset:

(0, 1000), (10, 1100), (20, 1100+100), (30, 1100+200), (40, 1100+300), (50, 1100+400), (60, 1100+500)

For case 2: (0, 2^32 - 1000200), (10, 2^32 - 1000100), (20, 100), (30, 200), (40, 300), (50, 400), (60, 500)
This would be:

(0, 2^32 - 1000200), (10, 2^32 - 1000100), (20, 2^32 - 1000100 + 100), (30, 2^32 - 1000100 + 200), (40, 2^32 - 1000100 + 300), (50, 2^32 - 1000100 + 400), (60, 2^32 - 1000100 + 500)

Compared to taking only the positive values, this at least accounts for the data we have just after a counter reset, which may of course be incomplete, but it is better than filling in a 0.
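The interpretation above amounts to re-basing the series at each reset. A small sketch (helper name hypothetical):

```python
def rebase_resets(values):
    """Turn a counter series with resets into a monotonic series by adding
    the last pre-reset value to everything observed after each reset."""
    offset = 0
    result = []
    prev = None
    for v in values:
        if prev is not None and v < prev:  # reset detected
            offset += prev
        result.append(v + offset)
        prev = v
    return result

print(rebase_resets([1000, 1100, 100, 200, 300, 400, 500]))
# [1000, 1100, 1200, 1300, 1400, 1500, 1600]
```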

It would also be good to think about these scenarios when samples are split across several buckets; for instance, a 10s bucket size would leave you with one value per bucket. I understand this aggregation would take the value from the previous bucket into account when calculating the rate for the next one?

@wylieconlon
Author

@exekias I think you are describing the following algorithm:

  1. When the counter decreases, ignore the possibility of an overflow and just use the new value:

rate = 0
for previous, current in zip(values, values[1:]):
    if current >= previous:
        rate += current - previous
    else:
        rate += current

This is different from the other algorithms that we could use, which are:

  2. Try to determine the overflow amount when the counter decreases, using `rate = rate + (sys.maxint - previous) + current`

  3. Treat any reset as if it adds 0, keeping the rate the same.

If we had perfect information, I think algorithm 2 would be correct most of the time. But I see your point that without perfect information, algorithm 1 might be the closest.

There is an edge case that happens pretty often in Metricbeat data which I want to add here. If the user requests the positive rate of a field that comes from multiple counters, none of these algorithms will detect it, and all of them produce wildly wrong results:

| Time | Source | Value | Rate (algorithm 1) | Rate (algorithm 2) | Rate (algorithm 3) |
|------|--------|-------|--------------------|--------------------|--------------------|
| 0    | A      | 9000  | 9000               | 9000               | 9000               |
| 1    | B      | 100   | 9100               | 2^32 - 8900        | 9000               |
| 2    | A      | 9100  | 18200              | 200                | 18000              |
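The problem can be seen by feeding the interleaved A/B stream to algorithm 1: every switch between sources looks like either a reset or a huge jump (numbers from the table above; function name hypothetical):

```python
def increase_algo1(values):
    """Algorithm 1: treat any decrease as a reset, count the new value."""
    total = 0
    for prev, curr in zip(values, values[1:]):
        total += curr - prev if curr >= prev else curr
    return total

interleaved = [9000, 100, 9100]      # samples from A, B, A interleaved
print(increase_algo1(interleaved))   # 9100: B's sample looks like a reset
print(increase_algo1([9000, 9100]))  # 100: A's true increase on its own
```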

The question I have for all of you is: are we okay with the potential for user errors here? I think it could go both ways.

@matschaffer
Contributor

Any new year updates on this? Just had it pop up again in a troubleshooting session regarding a graph of node_stats.os.cgroup.cpuacct.usage_nanos in .monitoring-es-* from stack monitoring.

Given our own components have a lot of monotonic counters, it'd be great to have better support for them in ES.

For the time being we can get by with TSVB's "Positive Rate" and a 1k "top" value at least.

@imotov
Contributor

imotov commented Feb 22, 2021

@matschaffer It turned out to be much trickier than we originally thought. The main challenge here is scalability. In the current framework, the data we get is not sorted and is distributed across multiple shards. So in general the issue is not really solvable unless we ship all the data to the coordinating node and sort it there, change the way we store the data, or come up with some heuristic approach.

@matschaffer
Contributor

That’s unfortunate, but understandable.

Should we pursue something at another layer maybe? (Roll up counters to gauges for example)

The stack itself produces many counters today, and we'll probably get more over time (thinking especially of cases like the APM agent collecting Prometheus counters).

@jasontedor or @tbragin any thoughts on how we should proceed?

@imotov
Contributor

imotov commented Feb 25, 2021

@matschaffer we understand the importance of this feature and it is still high on our priority list. One of the ideas we wanted to prototype is timestamp-sorted indices routed by counter id. That would allow us to resolve some of the issues mentioned above. There is still an issue with index rollover, but a somewhat smaller one if we can ensure that the earliest data point in the later index is always after the latest data point in the earlier index, which would also require some sort of routing mechanism that we don't have at the moment. Another approach we discussed was some sort of streaming API producing sorted data to clients; again, tricky and not ideal, since each client would have to do its own implementation and we would not be able to wrap it in other aggregations.

@matschaffer
Contributor

Yeah, streaming API to clients sounds like it could be tricky to build visualizations and alerts on (which would be the end usage for many of these counters). Thanks for confirming the priority. Hopefully we can find some path forward.

We'll keep our top:N graphs and alerts tuned high (at least 1k) in the mean time to help avoid fresh counters getting overlooked.

@wchaparro
Member

#74660

@martijnvg
Member

I think the rate aggregation on counter fields implements what is being asked here.
The rate aggregation on counter fields for TSDB will be in tech preview in 8.7.0.
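For reference, a request along these lines (expressed as a Python dict for readability; the field name is illustrative, and the exact parameters should be checked against the current docs). It assumes the field is mapped with `time_series_metric: counter` in a time-series data stream:

```python
# Sketch of a rate aggregation over a TSDB counter field. The time_series
# bucket aggregation groups documents by time series (i.e. per counter),
# which addresses the multiple-sources problem discussed above.
body = {
    "size": 0,
    "aggs": {
        "per_series": {
            "time_series": {},  # one bucket per time series
            "aggs": {
                "bytes_per_second": {
                    "rate": {
                        "field": "network.in.bytes",
                        "unit": "second",
                    }
                }
            }
        }
    }
}
```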
