Redis data store: Why are per-process keys required for MIN/MAX aggreggations? #1

olofhe · 2020-05-20T14:56:19Z

Hi!

I might be missing something, but I don't understand the need to keep individual Redis hash keys per process in order to support MIN and MAX aggregation of metrics.

It seems completely possible to only keep one key for the maximum or minimum reported value in Redis.

For MAX, simply just write the new value v_new to key k iff:

k has no old value v_old

OR

v_old < v_new

and the same for MIN, but with a flipped inequality sign.

AFAIK, there is no built-in Redis function to do this atomically in a single command (i.e. without the risk of some other process modifying the value in between read/write), but atomicity can easily be ensured using EVAL with a Lua script.

Here's an example ruby function that does this (extra verbose for readability).

def redis_set_if_greater(redis, key, val)
  lua_script =
    "local key = KEYS[1]; " \
    "local val = ARGV[1]; " \
    "local current_val = redis.call('get', key); " \
    "if (not current_val) or (tonumber(current_val) < tonumber(val)) then " \
      "redis.call('set', key, val); " \
    "end"
  redis.eval(lua_script, [key], [val])
end

Example program:

r = Redis.new
r.del "mykey"
[2, 1, 3, 2].each do |x|
  redis_set_if_greater(r, "mykey", x)
  puts "set #{x} => #{r.get("mykey")}"
end

Output:

set 2 => 2
set 1 => 2
set 3 => 3
set 2 => 3

Is there something I haven't considered? :)

Best,
Olof

The text was updated successfully, but these errors were encountered:

dmagliola · 2020-05-20T15:27:07Z

Hi!

Interesting question!
I'm not 100% confident on whether this is possible, but some thoughts:

The main reason this wasn't done this way is not necessarily that it can't be done, but that we just didn't. In other worse, there's a lot that we haven't considered. :)
The Redis Data Store was an experiment in performance for a data store that would share data between processes. Its performance was much worse than we would have expected, and much worse than other simpler options we had, so we abandoned it pretty early on. As such, it didn't get as much work and as much thought as the other data stores we are supporting.
One thing your solution wouldn't work with is increment and decrement. If you are inc/dec'ing gauges in different processes, and you then want to aggregate them and get the MIN/MAX, it's not going to do what you wanted...
This is similar to a discussion we had in a recent PR in the official client: Add :most_recent aggregation to DirectFileStore prometheus/client_ruby#172
You may find that one interesting, and there may be other related insights in there.

Just out of curiosity... Are you using this Redis-based store?
I'd like to hear more about under what circumstances it it useful!

olofhe · 2020-05-20T15:58:18Z

Hi!

Thanks for a very quick reply!

I hadn't considered incrementing/decrementing gauges. That indeed wouldn't work with my solution.

I'm surprised to hear that the Redis store performed so poorly! Instinctively, it felt like the more "natural" choice for a multi-process application. But I suppose I should focus on the officially supported DirectFileStore then :)

I'm very new to Prometheus. I was looking into setting up some metrics for a web application at work, using Rails and Sidekiq on a Docker Swarm, and was just looking at the various stores. So I haven't set up anything yet, and unfortunately can't speak for the usefulness of the Redis store. :)

Regards,
Olof

dmagliola · 2020-05-21T09:01:05Z

Instinctively, it felt like the more "natural" choice for a multi-process application

I feel this too! So much so, that out of that instinct and the surprise of the result was born a conference talk, because I just couldn't believe it...

And yes, given your use cases, I'd recommend sticking to the "official" supported data stores until you have a more specific need. This repo was mostly for documenting experiments, results, and for anyone that might want to try the mmap store, which we don't officially support, but it's technically a bit faster than DirectFileStore

dmagliola closed this as completed May 21, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redis data store: Why are per-process keys required for MIN/MAX aggreggations? #1

Redis data store: Why are per-process keys required for MIN/MAX aggreggations? #1

olofhe commented May 20, 2020

dmagliola commented May 20, 2020

olofhe commented May 20, 2020

dmagliola commented May 21, 2020

Redis data store: Why are per-process keys required for MIN/MAX aggreggations? #1

Redis data store: Why are per-process keys required for MIN/MAX aggreggations? #1

Comments

olofhe commented May 20, 2020

dmagliola commented May 20, 2020

olofhe commented May 20, 2020

dmagliola commented May 21, 2020