GAUGE multiprocess_mode single stat #154

Closed
shlimp opened this issue Apr 5, 2017 · 15 comments

Comments

@shlimp

shlimp commented Apr 5, 2017

in some cases, when using a gauge in multiprocess mode, what I really care about is the latest value set by any of the workers.
for example, let's say I have a webapp running under uwsgi. this app updates some gauge on each request, and what I really want to know is the latest value set. in that case, I don't need a separate db file for each pid; a single file would be sufficient to write to and read from (basically mimicking single-process behaviour, but using a db file to share the value across processes).

@brian-brazil
Contributor

A single file would have locking issues. I'd guess that the min or max mode will do what you need.

@shlimp
Author

shlimp commented Apr 5, 2017

the problem with using min or max is that I would not know which one to use. this is a gauge, not a counter, so I do not know whether the most recent update increased or decreased the value.

in my case, for example, I have a jobs gauge with 2 labels: total and done. total is constant while done increases until it reaches total. at that point the "master" job is done, and both labels go back to 0 until the next "master" job starts.
so each time I will have both 0 and x for total, for example, and I have no way to choose between them.

@brian-brazil
Contributor

If it's always 0 and x, then you want max.

In general it sounds like you want the default per-pid behaviour.

@shlimp
Author

shlimp commented Apr 5, 2017

I think maybe I haven't explained myself well enough here.
my starting point is 0 for total and 0 for done.
a new job comes in, setting total to X and done is now increasing as jobs are done.
in this situation, I now may have a few pids, some with 0 and some with X. in this case MAX will indeed work.
after done reaches X, I now set total and done to 0 again, since everything is done and I am waiting for the next master job.
in that case, I need MIN and not MAX since the actual value is now 0, yet I may still have some pids with X.
basically I need to know the LATEST value set by any of the workers, otherwise I have no way to determine whether I should use MIN or MAX.
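
The dilemma can be sketched without the client library. This is only an illustration, assuming each worker pid holds the value it last wrote for total; the pids and X = 10 are made up:

```python
# Hypothetical per-pid snapshots of the "total" gauge; X = 10 is illustrative.
X = 10

# Phase 1: a master job is running. pid 101 has not handled any request yet,
# pids 102 and 103 have set total to X. The true value is X.
during_job = {101: 0, 102: X, 103: X}

# Phase 2: the job finished and pid 101 handled the reset to 0, while
# pids 102 and 103 still hold the stale X. The true value is 0.
after_job = {101: 0, 102: X, 103: X}


def aggregate(values, mode):
    """Combine per-pid values the way a fixed min/max mode would."""
    if mode == "min":
        return min(values.values())
    if mode == "max":
        return max(values.values())
    raise ValueError(f"unknown mode: {mode}")


# max is right in phase 1, min is right in phase 2, yet the two
# snapshots are identical, so no fixed mode can tell them apart.
phase1_max = aggregate(during_job, "max")
phase2_min = aggregate(after_job, "min")
```

Without some notion of which write happened last, the per-pid values alone do not carry enough information.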

@brian-brazil
Contributor

There's some form of IPC going on here that I don't understand. I'd suggest getting whatever is managing all this work to be the only one setting the gauge.

@shlimp
Author

shlimp commented Apr 5, 2017

this is what is happening. the problem is that the manager is actually a uwsgi server (the workers themselves are separate servers). since this is a multiprocess wsgi server, it is the only one writing the gauge, but still from different pids.
but regardless of this specific use case, I can think of other, more common use cases for this problem,
even a simple webserver gauging the number of active users, for example. what I want from a gauge is the latest value set. how will I know which is the latest in this case?

since a gauge is a single numerical value, I think it is important to have a way to know what the latest value is, even across multiple processes.

@brian-brazil
Contributor

> a simple webserver gauging number of active users for example

You want livesum for that. No process knows what the others are doing, so you have to add them.

The latest value doesn't make sense in a multi-process app, as the processes are nominally independent.

@shlimp
Author

shlimp commented Apr 5, 2017

I was thinking in the direction of the live modes, but I didn't find any way in uwsgi to know when a process has died so that I can call multiprocess.mark_process_dead. I am also still not sure whether it would give me the end result I need, since I may still end up with multiple live processes holding different values.
anyway, I will try to figure this one out and maybe find some kind of hack to get the latest value I actually need in this case.
if I think of some kind of solution (without creating a locking issue ofc), would this use case be interesting enough to contribute?

@brian-brazil
Contributor

I still don't know enough about your use case to understand if this makes sense in the first place.

@shlimp
Author

shlimp commented Apr 5, 2017

I think the simplest way I can define it is something like a shared state between processes.
my gauge is a "state" that is being updated by multiple processes, but it should be a single value at all times.
for now I can use Gauge.set_function to handle this value myself using some kind of shared memory (that way I can manage the multiprocessing myself and not use the built in module)
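
A rough sketch of that workaround, assuming the shared value is created before the workers fork so every child inherits it, and that the client library's own multiprocess mode is not in use. The names `jobs_done_state`, `set_jobs_done`, `read_jobs_done`, and `register_gauge` are illustrative and not from this thread:

```python
import multiprocessing

# Shared double in shared memory. It must exist before the workers fork;
# whether a given uwsgi setup allows that is an assumption here.
jobs_done_state = multiprocessing.Value("d", 0.0)


def set_jobs_done(value):
    """Called by any worker; the Value's built-in lock serialises writers."""
    with jobs_done_state.get_lock():
        jobs_done_state.value = float(value)


def read_jobs_done():
    """Gauge callback: returns the latest value written by any worker."""
    with jobs_done_state.get_lock():
        return jobs_done_state.value


def register_gauge():
    # Imported lazily so the shared-state helpers can be used on their own.
    from prometheus_client import Gauge

    gauge = Gauge("jobs_done", "Jobs completed in the current master job")
    # The callback is evaluated whenever the gauge is collected.
    gauge.set_function(read_jobs_done)
    return gauge
```

Whichever process serves the scrape then reads the one shared value, so there is nothing to aggregate across pids.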

@flixx

flixx commented May 22, 2020

@brian-brazil
Hello
We have the same issue as @shlimp.

We have a metric which goes up and down (e.g. the number of admin users).
To collect the right data for the metric, we count all admin users in the database and update the Gauge accordingly whenever somebody updates the admin users.

What we are interested in is the latest value of the gauge, no matter what the other processes know about it.

As an example:

  • I have 0 admin users initially
  • I add 1 admin user with PID 1 (PID 1 reports 1)
  • I add 2 admin users with PID 2 (PID 2 reports 3)
  • I remove 1 admin user with PID 3 (PID 3 reports 2)

What is relevant here is the latest value, which is 2. That is neither the sum (6), the max (3), nor the min (1).

Without knowing the internals of the library, here are a couple of uninformed ideas of how to solve it:

  • Notify the other processes to also update the gauge
  • Only allow the process with the latest update to report
  • Allow Gauges to pull the values directly from the database when pulling from the /metrics/ endpoint
  • One file for each Gauge with some wait-for-lock-release handling.

@flixx

flixx commented Jun 11, 2020

I was able to work around this issue by using Custom Collectors. Basically, instead of updating a metric that is stored somewhere, the metrics are generated on the fly whenever /metrics/ is called. The values come directly from the database and are not process-dependent.
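
A minimal sketch of what such a collector could look like, not the actual code used here. The `users` table, its `is_admin` column, the metric name, and the use of sqlite3 are all assumptions for illustration:

```python
import sqlite3


def count_admin_users(conn):
    """Query the source of truth at scrape time (hypothetical schema)."""
    row = conn.execute("SELECT COUNT(*) FROM users WHERE is_admin = 1").fetchone()
    return row[0]


class AdminUserCollector:
    """Custom collector: computes the value on each scrape instead of storing it."""

    def __init__(self, conn):
        self.conn = conn

    def collect(self):
        # Imported here so the query helper can be exercised without the
        # client library installed.
        from prometheus_client.core import GaugeMetricFamily

        yield GaugeMetricFamily(
            "admin_users",
            "Number of admin users",
            value=count_admin_users(self.conn),
        )


# Registration, assuming the default registry is the one being exposed:
#   from prometheus_client.core import REGISTRY
#   REGISTRY.register(AdminUserCollector(conn))
```

Because nothing is stored per process, it does not matter which worker answers the scrape.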

@singerjess

Any solution for this? Having the same issue here, I want to avoid having to store items in a database just because of sync issues with Prometheus.

@zevbo

zevbo commented Feb 21, 2023

I second @singerjess about this being important.

@brian-brazil: I understand that there's some concern about requiring a lock. My thought, though, is that if you don't care about the order always being exactly correct, then skipping the lock and simply taking the most recently set value would be good enough. Given that with Prometheus we're mainly dealing with large-scale aggregate knowledge, my bet is this would be satisfactory for many people, including myself.
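
To make the idea concrete, here is a rough sketch of a lock-free "most recent wins" scheme. It is not the library's on-disk format: the per-process JSON files, the `gauge_<pid>.json` naming, and both function names are hypothetical. Each process writes only its own file with a timestamp, and the reader picks the newest sample; ordering is only as good as the clocks.

```python
import json
import os
import tempfile
import time


def write_sample(directory, pid, value, timestamp=None):
    """Each process writes only its own file, so writers never contend."""
    sample = {"value": value, "ts": time.time() if timestamp is None else timestamp}
    path = os.path.join(directory, f"gauge_{pid}.json")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sample, f)
    # Atomic rename: a reader never sees a half-written file.
    os.replace(tmp, path)


def read_most_recent(directory):
    """Return the value with the newest timestamp across all per-process files."""
    samples = []
    for name in os.listdir(directory):
        if name.endswith(".json"):
            with open(os.path.join(directory, name)) as f:
                samples.append(json.load(f))
    if not samples:
        return None
    return max(samples, key=lambda s: s["ts"])["value"]


# The admin-user sequence from earlier in the thread, with explicit timestamps.
with tempfile.TemporaryDirectory() as d:
    write_sample(d, 1, 1, timestamp=100.0)  # PID 1 reports 1
    write_sample(d, 2, 3, timestamp=101.0)  # PID 2 reports 3
    write_sample(d, 3, 2, timestamp=102.0)  # PID 3 reports 2
    latest = read_most_recent(d)  # 2, the most recently written value
```

Two writes that land within the clock resolution of each other may be ordered arbitrarily, which is the imprecision being accepted here.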

@csmarchbanks
Member

For anyone coming here, #847 is open, which should solve the remaining use cases I am seeing in the comments. Since this issue is almost 6 years old, let's work over there.
