How to plot latency and request per second with opentelemetry's Histogram type? (Kind: Cumulative) #528
Comments
It looks like delta aggregation for int64 values isn't permitted for custom metrics: https://cloud.google.com/monitoring/api/v3/kinds-and-types#kind-type-combos. @dashpole, do you have any context on that?
I don't. @liufuyang what options are available for
Hey there, sorry for the inconvenience. I think we fixed our issue by updating to the newest version (einride/cloudrunner-go#340). Thank you for the help; I will close this for now. We can reopen it if we find other problems related to using the opentelemetry exporter with Histogram metrics.
Ah, glad to hear that resolved things.
Thanks. By the way, since you are on top of this now, do you know how to use MQL to draw or derive the request rate from the CUMULATIVE duration Histogram data? I know that in PromQL something like this could do:
But on the MQL side, I am not sure how to do it. Thank you :)
Try the
Aha, nice, thank you very much :D
Based on https://cloud.google.com/monitoring/charts/charting-distribution-metrics, it seems like maybe Alternatively, you can actually use PromQL to query these metrics if you want: https://cloud.google.com/stackdriver/docs/managed-prometheus/promql
(but sum also doesn't seem to do what I want either)
Actually, I think I found it. count_from seems to give the number of events in the distribution.
Runs for me |
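For readers following along, here is a minimal Python sketch of the arithmetic behind deriving a request rate from a CUMULATIVE distribution's event count. It is not MQL and not how Cloud Monitoring computes it internally; the function name `request_rates` and the sample data are hypothetical, and the reset handling is an assumption about how a restarted stream might be treated.

```python
from typing import List, Tuple


def request_rates(samples: List[Tuple[float, int]]) -> List[float]:
    """Per-second rates between consecutive (timestamp_seconds, cumulative_count) samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        # Assumption: a drop in the cumulative count means the stream was
        # reset (for example, a process restart), so the new count is the delta.
        delta = c1 - c0 if c1 >= c0 else c1
        rates.append(delta / (t1 - t0))
    return rates


# Hypothetical samples taken every 60 seconds; the last one follows a reset.
samples = [(0.0, 0), (60.0, 120), (120.0, 300), (180.0, 30)]
print(request_rates(samples))  # [2.0, 3.0, 0.5]
```

The point is only that a cumulative count has to be differenced over a time window before it reads as requests per second.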
@dashpole Sorry to bother you again, I think I need one last bit of help here so we can use those metrics nicely in production. The question I have is how to plot the ratio between the two groups' request rates? As we know above by using I've tried it like this:
But it gives a quite wrong-looking graph. What we need is basically, during a time window of, let's say, 5 minutes, what percentage of the requests have rpc_grpc_code not as It would be much appreciated if you could give us a hand on this. I've tried to read the docs but could not understand MQL well. I also asked on Stack Overflow, but I am afraid not many people know the answer. Thank you in advance.
@liufuyang I'm quite a bit out of my MQL depth, but I think you might want to do your When I tried your query on the rtt metric above:
It gave me a graph with values between 0 and 5. But if I changed it to:
It went to a graph with values between 0 and 1, which is what I expected to see from a ratio.
Aha, thank you so much @dashpole, switching the order of group_by and filter_ratio_by indeed gives us correct-looking results 👍 Your help on this is super appreciated 🙏
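As a side note on why the order of aggregation and ratio can matter, here is a small Python sketch. It is only a guess at one way a ratio graph can exceed 1 (summing per-series ratios instead of dividing aggregated counts); it does not reproduce the MQL queries from this thread. The function names, the method labels, and the `ok_code` value are all hypothetical.

```python
from typing import Dict

# Hypothetical request counts per method, keyed by status code label.
Counts = Dict[str, Dict[str, int]]


def ratio_after_aggregation(counts: Counts, ok_code: str = "0") -> float:
    """Aggregate all series first, then divide failed by total."""
    total = sum(sum(by_code.values()) for by_code in counts.values())
    failed = sum(
        n
        for by_code in counts.values()
        for code, n in by_code.items()
        if code != ok_code
    )
    return failed / total if total else 0.0


def sum_of_per_series_ratios(counts: Counts, ok_code: str = "0") -> float:
    """Compute a ratio per series, then add the ratios together."""
    result = 0.0
    for by_code in counts.values():
        total = sum(by_code.values())
        failed = total - by_code.get(ok_code, 0)
        if total:
            result += failed / total
    return result


counts = {
    "Get": {"0": 90, "5": 10},
    "List": {"0": 50, "13": 50},
    "Delete": {"0": 10, "2": 90},
}
print(ratio_after_aggregation(counts))   # stays within [0, 1]
print(sum_of_per_series_ratios(counts))  # can exceed 1 with several series
```

With several series, the second function can return values above 1, which is the kind of symptom described above; aggregating first keeps the result a true fraction of requests.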
As you may know, this change was recently merged on the opentelemetry-go-contrib side to start reporting rpc.server.duration with a meter created as:
On our backend, we have a similar implementation. But when the data is exported to Google Cloud Monitoring, we cannot seem to find a good way to plot the latency graph.
The generated metric has Kind: CUMULATIVE on it, as shown in picture 1 below, whereas in a Google-internal Cloud Run latency graph the data has Kind: DELTA, as shown in picture 2.
So is it expected that the kind should be CUMULATIVE when GoogleCloudPlatform/opentelemetry-operations-go is used? And if so, how can I plot a latency graph on Google Monitoring?
Thank you :)
Extra info:
I am not sure what this aligner really means here, but when I choose our metric exported from this package, there is only a single delta option to choose.
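To make the CUMULATIVE versus DELTA question more concrete, here is a rough Python sketch of how two cumulative histogram snapshots can be turned into a per-window delta and then into an estimated latency percentile. This is a generic illustration, not the exporter's or Cloud Monitoring's implementation; the function names, bucket bounds, and counts are hypothetical, and linear interpolation within a bucket is an assumption.

```python
from typing import List, Sequence


def delta_buckets(prev: Sequence[int], curr: Sequence[int]) -> List[int]:
    """Bucket counts observed between two cumulative snapshots (no reset handling)."""
    if len(prev) != len(curr):
        raise ValueError("snapshots must have the same number of buckets")
    return [c - p for p, c in zip(prev, curr)]


def estimate_percentile(bounds: Sequence[float], counts: Sequence[int], q: float) -> float:
    """Estimate the q-quantile (0 < q <= 1) from bucket counts.

    bounds holds the upper bound of every bucket except the last (overflow)
    bucket, so len(counts) == len(bounds) + 1. The first bucket is assumed
    to start at 0.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations in this window")
    target = q * total
    seen = 0
    for i, n in enumerate(counts):
        if n and seen + n >= target:
            if i >= len(bounds):
                # Overflow bucket has no upper bound to interpolate toward.
                return bounds[-1]
            lower = bounds[i - 1] if i > 0 else 0.0
            upper = bounds[i]
            # Assumption: observations are spread evenly inside the bucket.
            return lower + (upper - lower) * (target - seen) / n
        seen += n
    return bounds[-1]


bounds_ms = [10.0, 50.0, 100.0]        # hypothetical bucket upper bounds
previous = [100, 40, 10, 0]            # cumulative counts at the window start
current = [180, 100, 30, 0]            # cumulative counts at the window end
window = delta_buckets(previous, current)
print(window)                          # [80, 60, 20, 0]
print(estimate_percentile(bounds_ms, window, 0.5))
print(estimate_percentile(bounds_ms, window, 0.95))
```

Seen this way, a delta-style alignment over the chart's window is what turns a CUMULATIVE distribution into something a latency chart can draw.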