Grafana autocomplete of labels issue #186

Closed
geekhead opened this issue Jun 28, 2019 · 15 comments · Fixed by #197

Comments

@geekhead

Hey, I've just been trying out promxy sitting in front of two VictoriaMetrics instances and noticed that in Grafana the autocomplete of labels does not work as expected. I have to type another curly brace to get it to pop up, but having double curly braces is invalid. It's weird, but I've attached some screenshots of it.

This screenshot shows the correct behavior going straight to my VictoriaMetrics backend:
Screen Shot 2019-06-28 at 11 11 54 AM

This screenshot is going directly to Promxy, and it doesn't autocomplete until I add another curly brace:
Screen Shot 2019-06-28 at 11 13 26 AM

I'm not sure which side is the culprit, but let me know if you need any further info. Thanks

@jacksontj
Owner

That is interesting. When I run Grafana I actually don't get the autocomplete until I type the equals sign, a quote, and a character (both for Prometheus directly and for promxy):

Screenshot from 2019-06-28 11-23-30
Screenshot from 2019-06-28 11-24-00

Promxy doesn't implement anything special for Grafana, so the easiest way to check if there is a difference is to look at the API call that Grafana is making. If you open the developer tools in your browser, you'll see the two calls:

Screenshot from 2019-06-28 11-27-28

So to compare, I'd look to see if those results vary -- they should be identical (assuming it's promxy in front of the single instance). If they vary, could you share the results?
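If it helps, here's a minimal sketch (not promxy code; the backend URLs are placeholders for your setup) of issuing the same kind of /api/v1/series call against both endpoints so the responses can be compared side by side:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// fetchSeries makes the kind of /api/v1/series call Grafana issues for
// label autocomplete and returns the status code plus the raw body.
func fetchSeries(base string) (string, error) {
	q := url.Values{}
	q.Set("match[]", `{__name__="up"}`)

	resp, err := http.Get(base + "/api/v1/series?" + q.Encode())
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%d %s", resp.StatusCode, body), nil
}

func main() {
	// Placeholder addresses: the direct backend and promxy in front of it.
	for _, base := range []string{"http://victoriametrics:8428", "http://promxy:8082"} {
		out, err := fetchSeries(base)
		if err != nil {
			fmt.Println(base, "error:", err)
			continue
		}
		fmt.Println(base, "->", out)
	}
}
```

If promxy is fronting that single instance, the two outputs should match.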

@geekhead
Author

geekhead commented Jul 1, 2019

@jacksontj Thanks for the response. I'll check it out and report back.

@geekhead
Author

geekhead commented Jul 2, 2019

@jacksontj The issue appears to be Promxy returning a status code 422 when Grafana tries to send the metric name via an XHR request to get a list of labels.

Here's the result going directly to a VictoriaMetrics backend data source:
Screen Shot 2019-07-02 at 11 41 08 AM

Screen Shot 2019-07-02 at 11 41 19 AM

And going directly to Promxy:
Screen Shot 2019-07-02 at 11 42 18 AM

Screen Shot 2019-07-02 at 11 42 24 AM

I also see the same failure reported in the Promxy logs (the encoded query decodes to match[]={__name__="up"}):
[ip address] - - [02/Jul/2019 15:48:41] "GET /api/v1/series HTTP/1.1 422 168" 0.001203 match%5B%5D=%7B__name__%3D%22up%22%7D

jacksontj added the bug label Jul 2, 2019
@jacksontj
Owner

Thanks for the details! With that I was able to reproduce the issue, and unfortunately it's an upstream client issue (prometheus/client_golang#614). TL;DR: if you don't pass a time range to that API call (as Grafana doesn't), Prometheus substitutes a huge default range that can't be validly converted to the timestamp format Prometheus requires, so the call fails. I have a few ideas on fixes, but we'll see what ideas upstream has :)
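To give a sense of scale, here's a minimal sketch of that default range, with values assumed to mirror the minTime/maxTime sentinels in Prometheus' API code (not necessarily promxy's exact code path):

```go
package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	// Defaults substituted when no start/end is passed to the series API
	// (assumed to mirror Prometheus' internal minTime/maxTime sentinels).
	minTime := time.Unix(math.MinInt64/1000+62135596801, 0).UTC()
	maxTime := time.Unix(math.MaxInt64/1000-62135596801, 999999999).UTC()

	fmt.Println(minTime) // hundreds of millions of years in the past
	fmt.Println(maxTime) // hundreds of millions of years in the future
}
```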

@jacksontj
Owner

I have a fix open upstream for this (prometheus/prometheus#5734), but unfortunately that'll require a change to the server (Prometheus or VictoriaMetrics); the root cause is a shortcoming in stdlib's time.Parse.
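To make that time.Parse shortcoming concrete, a small example (again using the assumed Prometheus-style maxTime sentinel): formatting such a far-out time to RFC3339 produces a year longer than four digits, which time.Parse then refuses to read back.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	// Assumed Prometheus-style "maximum time" sentinel.
	maxTime := time.Unix(math.MaxInt64/1000-62135596801, 999999999).UTC()

	s := maxTime.Format(time.RFC3339Nano)
	fmt.Println("formatted:", s) // the year field runs to nine digits

	// The "2006" layout element only consumes four digits, so the
	// round trip through time.Parse fails.
	_, err := time.Parse(time.RFC3339Nano, s)
	fmt.Println("parse error:", err)
}
```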

@jacksontj
Owner

I have also created an issue on VictoriaMetrics (VictoriaMetrics/VictoriaMetrics#88) to handle this case. Unfortunately this is a server-side issue (i.e. in Prometheus or VM), so there's not a ton I can do from the promxy side.

@jacksontj
Owner

So while applying the upstream fix to promxy (solving the case where promxy is the downstream), I decided to add a promxy-side workaround as well. Both of these are in #194, which is included in https://github.com/jacksontj/promxy/releases/tag/v0.0.43.

I'll go ahead and close out this issue, as there is a pending fix upstream and a workaround within promxy in the latest release. If you are still seeing issues, feel free to reopen, but from my testing they should be fixed :)
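For anyone curious what a proxy-side workaround could look like in spirit, here is a hedged sketch of the general clamping idea (not the actual change in #194): bound out-of-range times to values the downstream can parse.

```go
package main

import (
	"fmt"
	"time"
)

// clampTime bounds t to a window the downstream API can represent.
// The bounds here are illustrative placeholders, not promxy's real limits.
func clampTime(t, min, max time.Time) time.Time {
	if t.Before(min) {
		return min
	}
	if t.After(max) {
		return max
	}
	return t
}

func main() {
	min := time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	max := time.Now().UTC().Add(24 * time.Hour)

	veryOld := time.Time{} // the zero time, year 1
	fmt.Println(clampTime(veryOld, min, max)) // clamped up to min
}
```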

@geekhead
Author

geekhead commented Jul 7, 2019

Unfortunately, the issue still persists. I no longer get a 422 status code, but it still doesn't pre-populate the labels, and I see only this in the promxy debug logs:

DEBU[2019-07-07T15:17:26Z] Select matchers="[__name__=\"netdata_system_ram_MiB_average\"]" selectParams="<nil>" took="980.739µs"
172.31.69.141 - - [07/Jul/2019 15:17:26] "GET /api/v1/series HTTP/1.1 200 59" 0.001275 match%5B%5D=%7B__name__%3D%22netdata_system_ram_MiB_average%22%7D

@jacksontj
Owner

jacksontj commented Jul 7, 2019 via email

@jacksontj
Owner

Linking over here -- #193 (comment)

There was an issue with the workaround for timezones ahead of GMT; the fix is in #195.

jacksontj added a commit that referenced this issue Jul 7, 2019
jacksontj added a commit that referenced this issue Jul 7, 2019
@jacksontj
Owner

Once the better client fix is merged (prometheus/client_golang#617), I'll actually remove the workaround entirely, as it won't be needed.

@jacksontj
Owner

@geekhead FYI, I was able to reproduce similar behavior prior to the most recent fix; the problem I saw was an int64 overflow in UnixNano(), which ended up making the "startTime" hugely positive and the "endTime" hugely negative. So I expect that the issue is gone in this new release (I am no longer able to repro); as always, if you still see the issue, definitely reopen :)
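For reference, a minimal sketch of that kind of UnixNano() overflow (the year-3000 value is purely illustrative, not the exact timestamps involved here):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// UnixNano() can only represent times roughly between the years
	// 1678 and 2262; anything further out overflows int64 and wraps.
	farFuture := time.Date(3000, 1, 1, 0, 0, 0, 0, time.UTC)

	fmt.Println(farFuture.Unix())     // seconds still fit in int64
	fmt.Println(farFuture.UnixNano()) // wraps around to a negative value
}
```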

@hatemosphere
Contributor

@jacksontj with 0.0.44 we are now getting another kind of error on HPA:

E0710 10:32:30.184118       1 provider.go:207] unable to update list of all metrics: unable to fetch metrics for query "traefik_entrypoint_requests_total{namespace!=\"\",pod!=\"\",job=\"traefik-ingress-external-addon\"}": execution: 422: "end"=9223309901257973760ms is out of allowed range [-9223372036854 ... 9223372036854]
E0710 10:32:30.190915       1 periodic_metric_lister.go:60] unable to update list of all metrics: unable to fetch metrics for query "rabbitmq_queue_messages_ready": execution: 422: "end"=9223309901257973760ms is out of allowed range [-9223372036854 ... 9223372036854]

Should I create another issue for that?

@jacksontj
Owner

@hatemosphere let's create another issue for that; it seems to be something with the downstream. When you create the issue, can you provide details on the setup you have? (I don't have a provider.go in promxy, so presumably that's VM or prom?)

@jacksontj
Owner

@hatemosphere it looks like your issue is from VictoriaMetrics, and it was fixed in a later version (VictoriaMetrics/VictoriaMetrics@54bd21e seems to be the fix).
