API returns negative values for some reply types #1315
Comments
I'm not sure if this is related to your issue, but you have some strange entries in your "Domains" list. Open your debug log:
They are invalid. |
Good spot, I'll fix those values now. |
Can confirm that fixing the above noted domainlist issues does not resolve this issue. New debug token: https://tricorder.pi-hole.net/RsXFGM3V/ |
I've transferred the issue, as this also happened when querying FTL directly. |
As a note, this problem is resolved for a short period by either restarting |
@kura This means restarting |
@DL6ER yes, at least I am pretty sure that's the case. I've restarted it a few times over the last week without removing the DB. This resets the counters to 0, and they generally remain positive integers for a few hours before switching and staying negative from that point on. I have not noticed whether the number decrements to 0 before going negative or just flips from positive to negative. This is also a fairly recent problem. I consume the API values for my Grafana home network dashboard, which is why I noticed the issue. It only cropped up after the most recent update a week or two ago. |
Thanks for the additional info, this is very valuable! I'll pay special attention to changes in the last update. |
@DL6ER have 100% confirmed that just restarting Edit: I just witnessed some of the |
Okay, looking at the code and reading ...
... again, I'm pretty sure this has to do with improper query-type restoring from the database. Only recently, we implemented the ability to preserve the entire Query Log across FTL restarts (before, some details like reply times and types were lost) and looking at the code, it seems we forgot to update the counters after loading from the database. This will cause the negative numbers to appear when the garbage collection routine says "I have to subtract 1000 queries of type X" but the counter never actually had these 1000 queries. The bug perfectly makes sense - so far. Now comes the bit where I'm a bit concerned that this may not be the final solution as I also read
Without any database being there initially, the problem shouldn't have appeared in the first place as no history data could have been imported improperly. However, you may have removed the database while FTL was still running, causing FTL to restore some of it when shutting down for a restart. @kura cutting a long story short, please try my proposed fix by running
and check if the issue is resolved |
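To illustrate the mechanism described above, here is a minimal toy model in Python. It is not FTL's actual code: the function name `simulate`, the query counts, and the list-based query log are all assumptions made up for illustration. It only shows how subtracting garbage-collected queries from counters that were never incremented on import drives them negative.

```python
from collections import Counter


def simulate(restore_counters: bool) -> Counter:
    """Toy model: import history, add live queries, then garbage-collect."""
    # Hypothetical history imported from the database at startup.
    imported = ["IP"] * 1000 + ["CNAME"] * 200
    counters = Counter()
    queries = []

    for reply in imported:
        queries.append(reply)
        if restore_counters:
            # The step described in the thread as forgotten after the import.
            counters[reply] += 1

    # Live queries arriving after startup are always counted.
    for reply in ["IP"] * 50:
        queries.append(reply)
        counters[reply] += 1

    # Garbage collection: drop the oldest 1000 queries and subtract them.
    expired, queries = queries[:1000], queries[1000:]
    for reply in expired:
        counters[reply] -= 1

    return counters


print(simulate(restore_counters=False)["IP"])  # -950
print(simulate(restore_counters=True)["IP"])   # 50
```

With the counters restored on import, the subtraction only removes what was previously added, so the value stays at or above zero.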
@DL6ER I was looking back through my command history and there is a detail I left out about removing the DB which is that I ran I have switched my branch to |
One difference I have noticed: the branch checkout triggered a pihole-FTL service restart and my counters were not reset to 0. I manually restarted the service to double-check and can confirm they are now correctly carried over between restarts, which was not happening before. That makes sense given the counter increment in the PR. |
I'm affected as well... going to run the custom branch.
|
Can confirm I've been running the custom branch for 6 hours now, values have always been positive integers. I'd say you've fixed it with this change @DL6ER.
|
@kura Can you confirm the numbers don't turn negative anymore? I do still sometimes get negative ones, even with the custom branch.
|
@yubiuser the numbers have not gone negative for me since Friday. I restarted the Pi that pihole is running on 2 days ago for OS patches and the
|
Nice. How did you generate the data and graph? I could use it as well to look for patterns in when the switch happens for me. |
@yubiuser I have a TIG stack running in my network that I use to monitor servers, services, etc. Telegraf consumes the data from the Pi-hole API every 5 seconds and inserts it into InfluxDB, and I have Grafana for displaying the data. So I just made a quick graph with 4 days of data. |
I have not set up any of those tools so far - maybe a good time to learn something new. |
It's pretty quick and painless: add a couple of repos to the system and install grafana, telegraf and influxdb, or you can go down the Docker route if you're more familiar with that approach. The telegraf config section I use for Pi-hole metrics is as follows:
Once you have some stuff graphed, if you're like me you'll find yourself wanting to graph as much as possible. I graph all my internal network stuff (router, switch, NAS etc), internet connection (upload, download, packet loss, latency), all my servers including my Raspberry Pis (CPU, RAM, disk, services etc) and containers. Super nerdy but often helpful for diagnosing problems (graphing this stuff is how I noticed the API returning negative values.) |
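For anyone without a graphing stack, the same check can be done with a few lines of Python. This is only a sketch: it assumes the API summary has already been fetched and parsed into a dict, the helper name `negative_reply_counters` is made up, and the sample values are hypothetical. The `reply_` key prefix follows the counter names quoted in this thread.

```python
def negative_reply_counters(summary: dict) -> dict:
    """Return every reply_* counter in a parsed API summary that is below zero."""
    return {
        key: value
        for key, value in summary.items()
        if key.startswith("reply_") and isinstance(value, int) and value < 0
    }


# Hypothetical payload, for illustration only.
sample = {"dns_queries_today": 100, "reply_IP": -12, "reply_CNAME": 40}
print(negative_reply_counters(sample))  # {'reply_IP': -12}
```

An empty dict means no counter has gone negative at the time of the poll.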
@kura Can you compare the sum of the reply types with the total number of queries on your system? While we're already talking about Grafana: there are plans to add a native Prometheus metrics endpoint to the API. To be expected in Pi-hole v6.x (where x isn't necessarily 0). |
@DL6ER sure. A sum of
{
    "dns_queries_today": 43560,
    "dns_queries_all_types": 43560,
    "reply_NODATA": 5224,
    "reply_NXDOMAIN": 1920,
    "reply_CNAME": 8001,
    "reply_IP": 24111
}
I would very much like a native endpoint, although the API gets me by quite nicely anyway. |
Okay, that's not too surprising to see a difference. In the end, we are only showing the most important 4 out of the 12 reply types FTL supports here. As I don't recall the reasoning for including only a subset of the reply types, I added the remaining individual reply counters as well as their sum ( Please update on branch |
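The size of that difference can be worked out from the numbers @kura posted above. A small Python sketch, using only those posted values (the variable names `exposed` and `missing` are just for illustration):

```python
# Values copied from the API output quoted earlier in the thread.
summary = {
    "dns_queries_today": 43560,
    "dns_queries_all_types": 43560,
    "reply_NODATA": 5224,
    "reply_NXDOMAIN": 1920,
    "reply_CNAME": 8001,
    "reply_IP": 24111,
}

# Sum of the four reply types exposed at that point.
exposed = sum(v for k, v in summary.items() if k.startswith("reply_"))

# Queries whose reply type is not covered by those four counters.
missing = summary["dns_queries_all_types"] - exposed

print(exposed, missing)  # 39256 4304
```

So 4304 of the 43560 queries fall under reply types that were not exposed individually at that point.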
@DL6ER I checked out the
|
Ah, sorry. I was distracted because the baby woke up almost immediately after my message above. I see now that the new FTL binaries have been rejected because our automated tests found that the API response changed in an unexpected way so you received the old version with your checkout. I added the new quantities to the tests now. Please try again. |
No worries! I have updated it and used a quick bit of Python to confirm that the summed values do indeed match the total.
In [1]: s = """reply_UNKNOWN 2206
   ...: reply_NODATA 5827
   ...: reply_NXDOMAIN 2078
   ...: reply_CNAME 9460
   ...: reply_IP 26458
   ...: reply_DOMAIN 1479
   ...: reply_RRNAME 38
   ...: reply_SERVFAIL 1364
   ...: reply_REFUSED 0
   ...: reply_NOTIMP 0
   ...: reply_OTHER 0
   ...: reply_DNSSEC 0
   ...: reply_NONE 0
   ...: reply_BLOB 22"""
In [2]: sum(int(b.split(" ")[1]) for b in s.split("\n")) == 48932
Out[2]: True
Any chance that you'll keep all these reply types when this is merged into master and expose them via the API? It'd be nice to graph more than just the 4 exposed currently. :D |
Yeah, sure. They'll stay. The only bit left is to confirm that everything does line up now. Also pinging @yubiuser who was still seeing negative numbers occasionally. |
Scratch that, I just noticed that
The summed
|
Okay, so this looks like an issue with It'd also be interesting to see if you get any results for
after your checkout. |
No worries! I don't mind being a guinea pig and I'm not doing much but tinkering away on my own coding projects today anyway!
In [1]: # dns_queries_all_types 48042
In [2]: # dns_queries_today 48042
In [3]: # dns_queries_all_replies 48042
In [4]: s = """reply_UNKNOWN 1863
...: reply_NODATA 5810
...: reply_NXDOMAIN 2094
...: reply_CNAME 9363
...: reply_IP 26031
...: reply_DOMAIN 1469
...: reply_RRNAME 40
...: reply_SERVFAIL 1350
...: reply_REFUSED 0
...: reply_NOTIMP 0
...: reply_OTHER 0
...: reply_DNSSEC 0
...: reply_NONE 0
...: reply_BLOB 22"""
In [5]: sum(int(b.split(" ")[1]) for b in s.split("\n")) == 48042
Out[5]: True
As for warnings, Edit: Updated to reflect the additional new |
Confirmed: no negative reply types anymore. |
Released with https://github.com/pi-hole/FTL/releases/tag/v5.15 |
Versions
Platform
Expected behavior
When hitting the API endpoint, I would expect the values for replies to be positive numbers, but instead I see some negative values.
Actual behavior / bug
When querying the API, some of the reply_ values are negative. This is also the case when querying the stats directly from pihole-FTL, so the issue may be there.
Steps to reproduce
Steps to reproduce the behavior:
I do not really know how to reproduce this behaviour, just that it happens for me on 2 different instances (1 local on a Raspberry Pi, the other on a cloud VPS.)
Debug Token