Support marking APIs as healthy/unhealthy #812
Signed-off-by: Justin Kolberg <amd.prophet@gmail.com>
@cwjohnston had mentioned that we should mark the API as unhealthy if the stats from |
Codecov Report
@@            Coverage Diff             @@
##           master     #812      +/-   ##
==========================================
- Coverage   20.93%   20.79%   -0.14%
==========================================
  Files          28       28
  Lines        2532     2553      +21
==========================================
+ Hits          530      531       +1
- Misses       1937     1957      +20
  Partials       65       65
|
@amdprophet So uchiwa/uchiwa/daemon/daemon.go Lines 234 to 241 in 79440f3
|
@palourde the problem is trying to determine which API is unhealthy, but I just realized that a failing /info would mean the whole datacenter is unhealthy. Perhaps we shouldn't mark APIs as unhealthy based on the contents of the /info response; otherwise we would have to mark all APIs within the DC as unhealthy. |
@amdprophet can you elaborate on this? In my experience the /info API will return an HTTP 500 (internal server error) when that particular API host is disconnected from either Redis or RabbitMQ. Because /info only returns information about the API host being queried, I don't think we can reasonably consider a whole datacenter down based on a single API host's state. On the other hand, the /health endpoint could be used to determine the health of a whole datacenter, as the combination of queue depth and number of consumers reflects the overall state of the datacenter. That said, I think that reporting on /health results is outside the scope of what we're trying to do here. |
At this point, like Cameron mentioned, I feel like it's a different requirement and probably a different solution too. The health of Sensu itself (number of consumers, quantity of messages, etc.) shouldn't impact how Uchiwa deals with its datacenter APIs. However, we could definitely improve how Uchiwa reports the health of a datacenter with these metrics. |
@cwjohnston I was confusing the functionality of /info with /health and I agree with what has been said. I think this PR is in a state where it can be reviewed and merged if everything looks good. |
Description
Adds the ability to mark an API as healthy/unhealthy. Uchiwa will only make requests to APIs marked as healthy. APIs marked as unhealthy will be watched by a goroutine and marked as healthy once they respond with an HTTP 2xx status.
Related Issue
Fixes #811.
Motivation and Context
See #811.
How Has This Been Tested?
I've tested with the following configuration:
I've tested Uchiwa with both APIs down and with one API up and one API down, and then started the APIs that were down to confirm that they were marked as healthy again.