This repository has been archived by the owner on Nov 8, 2022. It is now read-only.
snap high availability #773
I would also be interested in this. When creating a task for a "tribe", it would be great if there were an option for the task to run on only one of the tribe members rather than on all of them. If that tribe member went offline, the task should then be picked up by another tribe member.
+1, this is something that would be useful for integrating Snap with LMA/StackLight (a monitoring solution for OpenStack clouds). Typically, StackLight gets part of its metrics by querying OpenStack API endpoints. We do this from a single client at fixed intervals, and if that client fails, we fail over to another client instance.
Given that this is a question and it's covered by the other two RFCs, closing this thread out as a successful discussion of the need 👍
Let's assume that I have a cluster of nodes that I'd like to monitor. I can (or must) do this out-of-band (proxy mode), meaning that snapd is not installed on the target nodes but on dedicated nodes that can communicate with the targets and retrieve metrics out-of-band (for example via a REST API or IPMI). Since it's important to have all the metrics all the time, I'd like to be sure that a failure of the nodes hosting snapd is mitigated. As an example, there are 3 nodes hosting snapd and only one is retrieving the metrics, but when it fails the other node(s) take over seamlessly. I also don't want duplicated metrics, so those 3 snapd instances cannot all run the same workflow, as this would result in the same metrics being retrieved and published 3 times. Can tribe somehow help achieve HA in such a scenario? Would we need new features to support it?
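To make the desired behavior concrete, here is a minimal sketch of one common way to get "exactly one active collector with failover": a time-limited lease in a shared store. This is not how Snap or tribe works; `LeaseStore`, `Collector`, and `simulate` are hypothetical names, the store is an in-memory stand-in for something all three hosts could reach, and time is simulated with integer ticks.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LeaseStore:
    """In-memory stand-in for a shared lease store (hypothetical, not part of Snap)."""
    ttl: float
    holder: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, node: str, now: float) -> bool:
        """Grant or renew the lease if it is free, expired, or already held by `node`."""
        if self.holder is None or now >= self.expires_at or self.holder == node:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False


@dataclass
class Collector:
    """One snapd-like host that collects only while it holds the lease."""
    name: str
    alive: bool = True

    def tick(self, store: LeaseStore, now: float, published: List[Tuple[float, str]]) -> None:
        if not self.alive:
            return  # a failed host neither renews the lease nor publishes
        if store.try_acquire(self.name, now):
            published.append((now, self.name))  # stands in for collect + publish


def simulate() -> List[Tuple[float, str]]:
    """Run three collectors for six ticks; the first one fails at tick 2."""
    store = LeaseStore(ttl=2.0)
    nodes = [Collector("a"), Collector("b"), Collector("c")]
    published: List[Tuple[float, str]] = []
    for t in range(6):
        if t == 2:
            nodes[0].alive = False
        for node in nodes:
            node.tick(store, float(t), published)
    return published


if __name__ == "__main__":
    for when, who in simulate():
        print(f"t={when:.0f} published by {who}")
```

In this simulation each tick is published at most once, and after host `a` fails, host `b` takes over once the lease expires. The takeover is not instantaneous: there is a gap of up to one lease TTL (tick 2 here goes unpublished), so the TTL is a trade-off between failover speed and the risk of two hosts briefly believing they are active.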