This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

snap high availability #773

Closed
andrzej-k opened this issue Mar 16, 2016 · 4 comments

Comments

@andrzej-k
Contributor

Let's assume that I have a cluster of nodes that I'd like to monitor. I can (or must) do this out-of-band (proxy mode), meaning that snapd is not installed on the target nodes but on dedicated nodes that can communicate with the targets and retrieve metrics out-of-band (for example, via a REST API or IPMI). Since it's important to have all the metrics all the time, I'd like to be sure that a failure of the nodes hosting snapd is mitigated. As an example, there are 3 nodes hosting snapd and only one is retrieving the metrics, but when it fails, the other node(s) take over seamlessly. I also don't want duplicated metrics, so those 3 snapd instances cannot all run the same workflow, as this would result in the same metrics being retrieved and published 3 times. Can tribe somehow help achieve HA in such a scenario? Would we need new features to support it?

@woodsaj
Contributor

woodsaj commented Mar 22, 2016

I would also be interested in this. When creating a task for a "tribe", it would be great if there were an option for the task to run on only 1 of the tribe members rather than on all of them. If that 1 tribe member went offline, the task should then be picked up by another tribe member.
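
To make the "run on only 1 member, fail over to another" idea concrete, here is a minimal sketch in Go. It is not Snap's tribe API: `pickOwner`, `shouldRun`, and the member names are hypothetical, and the sketch assumes every member sees the same list of live members. Each member sorts the live member names and treats the first one as the task owner, so exactly one member runs the task, and when that member drops out of the list the next one takes over.

```go
package main

import (
	"fmt"
	"sort"
)

// pickOwner returns the member that should run a single-owner task.
// Hypothetical helper: it assumes all members evaluate it against the
// same membership view. It returns "" when no member is alive.
func pickOwner(alive map[string]bool) string {
	names := make([]string, 0, len(alive))
	for name, up := range alive {
		if up {
			names = append(names, name)
		}
	}
	if len(names) == 0 {
		return ""
	}
	// Sorting makes the choice deterministic across members.
	sort.Strings(names)
	return names[0]
}

// shouldRun reports whether the local member currently owns the task.
func shouldRun(self string, alive map[string]bool) bool {
	return pickOwner(alive) == self
}

func main() {
	members := map[string]bool{"snapd-1": true, "snapd-2": true, "snapd-3": true}
	fmt.Println(pickOwner(members)) // prints "snapd-1"

	members["snapd-1"] = false      // simulate failure of the current owner
	fmt.Println(pickOwner(members)) // prints "snapd-2"
}
```

If membership views diverge for a moment (as can happen with gossip-style membership), two members could briefly both consider themselves the owner, or none could, so a real feature would likely need a lease or some other form of agreement on top of this.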

@simonpasquier

+1, this is something that would be useful for integrating Snap with LMA/StackLight (a monitoring solution for OpenStack clouds). Typically, StackLight gets part of its metrics by querying OpenStack API endpoints. We do it from a single client at fixed intervals, and if that client fails, we fail over to another client instance.

@mbbroberg
Contributor

I'd like to consider this for our next round of roadmap planning @bjray @jcooklin

@mbbroberg
Contributor

Given that this is a question and it's covered by the other two RFCs, closing this thread out as a successful discussion of the need 👍

4 participants