Allow getting all catalog services for list of DataCenters #526
Comments
@sethvargo Any thoughts on the value of this, I'm happy to write it up if it would be valuable. |
Hi @Split3 We have not had a lot of requests for this functionality, so I do not think it's something we plan to add to Consul Template ourselves. If this is something that the community finds valuable, we would likely look to a community contribution for it. That being said, it does sound somewhat anti-Consul to ignore the per-datacenter constraints for a higher-level load balancer. You are also potentially introducing a very large number of watches: (DCs x Catalog Service x Health Service) is at least an n^2 operation, if not n^3, for large numbers of data centers. I worry about the load on the Consul cluster and the performance implications of something like this. |
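To put rough numbers on the watch concern above, here is a minimal back-of-the-envelope sketch. It assumes one watch for the datacenter list, one catalog watch per datacenter, and one health watch per service per datacenter; that breakdown and the function name `estimated_watches` are illustrative assumptions, not figures published by Consul or Consul Template.

```python
from typing import Dict


def estimated_watches(services_per_dc: Dict[str, int]) -> int:
    """Estimate watches held by ONE template instance.

    Assumed breakdown (hypothetical, for illustration only):
      1 watch on the datacenter list
      1 catalog-services watch per datacenter
      1 health-service watch per service in each datacenter
    """
    datacenter_list_watch = 1
    catalog_watches = len(services_per_dc)
    health_watches = sum(services_per_dc.values())
    return datacenter_list_watch + catalog_watches + health_watches


# Three datacenters with 50 services each.
total = estimated_watches({"dc1": 50, "dc2": 50, "dc3": 50})
```

The total grows with both the number of datacenters and the number of services, and it is multiplied again by however many hosts run the template.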
Our use case is as follows: We have multiple physical data centers for our services. What we want to do is configure the load balancers in each datacenter with all appropriate services in all datacenters, where the near datacenter (the same one the service exists in) is the primary backend for a load-balanced endpoint, while the far datacenters are the backups. This is why we basically need to know about ALL services available to Consul regardless of the datacenter they exist in. We use tags to limit the services that are used by the load balancers, but if we had a service in DC1 only we would want the DC2 load balancers to also support routing traffic to those nodes. Hopefully that makes some sense; if not I can try to explain better, and if there is a better approach to handling this particular situation I'm all ears. |
Hi @Split3 this is a pretty interesting use case. If you weren't using load balancers, you could configure prepared queries to do this for you and it could figure out where the backup DC should be based on network round-trip time (or you could configure it). There's also an open issue to allow for a "parent" DC to be specified - hashicorp/consul#1159. Would either of these be an option for your infrastructure? |
I looked at leveraging prepared queries, but they require you to specify the DC that is searched, which wouldn't work in this particular use case. The reason behind this is to ensure a high level of availability even when a physical data center is unavailable. We also want this to be zero touch from a configuration standpoint, so the only way that would be somewhat possible would be to get all the available services in the desired data centers, filter them by the associated tags we use, and then add the backends to the load balancer with the current data center being the primary (if they exist) and the other data centers being the backups (if they exist). Our load balancers for each DC are fronted by a GTM that balances between those physical endpoints based on speed, latency, and performance. This is why we want all services configured in both locations: it simplifies the overall routing from the GTM to each data center. |
I see. There's definitely concern about the large number of watches this will do, but how about something like this:
This'll loop through all services in all datacenters. |
Yea the problem isn't looping through, the problem is more having the data grouped together properly. For instance we need to produce something like

In DC1:

service foo {
    backend foo-dc1-instance;
    backend foo-dc2-instance backup;
}

In DC2:

service foo {
    backend foo-dc2-instance;
    backend foo-dc1-instance backup;
}

So at a high level I need to know all available services and then when looping through the Health Services for the associated Data Centers I can associate the backup instance appropriately based on which data center I'm querying them from. What raises questions for me now is the load that @sethvargo mentioned. Are there any published numbers at what the loads can get to and what is tolerated by Consul? |
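The grouping described above can be sketched outside of a template as follows. This is a minimal illustration, assuming the instance data has already been collected into a nested mapping of datacenter to service to instance names; the function name `render_service_blocks` and the input shape are hypothetical and are not part of Consul Template.

```python
from typing import Dict, List

# Hypothetical input shape: {datacenter: {service: [instance names]}}
Registry = Dict[str, Dict[str, List[str]]]


def render_service_blocks(local_dc: str, registry: Registry) -> str:
    """Render one block per service, local instances first, remote as backups."""
    # Union of service names across every datacenter, so a service that
    # exists only in a remote datacenter still gets a block.
    services = sorted({name for per_dc in registry.values() for name in per_dc})
    lines: List[str] = []
    for service in services:
        lines.append(f"service {service} {{")
        # Instances in the local datacenter are the primary backends.
        for instance in registry.get(local_dc, {}).get(service, []):
            lines.append(f"    backend {instance};")
        # Instances in every other datacenter are marked as backups.
        for dc in sorted(registry):
            if dc == local_dc:
                continue
            for instance in registry[dc].get(service, []):
                lines.append(f"    backend {instance} backup;")
        lines.append("}")
    return "\n".join(lines)


registry: Registry = {
    "dc1": {"foo": ["foo-dc1-instance"]},
    "dc2": {"foo": ["foo-dc2-instance"]},
}
dc1_config = render_service_blocks("dc1", registry)
dc2_config = render_service_blocks("dc2", registry)
```

Building the union of service names first is what lets a service registered only in DC1 still appear, with backup-only backends, in the DC2 output.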
@Split3 it depends on the size of your cluster and the type of servers you're running honestly. You're essentially creating a watch on every non-key-value item in Consul, which means you have significant churn and data flow. It is very likely that you will constantly be restarting your load balancers because any change in any of the data in Consul will result in the template changing. As far as actual load, it depends how many instances of this template you're running. |
In terms of the template load that could be mitigated by Would a better approach be to use the KV store to |
@Split3 you are correct that
|
@slackpad Thanks for the attempt there, but due to the way the template must be generated and the grouping needed this still wouldn't work, as we need to create service blocks for all services in all datacenters. I've instead decided to go with a different approach and use the KV store to declaratively define the services that should be consumed from the registry. Though it now requires an initial "setup" step, this will reduce the amount of load on the Consul servers as we won't have watches configured on all the pieces mentioned by @sethvargo. Thanks for the help! |
The use case here is that we want to be able to query all services in the Consul registry, meaning all services in all data centers. This would allow us to configure our load balancers with all possible service routes no matter what data center they are in.
Currently I was thinking we could accomplish this by adding a function that takes an array of data centers, or defaults to all of them, and retrieves the catalog services for each data center. This would be done by first querying the data centers and then iterating over that list to get all the catalog services for each datacenter.
This is mainly to get around the fact that the Consul API doesn't currently support this kind of _all_ searching. I am happy to take the time to contribute this if it is considered valuable.
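The proposed function could look roughly like the sketch below, written against the Consul HTTP API rather than as a Consul Template function. It uses the `/v1/catalog/datacenters` and `/v1/catalog/services?dc=` endpoints; the function name `catalog_services_all`, the default agent address, and the injectable `fetch` parameter are assumptions made for illustration.

```python
import json
from typing import Callable, Dict, List, Optional
from urllib.parse import quote
from urllib.request import urlopen


def http_get_json(url: str):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def catalog_services_all(
    datacenters: Optional[List[str]] = None,
    base_url: str = "http://127.0.0.1:8500",  # assumed local agent address
    fetch: Callable[[str], object] = http_get_json,
) -> Dict[str, Dict[str, List[str]]]:
    """Return {datacenter: {service name: [tags]}}.

    If no datacenters are given, default to every datacenter known to Consul.
    """
    if datacenters is None:
        datacenters = fetch(f"{base_url}/v1/catalog/datacenters")
    return {
        dc: fetch(f"{base_url}/v1/catalog/services?dc={quote(dc)}")
        for dc in datacenters
    }
```

Calling `catalog_services_all()` queries every known datacenter, while `catalog_services_all(["dc1"])` restricts the lookup. Note that this issues one request per datacenter, so the load concerns raised in the comments still apply.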