pkg/srv, embed, etcdmain: Support multiple clusters in the same DNS domain #8690
Codecov Report
@@ Coverage Diff @@
## master #8690 +/- ##
==========================================
- Coverage 75.87% 75.84% -0.04%
==========================================
Files 363 363
Lines 30161 30168 +7
==========================================
- Hits 22884 22880 -4
- Misses 5670 5689 +19
+ Partials 1607 1599 -8
Continue to review full report at Codecov.
|
/cc @hexfusion can you take a look at this PR? |
@xiang90, 👍 |
@xiang90 / @hexfusion Thanks for taking a look. Let me know how I can help. If this overall approach looks good I'm happy to rebase on master to resolve the conflict. |
etcdmain/config.go
Outdated
@@ -152,6 +152,8 @@ func newConfig() *config {

fs.StringVar(&cfg.Dproxy, "discovery-proxy", cfg.Dproxy, "HTTP proxy to use for traffic to discovery service.")
fs.StringVar(&cfg.DNSCluster, "discovery-srv", cfg.DNSCluster, "DNS domain used to bootstrap initial cluster.")
fs.StringVar(&cfg.DNSClusterServiceNameSSL, "discovery-srv-name-ssl", cfg.DNSClusterServiceNameSSL, "Service name to query when using DNS discovery for SSL")
Can we reduce the configurable flag down to one? The goal is to be able to have N etcd discovery under one domain, not to make the ssl suffix configurable?
Yes. We already know at this point whether etcd is using TLS, right? How can we get this information from etcd? Could you name your etcd service _etcd-server-ssl?
Then, in the case of cluster 'a', the combined SRV name would be _etcd-server-ssl-a._tcp.service.consul. when using
--discovery-srv-name a.
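For illustration, here is a minimal Go sketch of how a per-cluster suffix could be folded into the SRV query name described in the comment above. The helper name `srvQueryName` and its parameters are hypothetical and are not taken from the patch; only the base service names and the resulting name format come from this thread.

```go
package main

import "fmt"

// srvQueryName is a hypothetical helper (not part of the patch under review).
// It picks the base service name from the TLS setting, appends the optional
// per-cluster suffix, and joins the result with the discovery domain.
func srvQueryName(tls bool, suffix, domain string) string {
	service := "etcd-server"
	if tls {
		service = "etcd-server-ssl"
	}
	// An empty suffix keeps the existing, unsuffixed service name.
	if suffix != "" {
		service += "-" + suffix
	}
	return "_" + service + "._tcp." + domain
}

func main() {
	// Cluster "a" with TLS under a Consul-style domain.
	fmt.Println(srvQueryName(true, "a", "service.consul"))
	// No suffix and no TLS: the name that was previously hardcoded.
	fmt.Println(srvQueryName(false, "", "foo.local"))
}
```

With a single suffix value, the TLS and non-TLS names stay derivable from one flag, which is the simplification the reviewers ask for.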
@tavish-stripe I am OK with the motivation. |
@tavish-stripe thank you for your contribution! Could you do me a favor and create an issue for this PR that briefly outlines the limitations you faced with your DNS setup and how this PR resolves them? You could copy and paste a lot of what you have above. It would be good to track the issue while we work through this. I hope to have some time to review after the holiday weekend. Thanks again, Happy Holidays! |
@tavish-stripe, just checking in. I am trying to better understand the issues you faced with your DNS setup. For example, why does this type of setup not work for you? It would help me a lot to get some context on why this change is necessary.
cluster 1
--discovery-srv=foo.local
;SRV Records
_etcd-client-ssl._tcp.foo.local. 300 IN SRV 0 0 2379 node1.foo.local.
cluster 2
--discovery-srv=sub.foo.local
;SRV Records
_etcd-client-ssl._tcp.sub.foo.local. 300 IN SRV 0 0 2379 node4.sub.foo.local.
|
ping @tavish-stripe :) |
@hexfusion thanks for the ping! With our internal service configuration setup everything is under the same |
@tavish-stripe thank you for the reply. If I understand you correctly, you're saying that your DNS setup cannot be configured to allow subdomains for SRV records, so all SRV records are defined at the root domain? And as a workaround you are passing the SRV name to etcd to circumvent this? I think that is a crafty solution :) but at that point we are hacking the DNS results, which are supposed to define the discovery. This would be a hybrid DNS discovery mechanism, one that uses parts of the SRV records and manually overrides others. Please feel free to chime in if my assessment is wrong. Before we go down this route I would like to propose an alternative discovery mechanism based on TXT records and see if it would be useful for others. I believe it could also solve your problem. See #9128 |
@hexfusion sorry for the delay. Since we're using Consul for service discovery internally, it's very easy for us to register each etcd cluster in our consul cluster as a separate service. This allows us to use Consul's automatically generated SRV records for each etcd service in the cluster. As a result, we get SRV services registered at Using SRV discovery (with our own provided service names) allows us to easily use Consul DNS to bootstrap our etcd clusters without having to configure DNS specially for each cluster. The TXT solution you suggested does not allow us to leverage Consul's built-in functionality.
It might be helpful to clarify our use case. The solution I'm proposing uses all of the SRV responses and does not override any of them. Here's an example query and response generated by consul. There are two clusters here advertising a client service each (
Does that help clarify the use case? |
@tavish-stripe this is very helpful will review shortly. Thanks! |
The problem, in my experience, is that SRV discovery causes a lot of confusion for many folks. So, to @xiang90's point, we want to keep this to a single flag for simplicity. We would also need to add documentation and tests.
I will comment on the code a bit now. |
embed/config.go
Outdated
var clusterStrs []string
var cerr error
var serviceName string
var scheme string
var (
	clusterStrs []string
	cerr        error
	serviceName string
	scheme      string
)?
@tavish-stripe this is a good idea. Please rework your logic based on a single flag. Looking forward to getting this into core! Let us know if you have any questions. |
@hexfusion that seems like a reasonable solution. One alternative would be to provide the name and have |
@tavish-stripe I feel the definition should be explicit, but to your point, flexibility is a consideration. I defer to @xiang90. |
Another thing is that if we keep this |
Yeah, that seems like a reasonable argument. I'll wait for @xiang90 's opinion and then rework the patch. |
d388145
to
f44123b
Compare
f44123b
to
b664b91
Compare
@tavish-stripe since @xiang90 has not added a comment I will assume he has no strong feelings. I would like to keep the existing format |
Sorry for the delay. I do not have strong feelings. I just want to keep it down to one flag and make it as clear as possible. We probably need to document it and provide an example in the configuration doc as well. |
@hexfusion @xiang90 I've reworked the patch to use the suffix. Please take a look and let me know if anything needs changing.
Do I need to include that as part of this PR in the |
@tavish-stripe please include in this PR as a single commit. |
The documentation should probably be added here: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/clustering.md#dns-discovery |
@hexfusion @xiang90 I made a first cut of documentation for the new flag. Please let me know if there's anything I should change. |
@tavish-stripe This PR looks good to me in general. I would rely on @hexfusion and @gyuho or @spzala to give a final LGTM before merging. Thanks. |
+ Suffix to the DNS srv name queried when bootstrapping using DNS.
+ default: ""
+ env variable: ETCD_DISCOVERY_SRV
@tavish-stripe ETCD_DISCOVERY_SRV_NAME ?
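The correction above is consistent with deriving the environment variable from the flag name: prefix with `ETCD`, upper-case the name, and replace hyphens with underscores. The sketch below assumes that convention and the flag name `discovery-srv-name` used earlier in this thread; `flagToEnv` is an illustrative helper name, not a claim about the project's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// flagToEnv is a hypothetical helper that maps a command-line flag name to
// an environment variable name under the assumed convention:
// PREFIX + "_" + upper-cased flag name with "-" replaced by "_".
func flagToEnv(prefix, name string) string {
	return prefix + "_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

func main() {
	// The existing domain flag and the new suffix flag map to distinct variables.
	fmt.Println(flagToEnv("ETCD", "discovery-srv"))
	fmt.Println(flagToEnv("ETCD", "discovery-srv-name"))
}
```

Under this mapping the two flags yield different variable names, so documenting the new flag with `ETCD_DISCOVERY_SRV` would collide with the existing one.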
@tavish-stripe one little nit, otherwise looks great! |
…service name in DNS discovery.
ca99cb2
to
81c9f78
Compare
lgtm
lgtm. Thanks @tavish-stripe @hexfusion
👍 |
This PR adds the ability to configure the subdomain used for DNS cluster discovery. Previously, the domain was hardcoded to _etcd-server._tcp.FOO and _etcd-server-ssl._tcp.FOO. This did not work with our DNS setup, so I've made this patch to allow configuring the subdomain portion of the SRV queries.
We've been running this in production for a few months.