etcd shouldn't permit duplicate node names in ETCD_INITIAL_CLUSTER #7927
Comments
Maybe etcd should treat this as multiple peer URLs for the one member? It definitely shouldn't drop input on the ground like that, at least. /cc @xiang90
@alexzorin I'm not seeing this behavior. I tried starting a new cluster:
./bin/etcd -name etcd -initial-cluster "etcd=http://127.0.0.1:2380,etcd=http://10.7.29.60:2380" --initial-advertise-peer-urls "http://127.0.0.1:2380,http://10.7.29.60:2380"
I see both peer addresses in
2017-06-09 16:35:07.125761 I | etcdmain: etcd Version: 3.2.0-rc.1+git
2017-06-09 16:35:07.125950 I | etcdmain: Git SHA: 933aa09
2017-06-09 16:35:07.125957 I | etcdmain: Go Version: go1.8
2017-06-09 16:35:07.125962 I | etcdmain: Go OS/Arch: darwin/amd64
2017-06-09 16:35:07.125969 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2017-06-09 16:35:07.125977 N | etcdmain: failed to detect default host (default host not supported on darwin_amd64)
2017-06-09 16:35:07.125986 W | etcdmain: no data-dir provided, using default data-dir ./etcd.etcd
2017-06-09 16:35:07.126722 I | embed: listening for peers on http://localhost:2380
2017-06-09 16:35:07.126955 I | embed: listening for client requests on localhost:2379
INITIAL PEERURLSMAP in NewServer: etcd=http://10.7.29.60:2380,etcd=http://127.0.0.1:2380
2017-06-09 16:35:07.128339 I | etcdserver: name = etcd
2017-06-09 16:35:07.128353 I | etcdserver: data dir = etcd.etcd
2017-06-09 16:35:07.128358 I | etcdserver: member dir = etcd.etcd/member
2017-06-09 16:35:07.128361 I | etcdserver: heartbeat = 100ms
2017-06-09 16:35:07.128364 I | etcdserver: election = 1000ms
2017-06-09 16:35:07.128368 I | etcdserver: snapshot count = 100000
2017-06-09 16:35:07.128377 I | etcdserver: advertise client URLs = http://localhost:2379
2017-06-09 16:35:07.128384 I | etcdserver: initial advertise peer URLs = http://10.7.29.60:2380,http://127.0.0.1:2380
2017-06-09 16:35:07.128390 I | etcdserver: initial cluster = etcd=http://10.7.29.60:2380,etcd=http://127.0.0.1:2380
2017-06-09 16:35:07.211158 I | etcdserver: starting member 22730d60c7d1e6bc in cluster fe7127deeb881c7a
2017-06-09 16:35:07.211221 I | raft: 22730d60c7d1e6bc became follower at term 0
2017-06-09 16:35:07.211239 I | raft: newRaft 22730d60c7d1e6bc [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2017-06-09 16:35:07.211246 I | raft: 22730d60c7d1e6bc became follower at term 1
CLUSTER: {ClusterID:fe7127deeb881c7a Members:[&{ID:22730d60c7d1e6bc RaftAttributes:{PeerURLs:[http://10.7.29.60:2380 http://127.0.0.1:2380]} Attributes:{Name:etcd ClientURLs:[]}}] RemovedMemberIDs:[]}
2017-06-09 16:35:07.213054 W | auth: simple token is not cryptographically signed
2017-06-09 16:35:07.213562 I | etcdserver: starting server... [version: 3.2.0-rc.1+git, cluster version: to_be_decided]
2017-06-09 16:35:07.214576 E | etcdserver: cannot monitor file descriptor usage (cannot get FDUsage on darwin)
2017-06-09 16:35:07.215555 I | etcdserver/membership: added member 22730d60c7d1e6bc [http://10.7.29.60:2380 http://127.0.0.1:2380] to cluster fe7127deeb881c7a
2017-06-09 16:35:08.113146 I | raft: 22730d60c7d1e6bc is starting a new election at term 1
2017-06-09 16:35:08.113314 I | raft: 22730d60c7d1e6bc became candidate at term 2
2017-06-09 16:35:08.113362 I | raft: 22730d60c7d1e6bc received MsgVoteResp from 22730d60c7d1e6bc at term 2
2017-06-09 16:35:08.113392 I | raft: 22730d60c7d1e6bc became leader at term 2
2017-06-09 16:35:08.113404 I | raft: raft.node: 22730d60c7d1e6bc elected leader 22730d60c7d1e6bc at term 2
2017-06-09 16:35:08.113644 I | etcdserver: setting up the initial cluster version to 3.2
2017-06-09 16:35:08.128898 N | etcdserver/membership: set the initial cluster version to 3.2
2017-06-09 16:35:08.128930 I | etcdserver: published {Name:etcd ClientURLs:[http://localhost:2379]} to cluster fe7127deeb881c7a
2017-06-09 16:35:08.129027 I | embed: ready to serve client requests
2017-06-09 16:35:08.129279 I | etcdserver/api: enabled capabilities for version 3.2
2017-06-09 16:35:08.129505 N | embed: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
If I restart etcd with the same arguments, I see the initial peers map is cleared out (but the information is already stored in the data directory).
So I can't get it to drop nodes from the map while keeping the last one. What am I missing?
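For illustration, the merging visible in the log above (both URLs ending up under the single name etcd, in sorted order) can be sketched with the Go standard library. This is only a minimal sketch of that observed behavior, not etcd's actual parser; parseInitialCluster is a hypothetical name introduced here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseInitialCluster is a hypothetical helper, not etcd's own parser.
// It splits "name=url,name=url" pairs and groups the URLs under each name,
// so a repeated name accumulates URLs instead of overwriting earlier ones.
func parseInitialCluster(s string) (map[string][]string, error) {
	peers := make(map[string][]string)
	for _, pair := range strings.Split(s, ",") {
		parts := strings.SplitN(pair, "=", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("malformed entry %q, want name=url", pair)
		}
		peers[parts[0]] = append(peers[parts[0]], parts[1])
	}
	for _, urls := range peers {
		sort.Strings(urls) // sorts in place; the log above also lists URLs sorted
	}
	return peers, nil
}

func main() {
	// Same --initial-cluster value as in the reproduction attempt above.
	peers, err := parseInitialCluster("etcd=http://127.0.0.1:2380,etcd=http://10.7.29.60:2380")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(peers), peers["etcd"])
}
```

With grouping like this, a duplicated name yields one member with several peer URLs rather than a map that keeps only the last entry.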
I can't remember the exact reproduction now as it's been a while, but I believe the name flattening is still a problem both in v3.1.3 (where I had the actual production issue) and on master. I think this is not quite the same error I had but is symptomatic of the same issue (a notable difference to your repro attempt is that I only provide one peer for I think that the final error doesn't make sense, and I think this is because of the duplicate node names inadvertently missing peers on the list. Either renaming the duplicate node names (which was my production fix), or copying the peer list fully to the advertise-peer-urls variable (which you did in your repro) steps around the issue.
What error? The Is there actually a bug here or can this be closed?
OK, I see the bug with --initial-cluster; it should be giving
…se urls The old error was not clear about what URLs needed to be added, sometimes truncating the list. To make it clearer, print out the missing entries for --initial-cluster and print the full list of initial advertise peers. Fixes etcd-io#8079 and etcd-io#7927
Fixed by #8083, closing.
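The idea in the commit message above (name the entries missing from --initial-cluster and show the full list of initial advertise peer URLs) could look roughly like the sketch below. Both function names and the error wording are hypothetical and chosen for illustration; this is not the code from #8083.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// missingFromInitialCluster is a hypothetical helper (not etcd's code).
// It returns the advertised peer URLs that do not appear among the URLs
// listed for this member in --initial-cluster.
func missingFromInitialCluster(advertised, clusterURLs []string) []string {
	listed := make(map[string]bool, len(clusterURLs))
	for _, u := range clusterURLs {
		listed[u] = true
	}
	var missing []string
	for _, u := range advertised {
		if !listed[u] {
			missing = append(missing, u)
		}
	}
	sort.Strings(missing)
	return missing
}

// describeMismatch builds an error that names every missing entry and
// prints the full advertise list. The wording is illustrative only.
func describeMismatch(name string, advertised, clusterURLs []string) error {
	missing := missingFromInitialCluster(advertised, clusterURLs)
	if len(missing) == 0 {
		return nil
	}
	entries := make([]string, len(missing))
	for i, u := range missing {
		entries[i] = name + "=" + u
	}
	return fmt.Errorf("--initial-cluster is missing %s (initial advertise peer URLs: %s)",
		strings.Join(entries, ","), strings.Join(advertised, ","))
}

func main() {
	advertised := []string{"http://127.0.0.1:2380", "http://10.7.29.60:2380"}
	clusterURLs := []string{"http://127.0.0.1:2380"}
	fmt.Println(describeMismatch("etcd", advertised, clusterURLs))
}
```

Listing each missing name=url pair in full, rather than a truncated list, tells the user exactly what to add.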
In etcd 3.0 and 3.1, if a user provides ETCD_INITIAL_CLUSTER that has duplicate names, then etcd will happily accept this configuration, but will silently transform InitialPeerURLsMap into a map with a single entry of whatever the final item is, i.e. map[string]string{"etcd":"https://3.4.5.6:2380"}.
This leads to some very confusing error messages down the line when trying to join a cluster, because InitialPeerURLsMap ends up being something very different to what the user was expecting.
Probably etcd should reject such a configuration immediately, and perhaps in other locations, uniqueness should be enforced - as I had a perfectly functional cluster with duplicate names, until I had to replace a node.
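One way the rejection proposed above might look is a validation pass that runs before the cluster string is turned into a map. The sketch below is only an assumption about how such a check could be written: duplicateNames and validateInitialCluster are hypothetical helpers, and every address in the example other than https://3.4.5.6:2380 is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// duplicateNames is a hypothetical helper (not part of etcd). It returns,
// in sorted order, every member name that appears more than once in a
// "name=url,name=url" string. Malformed entries are not validated here.
func duplicateNames(initialCluster string) []string {
	counts := make(map[string]int)
	for _, pair := range strings.Split(initialCluster, ",") {
		name := strings.SplitN(pair, "=", 2)[0]
		counts[name]++
	}
	var dups []string
	for name, n := range counts {
		if n > 1 {
			dups = append(dups, name)
		}
	}
	sort.Strings(dups)
	return dups
}

// validateInitialCluster rejects the configuration up front instead of
// letting duplicate names surface later as a confusing join error.
func validateInitialCluster(initialCluster string) error {
	if dups := duplicateNames(initialCluster); len(dups) > 0 {
		return fmt.Errorf("duplicate member names in initial cluster: %s",
			strings.Join(dups, ","))
	}
	return nil
}

func main() {
	// https://1.2.3.4:2380 is an assumed address for the example.
	err := validateInitialCluster("etcd=https://1.2.3.4:2380,etcd=https://3.4.5.6:2380")
	fmt.Println(err)
}
```

Whether to reject duplicates or to treat them as several peer URLs for one member is a design decision for the maintainers; this sketch only covers the rejection option.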