[Bug]: Failed to add a new member node4 to a three-nodes xline cluster #661

Closed
1 task done
Phoenix500526 opened this issue Feb 27, 2024 · 0 comments · Fixed by #658

Description about the bug

A new node fails to start after being added to an Xline cluster with member add.

Following the instructions in quick_start/README.md, running member add against a three-node Xline cluster succeeds, but the new node then panics on startup.

Reproduction steps are as follows:

  1. Start a three-node cluster and an etcd client using scripts/quick_start.sh.
$ ./scripts/quick_start.sh 
 [INFO] stopping 
Error response from daemon: No such container: prometheus
 [INFO] stopped 
 [WARN] A Docker network named 'xline_net' is created for communication among various xline nodes. You can use the command 'docker network rm xline_net' to remove it after use. 
 [INFO] container starting 
ecdf4cce22a5ee00054802eb6478e07549f3dfb29497022930af01aa14ce2ce0
d378be09cdca037a6a5984bc411848e602c374e679bb9de1fbca0e8899ddda60
adceb1e6b9849e5a6f28e776abdf64d1dc38e4d3032b6f34d944c160a91a06ef
cf49683ddc0372fd8e9df331a479cce369d852aa161bd3e437fc7d980ef1ab9c
 [INFO] container started 
 [INFO] cluster starting 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node3 /usr/local/bin/xline     --name node3     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.5:2379     --peer-listen-urls=http://172.20.0.5:2380,http://172.20.0.5:2381     --client-advertise-urls=http://172.20.0.5:2379     --peer-advertise-urls=http://172.20.0.5:2380,http://172.20.0.5:2381 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node1 /usr/local/bin/xline     --name node1     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.3:2379     --peer-listen-urls=http://172.20.0.3:2380,http://172.20.0.3:2381     --client-advertise-urls=http://172.20.0.3:2379     --peer-advertise-urls=http://172.20.0.3:2380,http://172.20.0.3:2381 --is-leader 
 [INFO] command is: docker exec -e RUST_LOG=debug -d node2 /usr/local/bin/xline     --name node2     --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381     --storage-engine rocksdb     --data-dir /usr/local/xline/data-dir     --auth-public-key /mnt/public.pem     --auth-private-key /mnt/private.pem     --client-listen-urls=http://172.20.0.4:2379     --peer-listen-urls=http://172.20.0.4:2380,http://172.20.0.4:2381     --client-advertise-urls=http://172.20.0.4:2379     --peer-advertise-urls=http://172.20.0.4:2380,http://172.20.0.4:2381 
 [INFO] cluster started 
ebdba0fd66f42ac9910276ac45cfdc187f90fb6e32a51f6d2799186a4b9e4119
Prometheus starts on http://172.20.0.6:9090/graph and http://127.0.0.1:9090/graph
  2. Use etcdctl to execute the member add operation.
$ docker exec client /bin/sh -c "/usr/local/bin/etcdctl --endpoints=\"http://172.20.0.3:2379\" member add node4 --peer-urls=http://172.20.0.17:2380,http://172.20.0.17:2381"
Member 80fff8e371b58d12 added to cluster 425b5b944b259215

ETCD_NAME="node4"
ETCD_INITIAL_CLUSTER="node4=http://172.20.0.17:2380,node4=http://172.20.0.17:2381,node2=172.20.0.4:2380,node2=172.20.0.4:2381,node1=172.20.0.3:2380,node1=172.20.0.3:2381,node3=172.20.0.5:2380,node3=172.20.0.5:2381"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.20.0.17:2380,http://172.20.0.17:2381"
ETCD_INITIAL_CLUSTER_STATE="existing"
  3. Start a container for node4.
$ docker run -d -it --rm --name=node4 --net=xline_net --ip=172.20.0.17 --cap-add=NET_ADMIN --cpu-shares=1024 -m=512M -v ./scripts:/mnt ghcr.io/xline-kv/xline:latest bash
f4818b022e21351ddea240d1c974db056363f4bc9247c163c70531ecf284147e
  4. Start a new xline node inside the node4 container.
$ docker exec -it node4 /bin/bash
root@f4818b022e21:/# /usr/local/bin/xline --name node4 --members node1=172.20.0.3:2380,172.20.0.3:2381,node2=172.20.0.4:2380,172.20.0.4:2381,node3=172.20.0.5:2380,172.20.0.5:2381,node4=172.20.0.17:2380,172.20.0.17:2381 --storage-engine rocksdb --data-dir /usr/local/xline/data-dir --auth-public-key /mnt/public.pem --auth-private-key /mnt/private.pem --client-listen-urls=http://172.20.0.17:2379 --peer-listen-urls=http://172.20.0.17:2381,http://172.20.0.17:2380 --client-advertise-urls=http://172.20.0.17:2379 --peer-advertise-urls=http://172.20.0.17:2381,http://172.20.0.17:2380 --initial-cluster-state=existing
thread 'main' panicked at 'self_id should not be 0', /home/jiawei/Xline/crates/curp/src/members.rs:155:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In step 4, xline fails to start and panics with 'self_id should not be 0' at crates/curp/src/members.rs:155:9, as shown in the output above.
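
The --members value in step 4 has to be assembled by hand from the ETCD_INITIAL_CLUSTER line printed in step 2, which uses a different layout (one name=url pair per peer URL, with a scheme on the new member only). A minimal shell sketch of that conversion follows, together with a check that two peer URL lists contain the same entries. Both function names (to_members, same_urls) are hypothetical helpers and not part of Xline or etcdctl, and the grouped output format is inferred only from the commands in this report. The sketch is meant to make the reproduction less error-prone; it does not address the panic itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xline or etcdctl): convert an
# ETCD_INITIAL_CLUSTER value (name=url,name=url,...) into the grouped form
# passed to --members in this report (name=addr,addr,name2=addr,...).
to_members() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F= '
        {
            name = $1
            addr = substr($0, length($1) + 2)   # everything after the first "="
            sub(/^https?:\/\//, "", addr)        # drop the scheme, as in step 4
            if (!(name in addrs)) {
                order[++n] = name                # remember first-seen order
                addrs[name] = addr
            } else {
                addrs[name] = addrs[name] "," addr
            }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s%s=%s", (i > 1 ? "," : ""), order[i], addrs[order[i]]
            printf "\n"
        }'
}

# Hypothetical helper: succeed if two comma-separated URL lists contain the
# same entries, ignoring order.
same_urls() {
    a=$(printf '%s\n' "$1" | tr ',' '\n' | sort)
    b=$(printf '%s\n' "$2" | tr ',' '\n' | sort)
    [ "$a" = "$b" ]
}

# Usage with the values from step 2.
initial_cluster="node4=http://172.20.0.17:2380,node4=http://172.20.0.17:2381,node2=172.20.0.4:2380,node2=172.20.0.4:2381,node1=172.20.0.3:2380,node1=172.20.0.3:2381,node3=172.20.0.5:2380,node3=172.20.0.5:2381"
to_members "$initial_cluster"
```

The output keeps the member order printed by etcdctl (node4 first), while step 4 lists node1 first. Likewise, the peer URLs passed to member add in step 2 and to --peer-advertise-urls in step 4 are the same two URLs in a different order, which same_urls treats as equal. Whether xline is sensitive to either ordering is not established in this report.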

Version

0.6.1 (Default)

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@Phoenix500526 Phoenix500526 added the bug Something isn't working label Feb 27, 2024
@Phoenix500526 Phoenix500526 added this to the v0.7.0 milestone Feb 27, 2024
@Phoenix500526 Phoenix500526 self-assigned this Mar 11, 2024
@Phoenix500526 Phoenix500526 linked a pull request May 13, 2024 that will close this issue
Phoenix500526 added a commit to Phoenix500526/Xline that referenced this issue May 13, 2024
Closes: xline-kv#661
Signed-off-by: Phoeniix Zhao <Phoenix500526@163.com>
Phoenix500526 added a commit to Phoenix500526/Xline that referenced this issue May 14, 2024
Closes: xline-kv#661
Signed-off-by: Phoeniix Zhao <Phoenix500526@163.com>
Phoenix500526 added a commit to Phoenix500526/Xline that referenced this issue May 20, 2024
Closes: xline-kv#661
Signed-off-by: Phoeniix Zhao <Phoenix500526@163.com>
Phoenix500526 added a commit to Phoenix500526/Xline that referenced this issue May 22, 2024
Closes: xline-kv#661
Signed-off-by: Phoeniix Zhao <Phoenix500526@163.com>
@mergify mergify bot closed this as completed in #658 May 22, 2024
mergify bot pushed a commit that referenced this issue May 22, 2024
Closes: #661
Signed-off-by: Phoeniix Zhao <Phoenix500526@163.com>