With pika 3.3.6, after a master-slave switchover, `slaveof xx xx force` does not trigger data sync, and rsync is not invoked either [from QQ group] #1265

Closed
kernelai opened this issue Feb 14, 2023 · 9 comments

Comments

@kernelai
Collaborator

It works when the data set is small, but once the data grows to a few hundred GB the sync is no longer triggered.

@huadaonan

huadaonan commented Feb 14, 2023

slave:172.xxx.11.133:9224

  • pika.info:
    I0214 08:17:23.274706 186851 pika_server.cc:176] Using Networker Interface: eth0
    I0214 08:17:23.274785 186851 pika_server.cc:219] host: 172.xxx.11.133 port: 9224
    I0214 08:17:23.274811 186851 pika_server.cc:95] Worker queue limit is 725
    I0214 08:17:23.275035 186851 pika_binlog.cc:111] Binlog: Find the exist file.
    I0214 08:17:26.537572 186851 pika_partition.cc:87] db0 DB Success
    I0214 08:17:26.538264 186851 pika_server.cc:273] Pika Server going to start
    I0214 08:17:26.539250 186932 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (172.xxx.131.174:9224)
    I0214 08:17:26.545603 186860 pika_server.cc:618] Mark try connect finish
    I0214 08:17:26.545622 186860 pika_repl_client_conn.cc:146] Finish to handle meta sync response
    I0214 08:17:26.680630 186861 pika_repl_client_conn.cc:261] Partition: db0 Need To Try DBSync
    I0214 08:17:26.794219 186862 pika_repl_client_conn.cc:182] Partition: db0 Need Wait To Sync
    I0214 08:17:52.292177 186892 pika_rm.cc:90] Remove Slave Node, Partition: (db0:0), ip_port: 172.xxx.131.174:9224
    I0214 08:17:52.292219 186892 pika_server.cc:873] Remove Master Success, ip_port: 172.xxx.131.174:9224
    I0214 08:17:53.291977 186859 pika_repl_client_thread.cc:21] ReplClient Close conn, fd=71, ip_port=172.xxx.131.174:11224
    I0214 08:17:58.563925 186932 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (172.xxx.131.174:9224)
    I0214 08:17:58.567683 186863 pika_server.cc:618] Mark try connect finish
    I0214 08:17:58.567711 186863 pika_repl_client_conn.cc:146] Finish to handle meta sync response
    I0214 08:17:58.712963 186864 pika_repl_client_conn.cc:182] Partition: db0 Need Wait To Sync

  • rsync.log:
    2023/02/14 08:18:01 [187045] connect from (172.xxx.131.174)

master:172.xxx.131.174:9224
pika.WARNING
W0214 08:18:01.029573 150322 pika_server.cc:1124] Partition: db0 RSync send file failed! From: strings, To: db0/strings/, At: 172.xxx.11.133:10224, Error: -1

@kernelai
Collaborator Author

Can you check whether the rsync process on the slave still exists? Most likely the rsync pid file was not deleted after the original master went down.

@kernelai
Collaborator Author

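Manually kill the rsync process on the slave, and make sure the rsync pid file is deleted as well. The slave pika will then bring the rsync process back up automatically.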
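
A minimal sketch of the stale pid file check described above, to be run on the slave. The pid file location is not given in this thread, so the path passed in is an assumption you must supply yourself, and the helper names `rsync_pid_state` and `remove_if_stale` are hypothetical. The sketch assumes a POSIX system, where signal 0 probes a process without affecting it.

```python
import os
from pathlib import Path


def rsync_pid_state(pid_file):
    """Classify a pid file as 'missing', 'stale', or 'running'."""
    path = Path(pid_file)
    if not path.exists():
        return "missing"
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return "stale"  # content is not a number: treat the file as stale
    if pid <= 0:
        return "stale"  # not a valid single-process id
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except (ProcessLookupError, OverflowError):
        return "stale"  # no such process
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"


def remove_if_stale(pid_file):
    """Delete the pid file only when no live process owns it."""
    if rsync_pid_state(pid_file) == "stale":
        Path(pid_file).unlink()
        return True
    return False
```

If the state is `running`, the rsync process still has to be killed by hand before the pid file is removed, as described above. The sketch does not check that the pid actually belongs to rsync.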

@huadaonan

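Yes, that is exactly what I did. The rsync process was still there, so I killed it. I restarted the slave, rsync started as well, and rsync.log showed no errors. I then ran
slaveof no one
slaveof master_ip port force
and got the error shown above.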
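
For anyone scripting these two commands, here is a rough sketch that builds them in the Redis wire format (RESP), which Pika accepts. The helper names, and the host and port values you pass in, are hypothetical. `send_commands` is untested against a real Pika instance and does only a single `recv` per command.

```python
import socket


def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def resync_commands(master_ip, master_port):
    """The two-step sequence from this thread: detach, then force a full sync."""
    return [
        encode_command("slaveof", "no", "one"),
        encode_command("slaveof", master_ip, master_port, "force"),
    ]


def send_commands(slave_host, slave_port, commands, timeout=5.0):
    """Send each encoded command to the slave and collect the raw replies."""
    replies = []
    with socket.create_connection((slave_host, slave_port), timeout=timeout) as conn:
        for cmd in commands:
            conn.sendall(cmd)
            replies.append(conn.recv(4096))
    return replies
```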

@kernelai kernelai reopened this Feb 14, 2023
@kernelai
Collaborator Author

kernelai commented Feb 14, 2023

W0214 08:18:01.029573 150322 pika_server.cc:1124] Partition: db0 RSync send file failed! From: strings, To: db0/strings/, At: 172.xxx.11.133:10224, Error: -1
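At this point we can confirm the problem is in the rsync send stage. Do the rsync logs on the master and the slave show anything abnormal?
Is the master-slave password correct?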
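
To pull every such failure out of the master's pika.WARNING file when comparing it with the rsync logs, a small parser along these lines may help. The regular expression is written against the single log line quoted in this thread, so treat the format as an assumption. `parse_send_failures` is a hypothetical helper, not part of Pika.

```python
import re

# Pattern derived from the "RSync send file failed!" line quoted above.
SEND_FAILED = re.compile(
    r"Partition: (?P<partition>\S+) RSync send file failed! "
    r"From: (?P<src>[^,\s]+), To: (?P<dst>[^,\s]+), "
    r"At: (?P<host>[^:\s]+):(?P<port>\d+), Error: (?P<error>-?\d+)"
)


def parse_send_failures(lines):
    """Return one dict per rsync send failure found in the given log lines."""
    failures = []
    for line in lines:
        match = SEND_FAILED.search(line)
        if match:
            record = match.groupdict()
            record["port"] = int(record["port"])
            record["error"] = int(record["error"])
            failures.append(record)
    return failures
```

Each record carries the partition, the source and destination paths, the slave's rsync address, and the error code, so it can be lined up by time with the `connect from` entries in the slave's rsync.log.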

@huadaonan

Logs after the restart:
slave: pika.WARNING
I0214 08:34:23.471767 186899 pika_rm.cc:90] Remove Slave Node, Partition: (db0:0), ip_port: 172.xxx.31.174:9221
I0214 08:34:23.471808 186899 pika_server.cc:873] Remove Master Success, ip_port: 172.22.31.174:9221
I0214 08:34:23.599779 186859 pika_repl_client_thread.cc:21] ReplClient Close conn, fd=71, ip_port=172.xxx.31.174:11221
I0214 08:34:37.892572 186932 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (172.xxx.31.174:9221)
I0214 08:34:37.896386 186865 pika_server.cc:618] Mark try connect finish
I0214 08:34:37.896405 186865 pika_repl_client_conn.cc:146] Finish to handle meta sync response
I0214 08:34:38.034626 186866 pika_repl_client_conn.cc:261] Partition: db0 Need To Try DBSync
I0214 08:34:38.136291 186867 pika_repl_client_conn.cc:182] Partition: db0 Need Wait To Sync

rsync.log
2023/02/14 08:18:01 [187045] connect from ip-172-xxx-31-174.ap-southeast-1.compute.internal (172.xxx.31.174)
2023/02/14 08:34:43 [188551] connect from ip-172-xxx-31-174.ap-southeast-1.compute.internal (172.xxx.31.174)

master:
W0214 08:18:01.029573 150322 pika_server.cc:1124] Partition: db0 RSync send file failed! From: strings, To: db0/strings/, At: 172.xxx.1.133:10221, Error: -1
W0214 08:34:43.618261 150322 pika_server.cc:1124] Partition: db0 RSync send file failed! From: strings, To: db0/strings/, At: 172.xxx.1.133:10221, Error: -1

@huadaonan

No password is set on the master or the slave; rsync sync uses the one generated automatically by the system.

@AlexStocks
Contributor

The root cause has not been identified yet.

@luky116 luky116 closed this as completed May 19, 2023
@banlilin

This issue was closed without being resolved. What is the solution?
