[leo_manager] Inconsistent storage node status #315

Closed
mocchira opened this issue Feb 26, 2015 · 2 comments

Comments

@mocchira
Member

[E]     leofs12_0@192.168.100.12        2015-02-25 17:10:11.406757 +0900        1424851811      null:null       0       gen_server leo_manager_cluster_monitor terminated with reason: {timeout,{gen_server,call,[leo_redundant_manager,{update_member_by_node,'leofs14@192.168.100.14',1424851781405057,attached},30000]}} in gen_server:call/3 line 190
[E]     leofs12_0@192.168.100.12        2015-02-25 17:10:11.407673 +0900        1424851811      null:null       0       ["CRASH REPORT ",[80,114,111,99,101,115,115,32,"leo_manager_cluster_monitor",32,119,105,116,104,32,"0",32,110,101,105,103,104,98,111,117,114,115,32,"exited",32,119,105,116,104,32,114,101,97,115,111,110,58,32,[[123,["timeout",44,[123,["gen_server",44,"call",44,[91,["leo_redundant_manager",44,[123,["update_member_by_node",44,"'leofs14@192.168.100.14'",44,"1424851781405057",44,"attached"],125],44,"30000"],93]],125]],125]," in ",[["gen_server",58,"terminate",47,"7"],[32,108,105,110,101,32,"804"]]]]]
[E]     leofs12_0@192.168.100.12        2015-02-25 17:10:11.408403 +0900        1424851811      null:null       0       Supervisor leo_manager_sup had child leo_manager_cluster_monitor started with leo_manager_cluster_monitor:start_link() at <0.638.0> exit with reason {timeout,{gen_server,call,[leo_redundant_manager,{update_member_by_node,'leofs14@192.168.100.14',1424851781405057,attached},30000]}} in context child_terminated
[E]     leofs12_0@192.168.100.12        2015-02-25 17:10:27.962202 +0900        1424851827      leo_membership_cluster_local:notify_error_to_manager/3  432     {'leofs12_0@192.168.100.12',{badrpc,timeout}}

In this case, the storage node that couldn't update its status to attached is shown as attached, not as running, after the LeoFS cluster is started.

@mocchira
Member Author

There are two solutions.

  1. Retry update_member_by_node when the call times out
  2. Reduce the number of calls to leo_redundant_manager (a gen_server)

Applying both of them would be best.
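
A rough sketch of what the first option could look like, for discussion only. The module name, function name, retry count and back-off delay are all hypothetical and not taken from the LeoFS code base; only the gen_server:call/3 timeout shape comes from the log above.

```erlang
-module(retry_call_sketch).
-export([call_with_retry/4]).

%% Hypothetical helper (not part of LeoFS): wraps gen_server:call/3 and
%% retries when the call exits with a timeout. Retries is the number of
%% additional attempts allowed after the first one.
call_with_retry(Server, Request, Timeout, Retries)
  when is_integer(Retries), Retries >= 0 ->
    try
        gen_server:call(Server, Request, Timeout)
    catch
        exit:{timeout, _} when Retries > 0 ->
            %% Assumed fixed back-off of 500 ms before the next attempt
            timer:sleep(500),
            call_with_retry(Server, Request, Timeout, Retries - 1);
        exit:{timeout, _} ->
            %% Out of attempts: return an error instead of crashing the caller
            {error, timeout}
    end.
```

With something like this, the monitor could call `retry_call_sketch:call_with_retry(leo_redundant_manager, {update_member_by_node, Node, Clock, attached}, 30000, 2)` instead of exiting on the first timeout. Two caveats: the request has to be safe to apply more than once, because a timed-out call may still have been processed by the server, and on older OTP releases a late reply can still land in the caller's mailbox after the timeout.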

@mocchira
Member Author

The root cause of #317 could affect this issue.
