Fix `rpc.get_candidates` function #1575

Comments
The original problem was: currently there is no reliable way to know which config is applied on different replicas. All solutions boil down to increasing the chances of success, but there is no guarantee. @olegrok suggests using the topology from the applied configuration and specifying the uri manually. See #1588 for details. Also consider using retries, and be vigilant.
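A minimal retry sketch along those lines, assuming the public `cartridge.rpc` API; the helper name and its parameters are illustrative, not from the issue:

```lua
local fiber = require('fiber')
local rpc = require('cartridge.rpc')

-- Hypothetical helper: retry rpc.call a few times, since a candidate
-- may still be busy applying the config when the first attempt is made.
local function rpc_call_with_retries(role, fn_name, args, opts, attempts)
    local res, err
    for _ = 1, attempts or 3 do
        res, err = rpc.call(role, fn_name, args, opts)
        if err == nil then
            return res
        end
        fiber.sleep(0.1) -- back off before the next attempt
    end
    return nil, err
end
```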
What is the reason for that triage?
You can't rely on a result of `rpc.get_candidates`.
You're wrong! In our case, all instances are alive and in the same state when both functions are called. It's just that `get_candidates` does not perform a strict enough check. So stateful failover and `rpc` module calls will not work.
What kind of checks do you mean?
I don't remember where this check lives, but if you perform the same actions as in the issue, you will get an error message generated by it.
There is a race condition: we could commit the config locally while the apply is still in progress on some instance. Before this patch, the user got an unexpected "Role X unavailable" error from the instance where that role was expected. The solution is an optimistic approach: detect a config apply in progress and wait until it finishes. Closes #1575
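A sketch of that optimistic wait, assuming the `cartridge.confapplier` state machine; this illustrates the idea, not the literal patch:

```lua
local confapplier = require('cartridge.confapplier')

-- Before answering an rpc request, wait for a config apply in progress
-- to finish instead of immediately reporting "Role X unavailable".
-- wish_state blocks until the desired state is reached or the timeout
-- expires, and returns the state the instance actually ended up in.
local function wait_config_applied(timeout)
    local state = confapplier.wish_state('RolesConfigured', timeout or 5)
    return state == 'RolesConfigured'
end
```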
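The snippet the following comment refers to ("the code above") was not preserved here; a minimal sketch of what it could look like, assuming the public `cartridge.rpc` API and a role named `my_role`:

```lua
local log = require('log')
local rpc = require('cartridge.rpc')

-- List candidate instances for the role, then call a function on it.
-- In the failing scenario the first line logs ["localhost:11002"] while
-- the second one fails with: Role "my_role" unavailable.
log.info(rpc.get_candidates('my_role'))
local _, err = rpc.call('my_role', 'some_function')
if err ~= nil then
    log.error('%s', err)
end
```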
Sometimes the code above may log `["localhost:11002"]` and `RemoteCallError: "localhost:11002": Role "my_role" unavailable` simultaneously. It means that `rpc.get_candidates` returns instances that are not actually alive, i.e. nothing can be done on them. So it's incorrect behavior of `rpc.get_candidates`.

This seems to be due to the slow writing of the configuration backup to disk. As a result, all instances may be in the `RolesConfigured` status while the config has not yet begun to be applied on some of them (in the example it's `localhost:11002`).

To reproduce this error, you can:
- create 2 instances (with 11001 and 11002 ports);
- assign `my_role` only to the second instance;
- insert code at line 135 of the `cartridge/twophase.lua` file (see cartridge/cartridge/twophase.lua, lines 130 to 136 in e887629); a hypothetical delay sketch is shown below.
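The exact snippet from the issue was not preserved; a hypothetical delay along these lines (artificially slowing down the backup write) would widen the race window:

```lua
-- Hypothetical: simulate a slow configuration backup write so that the
-- instance reports RolesConfigured before the apply actually starts.
require('fiber').sleep(2)
```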