-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 4 commands for sentinel and update most test cases and json files #789
Add 4 commands for sentinel and update most test cases and json files #789
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #789 +/- ##
============================================
- Coverage 70.33% 70.30% -0.04%
============================================
Files 112 112
Lines 61489 61491 +2
============================================
- Hits 43249 43230 -19
- Misses 18240 18261 +21
|
142a1ad
to
737cd6a
Compare
src/sentinel.c
Outdated
@@ -4483,7 +4485,7 @@ void sentinelAskPrimaryStateToOtherSentinels(sentinelValkeyInstance *primary, in | |||
/* Ask */ | |||
ll2string(port, sizeof(port), primary->addr->port); | |||
retval = redisAsyncCommand( | |||
ri->link->cc, sentinelReceiveIsPrimaryDownReply, ri, "%s is-master-down-by-addr %s %s %llu %s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the old sentinel does not recognize the new command, but i suppose it is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, once we update this part, all sentinels should run on the same version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a rolling upgrade scenario for sentinels where the user runs mixed versions of sentinels temporarily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is. For example, this Helm chart https://github.com/bitnami/charts/tree/main/bitnami/valkey sets up e.g. a k8s StatefulSet with three pods running a sentinel and a valkey container each. During a RollingUpdate of the StatefulSet, there will be mixed versions of sentinel and valkey temporarily. The sentinels will still need to find a quorum and initiate failovers during this time (if the primary gets updated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense. @hwware I think we should leave this command as it is. Rolling upgrades can take time and we need to be resilient to any failures during this transitional period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense. @hwware I think we should leave this command as it is. Rolling upgrades can take time and we need to be resilient to any failures during this transitional period.
Agree with you. Because @gmbnomis provides me a real world case, let us just leave these commands as they are, util I find a solution to solve the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, sentinel accepts the new command variants, but it still uses the old ones internally? (which is "IS-MASTER-DOWN-BY-ADDR" only, if I get this correctly)
That looks reasonable. We could get rid of the old forms in some later release when we can be sure that all sentinel versions that are supported during an upgrade understand the new form.
Thank you for explicitly testing the mixed sentinel version upgrade scenario. Highly appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, sentinel accepts the new command variants, but it still uses the old ones internally? (which is "IS-MASTER-DOWN-BY-ADDR" only, if I get this correctly)
That looks reasonable. We could get rid of the old forms in some later release when we can be sure that all sentinel versions that are supported during an upgrade understand the new form.
Thank you for explicitly testing the mixed sentinel version upgrade scenario. Highly appreciated!
I keep the IS-MASTER-DOWN-BY-ADDR only internal, for user, we still encourage them to use new command format. Thanks
@@ -57,16 +57,16 @@ test "SENTINEL PENDING-SCRIPTS returns the information about pending scripts" { | |||
} | |||
|
|||
test "SENTINEL PRIMARIES returns a list of monitored primaries" { | |||
assert_match "*mymaster*" [S 0 SENTINEL MASTERS] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess we also need some dummy tests that can cover the old commands, otherwise the reply-schemas-validator CI will fail (and we also need it to make sure the old commands can work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check https://github.com/hwware/valkey/actions/runs/9962102363/job/27525129293, it looks like reply-schemas-validator CI can accept all changes, and it passes. But adding some tests to cover old commands is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have somewhat forgotten the logic, but it does exist, maybe it's just not as strict now
https://github.com/hwware/valkey/actions/runs/9962102363/job/27525129293#step:11:70
WARNING! The following commands were not hit at all:
acl|cat
acl|dryrun
acl|genpass
acl|getuser
acl|help
acl|list
acl|load
acl|log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is because you just trigger a sentinel-test, it does not run the other tests, so those commands are not hit
tests/sentinel/tests/00-base.tcl
Outdated
@@ -200,7 +200,7 @@ test "SENTINEL RESET can resets the primary" { | |||
for {set j 0} {$j < 10} {incr j} { | |||
assert_equal 1 [S 0 SENTINEL RESET mymaster] | |||
set res1 [llength [S 0 SENTINEL SENTINELS mymaster]] | |||
set res2 [llength [S 0 SENTINEL SLAVES mymaster]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets keep it, i believe i add this for the old name coverage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, I will update it and add the comment for old name coverage.
969ecb1
to
ef82e8e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall with just one pending question.
src/sentinel.c
Outdated
@@ -4483,7 +4485,7 @@ void sentinelAskPrimaryStateToOtherSentinels(sentinelValkeyInstance *primary, in | |||
/* Ask */ | |||
ll2string(port, sizeof(port), primary->addr->port); | |||
retval = redisAsyncCommand( | |||
ri->link->cc, sentinelReceiveIsPrimaryDownReply, ri, "%s is-master-down-by-addr %s %s %llu %s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a rolling upgrade scenario for sentinels where the user runs mixed versions of sentinels temporarily?
According to the latest modification, there is no rolling upgrade scenario because for the redisAsyncCommand in old version, it will send is-master-down-by-addr command, however for new version sentinel, it will send is-primary-down-by-addr. In the case, if one sentinel node just upgrades, and it is "lucky" it detects the primary down, then it will send command "SENTINEL IS-PRIMARY-DOWN-BY-ADDR", however, old version sentinel can not recognize this command. Failover will not happen.
|
@hwware Please see my comment #789 (comment), which is a "real world" example of a rolling upgrade scenario which relies on failover to happen. |
Thanks for your information. Let us just leave these commands as they are now. |
ef82e8e
to
211d133
Compare
Signed-off-by: hwware <wen.hui.ware@gmail.com>
Signed-off-by: hwware <wen.hui.ware@gmail.com>
Signed-off-by: hwware <wen.hui.ware@gmail.com>
Signed-off-by: hwware <wen.hui.ware@gmail.com>
Signed-off-by: hwware <wen.hui.ware@gmail.com>
ecd4b2b
to
87c9469
Compare
I assume the major decision is the deprecation and renaming, which I am OK with. So just indicating my +1 on that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks Wen!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure I understand correctly:
The sentinels still use the MASTER commands (GET-MASTER-ADDR-BY-NAME, etc.) when they talk to each other, so we can support a rolling upgrade of sentinels.
In the future (Valkey 9?) the sentinels can use the PRIMARY variant of the commands instead. We still can't delete the MASTER commands, because we still want to support rolling upgrade of Sentinel from 8 to 9. Even later (Valkey 10?) we can delete the MASTER variants.
Correct?
Yes, when the communication between 2 sentinel nodes, the MASTER command is used.
I will find a way to delete the master command ASAP, maybe in Valkey 8.2 version or Valkey 9. Valkey 10 is too long (I hope not).
|
I see no hurry in deleting the deprecated commands, as long as there is a plan. I think we need to keep them as long as we support rolling upgrades from 8.0. |
…valkey-io#789) Add 4 new commands for Sentinel (reference valkey-io#36) Sentinel GET-PRIMARY-ADDR-BY-NAME Sentinel PRIMARY Sentinel PRIMARIES Sentinel IS-PRIMARY-DOWN-BY-ADDR and deprecate 4 old commands: Sentinel GET-MASTER-ADDR-BY-NAME Sentinel MASTER Sentinel MASTERS Sentinel IS-MASTER-DOWN-BY-ADDR and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583 Note: 1. runtest-sentinel pass all test cases 2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel there are 4 steps in this test scenario: step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully. step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully We can see, even mixed version sentinel running, whole system still works. --------- Signed-off-by: hwware <wen.hui.ware@gmail.com> Signed-off-by: mwish <maplewish117@gmail.com>
…valkey-io#789) Add 4 new commands for Sentinel (reference valkey-io#36) Sentinel GET-PRIMARY-ADDR-BY-NAME Sentinel PRIMARY Sentinel PRIMARIES Sentinel IS-PRIMARY-DOWN-BY-ADDR and deprecate 4 old commands: Sentinel GET-MASTER-ADDR-BY-NAME Sentinel MASTER Sentinel MASTERS Sentinel IS-MASTER-DOWN-BY-ADDR and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583 Note: 1. runtest-sentinel pass all test cases 2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel there are 4 steps in this test scenario: step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully. step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully We can see, even mixed version sentinel running, whole system still works. --------- Signed-off-by: hwware <wen.hui.ware@gmail.com> Signed-off-by: mwish <maplewish117@gmail.com>
…#789) Add 4 new commands for Sentinel (reference #36) Sentinel GET-PRIMARY-ADDR-BY-NAME Sentinel PRIMARY Sentinel PRIMARIES Sentinel IS-PRIMARY-DOWN-BY-ADDR and deprecate 4 old commands: Sentinel GET-MASTER-ADDR-BY-NAME Sentinel MASTER Sentinel MASTERS Sentinel IS-MASTER-DOWN-BY-ADDR and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583 Note: 1. runtest-sentinel pass all test cases 2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel there are 4 steps in this test scenario: step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully. step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully We can see, even mixed version sentinel running, whole system still works. --------- Signed-off-by: hwware <wen.hui.ware@gmail.com>
…#789) Add 4 new commands for Sentinel (reference #36) Sentinel GET-PRIMARY-ADDR-BY-NAME Sentinel PRIMARY Sentinel PRIMARIES Sentinel IS-PRIMARY-DOWN-BY-ADDR and deprecate 4 old commands: Sentinel GET-MASTER-ADDR-BY-NAME Sentinel MASTER Sentinel MASTERS Sentinel IS-MASTER-DOWN-BY-ADDR and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583 Note: 1. runtest-sentinel pass all test cases 2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel there are 4 steps in this test scenario: step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully. step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully We can see, even mixed version sentinel running, whole system still works. --------- Signed-off-by: hwware <wen.hui.ware@gmail.com>
Add 4 new commands for Sentinel (reference #36)
Sentinel GET-PRIMARY-ADDR-BY-NAME
Sentinel PRIMARY
Sentinel PRIMARIES
Sentinel IS-PRIMARY-DOWN-BY-ADDR
and deprecate 4 old commands:
Sentinel GET-MASTER-ADDR-BY-NAME
Sentinel MASTER
Sentinel MASTERS
Sentinel IS-MASTER-DOWN-BY-ADDR
and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583
Note:
runtest-sentinel pass all test cases
I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel
there are 4 steps in this test scenario:
step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully.
step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully
step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully
step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully
We can see, even mixed version sentinel running, whole system still works.