Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SUNSUBSCRIBE corner case causes client out-of-sync #1066

Open
zuiderkwast opened this issue Sep 23, 2024 · 14 comments
Open

SUNSUBSCRIBE corner case causes client out-of-sync #1066

zuiderkwast opened this issue Sep 23, 2024 · 14 comments

Comments

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Sep 23, 2024

When a client is subscribed to a sharded-pubsub channel in cluster mode and the slot is moved to another shard, or deleted, the client receives a spontaneous sunsubscribe push message.

If a client has just sent a SUNSUBSCRIBE command, the client cannot know if the sunsubscribe message is a response to the command or a sponaneous message.

In the following scenario, client 1 believes that the [sunsubscribe, ch, 0] push message is received as a response to SUNSUBSCRIBE. The CLUSTERDOWN error reply remains to be read and it appears to be out of sync, i.e. to the client it doesn't appear to match a command it has sent. (If the client has sent another command in the pipeline, the CLUSTERDOWN appears to be its reply.)

Client 1                Client 2            Primary
  |                          |                 |
  |                          |  DELSLOT        |
  |                          |---------------->|
  | SUNSUBSCRIBE ch          |                 |
  |------------------------------------------->|
  |                          |      OK         |
  | [sunsubscribe, ch, 0]    |<----------------|
  |<-------------------------------------------|
  |                          |                 |
  | -CLUSTERDOWN             |                 |
  |<-------------------------------------------|
  |                          |                 |

Originally posted by @zuiderkwast in #759 (comment) (but edited)

Test case

This test case passes, i.e. it illustrates what the clients actually see.

diff --git a/tests/unit/cluster/pubsubshard-slot-migration.tcl b/tests/unit/cluster/pubsubshard-slot-migration.tcl
index c5a324f09..26d6afe56 100644
--- a/tests/unit/cluster/pubsubshard-slot-migration.tcl
+++ b/tests/unit/cluster/pubsubshard-slot-migration.tcl
@@ -187,6 +187,32 @@ test "Delete a slot, verify sunsubscribe message" {
     $subscribeclient close
 }
 
+test "Slot deleted and unsubscribed simulaneously" {
+    set channelname ch5
+    set slot [$cluster cluster keyslot $channelname]
+
+    array set primary_client [$cluster masternode_for_slot $slot]
+
+    set subscribeclient [valkey_deferring_client_by_addr $primary_client(host) $primary_client(port)]
+    $subscribeclient HELLO 3
+    $subscribeclient read
+    $subscribeclient SSUBSCRIBE $channelname
+    $subscribeclient read
+
+    # Delete a slot.
+    assert_equal OK [$primary_client(link) CLUSTER DELSLOTS $slot]
+
+    # Send in a pipeline SUNSUBSCRIBE and DBSIZE
+    $subscribeclient SUNSUBSCRIBE $channelname
+    $subscribeclient DBSIZE
+
+    # We expect one reply per command, plus an implicit sunsubscribed message.
+    assert_equal "sunsubscribe $channelname 0" [$subscribeclient read]
+    catch {$subscribeclient read} e
+    assert_equal "CLUSTERDOWN Hash slot not served" $e
+    assert_equal 0 [$subscribeclient read]
+}
+
 test "Reset cluster, verify sunsubscribe message" {
     set channelname ch4
     set slot [$cluster cluster keyslot $channelname]
@ranshid
Copy link
Member

ranshid commented Sep 24, 2024

Nice!

I guess this could also be solved when we introduce atomic slot migration right?

@zuiderkwast
Copy link
Contributor Author

I guess this could also be solved when we introduce atomic slot migration right?

No, how would that solve it? This is about the subscribed client getting a spontaneous message when the slot migration happens.

@ikolomi
Copy link

ikolomi commented Sep 24, 2024

Please note that although cumbersome, the resulting local subscription state could be tracked correctly - receiving SUNSUBSCRIBE notification (regardless if caused by the an active command or due to slot migration) allows client to know that it will no longer receive notifications on the channel in question.

If a client has just sent a SUNSUBSCRIBE command, the client cannot know if the sunsubscribe message is a response to the command or a sponaneous message.

Thus, the above does not really matter, although indeed complicates the unsubscription logic.

@zuiderkwast
Copy link
Contributor Author

Thus, the above does not really matter, although indeed complicates the unsubscription logic.

@ikolomi What do you mean it does not really matter? The -CLUSTERDOWN (or MOVED) is still in the reply pipeline. If the client sends a GET foo or any other command after the SUNSUBSCRIBE, the CLUSTERDOWN or MOVED will appear to be the reply to that GET command. Can you explain how a client can tell whether the error is for the SUNSUBSCRIBE or the GET?

@ranshid
Copy link
Member

ranshid commented Sep 25, 2024

I guess this could also be solved when we introduce atomic slot migration right?

No, how would that solve it? This is about the subscribed client getting a spontaneous message when the slot migration happens.

Oh I assumed that during atomic slot ownership the command will be blocked until we identify the ownership was migrated so MOVED response will be sent, but you are right that it can still be a problem if we allow unblocking and running the command so the client would get MOVED error which can still be identified as a response for another command.

@ranshid
Copy link
Member

ranshid commented Sep 25, 2024

I guess this could also be solved when we introduce atomic slot migration right?

No, how would that solve it? This is about the subscribed client getting a spontaneous message when the slot migration happens.

Oh I assumed that during atomic slot ownership the command will be blocked until we identify the ownership was migrated so MOVED response will be sent, but you are right that it can still be a problem if we allow unblocking and running the command so the client would get MOVED error which can still be identified as a response for another command.

BTW I wonder if preventing MOVED error and just reply [sunsubscribe, ch, 0] again would make things better? I mean there is no real point in redirecting unsubscribe messages to a different node IMO.

@zuiderkwast
Copy link
Contributor Author

Ran, we already get a MOVED today in this race condition. (My example above uses CLUSTERDOWN instead because I deleted the slot instead of migrated it, but it's the only difference.) The client just doesn't understand that this MOVED is the reply to the SUNSUBSCRIBE command it just sent, because there is a 'sunsubscribe' push message in the reply pipeline which arrives to the client first.

When the client is subscribed and doesn't do anything, when a slot is migrated we send an 'sunsubscribed' push notification to the client.

@ranshid
Copy link
Member

ranshid commented Sep 25, 2024

Ran, we already get a MOVED today in this race condition. (My example above uses CLUSTERDOWN instead because I deleted the slot instead of migrated it, but it's the only difference.) The client just doesn't understand that this MOVED is the reply to the SUNSUBSCRIBE command it just sent, because there is a 'sunsubscribe' push message in the reply pipeline which arrives to the client first.

When the client is subscribed and doesn't do anything, when a slot is migrated we send an 'sunsubscribed' push notification to the client.

Thank you @zuiderkwast, I understand the point. So the client would probably have a double reroute in the case he was pipelining more requests (like ssubscribe or other commands in case of resp3) right? Probably double moved will also cause the client to perform new topology refresh which in turn can lead to high load on the server if there are many subscribers on the same slot/s. I was only thinking if instead of MOVED we could just return that there are zero subscribers on the channel, but later I also thought it might still break some stateless clients. I currently have no good solution for that.

@zuiderkwast
Copy link
Contributor Author

Load on the server is not the issue. The client getting out of sync and returning the wrong replies back to the user application is.

The best way to avoid this is probably to just never send UNSUBSCRIBE commands. That's what GLIDE does.

It seems a bit broken though and it'd be nicer if the behaviour were different, like if every command would always get one in-band reply (in RESP3). That'd have to be client opt-in though. If we do this, then we can send +OK as a reply to the subscribe/unsubscribe commands when successful and the push replies can be handled completely out-of-band as they were intended to.

@zuiderkwast
Copy link
Contributor Author

To clarify client getting out of sync: An async client sends a constant stream of commands, receives the stream of repsonseses and needs to match each response with the command it corresponds to, to be able to invoke the right callback or otherwise match it with the code that sent each command. That's what GLIDE does (iiuc) and that's what hiredis does in the async API.

If a client sends GET foo; UNSUBSCRIBE bar; DBSIZE; PING; SET x y and the server responds with "bar"; ("unsubscribed", "bar", 0); -MOVED 1234 xx:6379; 72329; "PONG"; "OK", the client has no way to see that 72329 is the reply to DBSIZE and "PONG" is the reply to PING. It gets out of sync and returns "MOVED" to DBSIZE, 72329 to PING, "PONG" to SET..

@ikolomi
Copy link

ikolomi commented Sep 25, 2024

@zuiderkwast

Thus, the above does not really matter, although indeed complicates the unsubscription logic.

@ikolomi What do you mean it does not really matter? The -CLUSTERDOWN (or MOVED) is still in the reply pipeline. If the client sends a GET foo or any other command after the SUNSUBSCRIBE, the CLUSTERDOWN or MOVED will appear to be the reply to that GET command. Can you explain how a client can tell whether the error is for the SUNSUBSCRIBE or the GET?

I mean that the current subscription state could still be correctly constructed on the client side.
Now, regarding the ordering of the commands - Yes, it seems to me that the protocol is broken because the unsubscribe commands return either error or cause a RESP notification. Since the notifications are server originated and asynchronous, it might be impossible to associate the unsubscribe command with the unsubscribe notifications.
One way to deal with it is by the following logic:

  • drain the reply pipeline
  • send the unsubscribe command
  • wait for an error reply, timeout or unsubscribe notification
  • if timed out assume the command succeeded but no channels unsubscribed -> assume client is not subscribed on the channel
  • if error reply, associate the error with the unsubscribe command -> be on the safe side, assume client subscribed and tear down the connection to ensure unsubscription.
  • if unsubscribe notification -> assume client is not subscribed on the channel

The problematic parts here are

  1. How long to wait for the notification before assuming the command succeeded and the next reply will not be for this command (As in your example)
  2. Between the draining and the sending of the unsubscribe command, the server might send an unsubscribe notification, breaking the above logic (as the notification will incorrectly indicate a successful command)

Due to these complications, we dont use unsubscription commands in GLIDE, relying on the transport tear down instead.

@zuiderkwast
Copy link
Contributor Author

if timed out assume the command succeeded but no channels unsubscribed -> assume client is not subscribed on the channel

If no channels are subscribed, the server still sends one notification on the form (unsubscribe, ch, 0) to an UNSUBSCRIBE command (or, since valkey 8.0.0, a -NOSUB error), so I don't think timeout can happen.

  1. Between the draining and the sending of the unsubscribe command, the server might send an unsubscribe notification, breaking the above logic

Yes, this part can still happen even with draining.

@ranshid
Copy link
Member

ranshid commented Sep 26, 2024

Load on the server is not the issue. The client getting out of sync and returning the wrong replies back to the user application is.

The best way to avoid this is probably to just never send UNSUBSCRIBE commands. That's what GLIDE does.

It seems a bit broken though and it'd be nicer if the behaviour were different, like if every command would always get one in-band reply (in RESP3). That'd have to be client opt-in though. If we do this, then we can send +OK as a reply to the subscribe/unsubscribe commands when successful and the push replies can be handled completely out-of-band as they were intended to.

@zuiderkwast I Understand the problem you refer to and thank you for the patience putting up with my questions :)
The way Glide solves the issue is by totally disallowing users from partial unsubscription on a client so they have to teardown the client (@ikolomi correct me if I am wrong). this is somewhat suboptimal IMO but maybe there aren't many such usecases. One I can think of is when the user places temporal keyspace notifications on specific keys (during some sort of client side caching) and removing them after the notification arrives.
I like the suggestion you refer to in resp3 to return the response in-band and the notifications in push. we can even provide an option to disconnect the resp2 subscribers for the channel when a slot is migrated. Do you think it is reasonable to extend the CLIENT CAPA to opt-in to this behavior? (maybe as an addition to #962)?

@zuiderkwast
Copy link
Contributor Author

zuiderkwast commented Sep 26, 2024

(maybe as an addition to #962)?

Instead of #962 IMO. CLIENT CAPA pubsub-ok. 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants