RDMA: Support user keepalive command (#916)
If the client side crashes or exits normally, the client kernel tries to disconnect its RDMA QPs. The server-side kernel then receives the CM packets, and valkey-server handles the CM disconnected event and closes the connection.

However, the RDMA transport layer lacks a keepalive mechanism. If the client-side kernel itself crashes, the server side is never notified. To avoid this, valkey-server sends a Keepalive command periodically to detect dead QPs.

An example on mlx-cx5:

```
# RDMA: CQ handle error status: transport retry counter exceeded[0xc], opcode : 0x0
# RDMA: CQ handle error status: transport retry counter exceeded[0xc], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
# RDMA: CQ handle error status: Work Request Flushed Error[0x5], opcode : 0x0
```

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
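The detection logic described in this message can be sketched roughly as follows. This is a simplified illustration, not the code from this commit: `conn_state`, `keepalive_due`, and `on_completion` are hypothetical names, the interval is an arbitrary parameter, and the sketch deliberately avoids any real libibverbs calls. The only values taken from a real API are the two status codes, which match `ibv_wc_status` in libibverbs and the `0xc` / `0x5` codes in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state; not valkey's actual structure. */
typedef struct {
    uint64_t last_keepalive_ms; /* time the last keepalive was posted */
    bool dead;                  /* set once a completion error is seen */
} conn_state;

/* Numeric values match ibv_wc_status for the errors shown in the log. */
enum {
    WC_SUCCESS       = 0x0,
    WC_WR_FLUSH_ERR  = 0x5, /* "Work Request Flushed Error" */
    WC_RETRY_EXC_ERR = 0xc  /* "transport retry counter exceeded" */
};

/* Return true if a keepalive should be posted now.
 * Assumes now_ms comes from a monotonic clock. */
static bool keepalive_due(const conn_state *c, uint64_t now_ms,
                          uint64_t interval_ms) {
    assert(c != NULL);
    return !c->dead && (now_ms - c->last_keepalive_ms) >= interval_ms;
}

/* Record that a keepalive was posted at now_ms. */
static void keepalive_posted(conn_state *c, uint64_t now_ms) {
    assert(c != NULL);
    c->last_keepalive_ms = now_ms;
}

/* Handle the completion status of a posted keepalive: any non-success
 * status means the peer QP is unreachable, so mark the connection dead. */
static void on_completion(conn_state *c, int wc_status) {
    assert(c != NULL);
    if (wc_status != WC_SUCCESS) {
        c->dead = true;
    }
}
```

In this sketch, a periodic timer would call `keepalive_due`, post the keepalive, and later feed the completion status to `on_completion`. When the peer kernel is gone, the first failure would be expected to surface as a retry-exceeded status, after which the remaining outstanding work requests are flushed, which is consistent with the order of errors in the log.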
Showing 1 changed file with 38 additions and 9 deletions.