Skip to content

Commit

Permalink
nvme-rdma: unquiesce admin_q before destroy it
Browse files Browse the repository at this point in the history
Kernel will hang on destroy admin_q while we create ctrl failed, such
as following calltrace:

PID: 23644    TASK: ff2d52b40f439fc0  CPU: 2    COMMAND: "nvme"
 #0 [ff61d23de260fb78] __schedule at ffffffff8323bc15
 #1 [ff61d23de260fc08] schedule at ffffffff8323c014
 #2 [ff61d23de260fc28] blk_mq_freeze_queue_wait at ffffffff82a3dba1
 #3 [ff61d23de260fc78] blk_freeze_queue at ffffffff82a4113a
 #4 [ff61d23de260fc90] blk_cleanup_queue at ffffffff82a33006
 thesofproject#5 [ff61d23de260fcb0] nvme_rdma_destroy_admin_queue at ffffffffc12686ce
 thesofproject#6 [ff61d23de260fcc8] nvme_rdma_setup_ctrl at ffffffffc1268ced
 thesofproject#7 [ff61d23de260fd28] nvme_rdma_create_ctrl at ffffffffc126919b
 thesofproject#8 [ff61d23de260fd68] nvmf_dev_write at ffffffffc024f362
 thesofproject#9 [ff61d23de260fe38] vfs_write at ffffffff827d5f25
    RIP: 00007fda7891d574  RSP: 00007ffe2ef06958  RFLAGS: 00000202
    RAX: ffffffffffffffda  RBX: 000055e8122a4d90  RCX: 00007fda7891d574
    RDX: 000000000000012b  RSI: 000055e8122a4d90  RDI: 0000000000000004
    RBP: 00007ffe2ef079c0   R8: 000000000000012b   R9: 000055e8122a4d90
    R10: 0000000000000000  R11: 0000000000000202  R12: 0000000000000004
    R13: 000055e8122923c0  R14: 000000000000012b  R15: 00007fda78a54500
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

This due to we have quiesced admi_q before cancel requests, but forgot
to unquiesce before destroy it, as a result we fail to drain the
pending requests, and hang on blk_mq_freeze_queue_wait() forever. Here
try to reuse nvme_rdma_teardown_admin_queue() to fix this issue and
simplify the code.

Fixes: 958dc1d ("nvme-rdma: add clean action for failed reconnection")
Reported-by: Yingfu.zhou <yingfu.zhou@shopee.com>
Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com>
Signed-off-by: Yue.zhao <yue.zhao@shopee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
  • Loading branch information
Chunguang.xu authored and keithbusch committed Dec 4, 2024
1 parent fec55c2 commit 5858b68
Showing 1 changed file with 1 addition and 7 deletions.
8 changes: 1 addition & 7 deletions drivers/nvme/host/rdma.c
Original file line number Diff line number Diff line change
Expand Up @@ -1091,13 +1091,7 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
}
destroy_admin:
nvme_stop_keep_alive(&ctrl->ctrl);
nvme_quiesce_admin_queue(&ctrl->ctrl);
blk_sync_queue(ctrl->ctrl.admin_q);
nvme_rdma_stop_queue(&ctrl->queues[0]);
nvme_cancel_admin_tagset(&ctrl->ctrl);
if (new)
nvme_remove_admin_tag_set(&ctrl->ctrl);
nvme_rdma_destroy_admin_queue(ctrl);
nvme_rdma_teardown_admin_queue(ctrl, new);
return ret;
}

Expand Down

0 comments on commit 5858b68

Please sign in to comment.