
rdma: Add support for running the rdma tests over hardware #86

Open
wants to merge 1 commit into master
Conversation

@Kamalheib (Author)

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>

@kawasaki (Collaborator) left a comment

I don't have RDMA hardware, but this change looks valuable to me. I suggest adding some more description to the commit message about why this change is good. Also, a note about use_hw_rdma in Documentation/running-tests.md would be helpful.
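As a hedged sketch of how that could look in practice, the new variable could also be recorded in the optional blktests config file at the top of the repository instead of being passed on every invocation (the variable name use_hw_rdma comes from this PR; the file contents are illustrative only):

    # Illustrative config file entries; adjust to the final documentation.
    cat > config << 'EOF'
    use_hw_rdma=1
    nvme_trtype=rdma
    EOF
    ./check nvme/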

tests/nvme/rc — review comment (outdated, resolved)
tests/nvmeof-mp/rc — review comment (outdated, resolved)
@kawasaki (Collaborator)

Overall, this change looks like a good improvement to me. A richer commit message and an explanation in Documentation/running-tests.md will be required.

@bvanassche @sagigrimberg @ChaitanayaKulkarni
FYI, if you have any comment on this PR, please share with @Kamalheib.

@kawasaki (Collaborator) left a comment

As I commented, a richer commit message and an explanation in Documentation/running-tests.md would be helpful. Other than that, this change looks good to me. Other NVMe experts may have more comments.

@kawasaki (Collaborator)

Hi @Kamalheib, thanks for updating the patch.

Could you share the results of test runs with this change and real RDMA hardware? If blktests runs with real RDMA hardware, that is great :) Do you see any failures? As for the nvme test group, I'm curious which transport you use, and whether you set def_traddr or not (Sagi mentioned that the default address is local, so I'm not sure whether the local address really works with real hardware.)

Also, I think the commit needs a couple of brush-ups: the RDMA hardware configuration check that Sagi mentioned, and the commit message. I may be able to help with them.

@Kamalheib (Author)

Hi @Kamalheib, thanks for updating the patch.

Could you share the results of test runs with this change and real RDMA hardware? If blktests runs with real RDMA hardware, that is great :) Do you see any failures? As for the nvme test group, I'm curious which transport you use, and whether you set def_traddr or not (Sagi mentioned that the default address is local, so I'm not sure whether the local address really works with real hardware.)

Also, I think the commit needs a couple of brush-ups: the RDMA hardware configuration check that Sagi mentioned, and the commit message. I may be able to help with them.

Hi @kawasaki,

For proof of concept, I tested SRP only :-), but after I saw that you are interested in nvme, I tested nvme too. I was using a ConnectX-6 Dx in RoCE mode, and I saw some failures with both test groups (please see below). With regard to the commit brush-up, please let me know what needs to be changed and I will work on fixing it.

# lspci -d 15b3:
84:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
84:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
# use_hw_rdma=1 ./check srp/
srp/001 (Create and remove LUNs)                             [passed]
    runtime  1.959s  ...  1.979s
srp/002 (File I/O on top of multipath concurrently with logout and login (mq)) [passed]
    runtime  32.845s  ...  34.718s
srp/003 (File I/O on top of multipath concurrently with logout and login (sq)) [not run]
    legacy device mapper support is missing
srp/004 (File I/O on top of multipath concurrently with logout and login (sq-on-mq)) [not run]
    legacy device mapper support is missing
srp/005 (Direct I/O with large transfer sizes, cmd_sg_entries=255 and bs=4M) [passed]
    runtime  12.590s  ...  12.578s
srp/006 (Direct I/O with large transfer sizes, cmd_sg_entries=255 and bs=8M) [passed]
    runtime  12.555s  ...  12.558s
srp/007 (Direct I/O with large transfer sizes, cmd_sg_entries=1 and bs=4M) [passed]
    runtime  12.548s  ...  12.633s
srp/008 (Direct I/O with large transfer sizes, cmd_sg_entries=1 and bs=8M) [passed]
    runtime  12.638s  ...  12.656s
srp/009 (Buffered I/O with large transfer sizes, cmd_sg_entries=255 and bs=4M) [passed]
    runtime  12.588s  ...  12.600s
srp/010 (Buffered I/O with large transfer sizes, cmd_sg_entries=255 and bs=8M) [passed]
    runtime  12.718s  ...  12.671s
srp/011 (Block I/O on top of multipath concurrently with logout and login) [passed]
    runtime  32.888s  ...  32.891s
srp/012 (dm-mpath on top of multiple I/O schedulers)         [passed]
    runtime  5.608s  ...  5.654s
srp/013 (Direct I/O using a discontiguous buffer)            [passed]
    runtime  3.271s  ...  3.271s
srp/014 (Run sg_reset while I/O is ongoing)                  [passed]
    runtime  32.764s  ...  32.810s
srp/016 (RDMA hot-unplug)                                    [failed]
    runtime  2.596s  ...  2.183s
    --- tests/srp/016.out	2023-10-19 12:14:59.577898459 -0400
    +++ /home/blktests/results/nodev/srp/016.out.bad	2023-10-20 01:28:01.491595663 -0400
    @@ -1,2 +1,4 @@
     Configured SRP target driver
    +error: Invalid argument
    +error: Invalid argument
     Passed
# use_hw_rdma=1 nvme_trtype=rdma ./check nvme/
nvme/002 (create many subsystems and test discovery)         [not run]
    nvme_trtype=rdma is not supported in this test
nvme/003 (test if we're sending keep-alives to a discovery controller) [passed]
    runtime    ...  11.578s
nvme/004 (test nvme and nvmet UUID NS descriptors)           [passed]
    runtime    ...  7.417s
nvme/005 (reset local loopback target)                       [passed]
    runtime    ...  14.710s
nvme/006 (create an NVMeOF target with a block device-backed ns) [passed]
    runtime    ...  0.140s
nvme/007 (create an NVMeOF target with a file-backed ns)     [passed]
    runtime    ...  0.124s
nvme/008 (create an NVMeOF host with a block device-backed ns) [passed]
    runtime    ...  7.487s
nvme/009 (create an NVMeOF host with a file-backed ns)       [passed]
    runtime    ...  7.468s
nvme/010 (run data verification fio job on NVMeOF block device-backed ns) [passed]
    runtime    ...  306.410s
nvme/011 (run data verification fio job on NVMeOF file-backed ns) [passed]
    runtime    ...  854.449s
nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns) [passed]
    runtime    ...  313.464s
nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed ns) [passed]
    runtime    ...  772.565s
nvme/014 (flush a NVMeOF block device-backed ns)             [failed]
    runtime    ...  86.676s
    --- tests/nvme/014.out	2023-10-19 12:12:09.530592218 -0400
    +++ /home/blktests/results/nodev/nvme/014.out.bad	2023-10-20 02:18:42.715373001 -0400
    @@ -1,6 +1,6 @@
     Running nvme/014
     91fdba0d-f87b-4c25-b80f-db7be1418b9e
     uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
    -NVMe Flush: success
    +flush: Interrupted system call
     NQN:blktests-subsystem-1 disconnected 1 controller(s)
     Test complete
nvme/015 (unit test for NVMe flush for file backed ns)       [passed]
    runtime    ...  130.308s
nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery) [not run]
    nvme_trtype=rdma is not supported in this test
nvme/017 (create/delete many file-ns and test discovery)     [not run]
    nvme_trtype=rdma is not supported in this test
nvme/018 (unit test NVMe-oF out of range access on a file backend) [passed]
    runtime    ...  7.375s
nvme/019 (test NVMe DSM Discard command on NVMeOF block-device ns) [passed]
    runtime    ...  7.518s
nvme/020 (test NVMe DSM Discard command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.411s
nvme/021 (test NVMe list command on NVMeOF file-backed ns)   [passed]
    runtime    ...  7.507s
nvme/022 (test NVMe reset command on NVMeOF file-backed ns)  [passed]
    runtime    ...  14.762s
nvme/023 (test NVMe smart-log command on NVMeOF block-device ns) [passed]
    runtime    ...  7.511s
nvme/024 (test NVMe smart-log command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.505s
nvme/025 (test NVMe effects-log command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.557s
nvme/026 (test NVMe ns-descs command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.505s
nvme/027 (test NVMe ns-rescan command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.429s
nvme/028 (test NVMe list-subsys command on NVMeOF file-backed ns) [passed]
    runtime    ...  7.481s
nvme/029 (test userspace IO via nvme-cli read/write interface) [passed]
    runtime    ...  7.769s
nvme/030 (ensure the discovery generation counter is updated appropriately) [passed]
    runtime    ...  0.537s
nvme/031 (test deletion of NVMeOF controllers immediately after setup) [passed]
    runtime    ...  73.713s
nvme/038 (test deletion of NVMeOF subsystem without enabling) [passed]
    runtime    ...  0.047s
nvme/040 (test nvme fabrics controller reset/disconnect operation during I/O) [passed]
    runtime    ...  20.732s
nvme/041 (Create authenticated connections)                  [passed]
    runtime    ...  9.149s
nvme/042 (Test dhchap key types for authenticated connections) [passed]
    runtime    ...  55.175s
nvme/043 (Test hash and DH group variations for authenticated connections) [passed]
    runtime    ...  102.770s
nvme/044 (Test bi-directional authentication)                [passed]
    runtime    ...  18.596s
nvme/045 (Test re-authentication)                            [passed]
    runtime    ...  10.765s
nvme/047 (test different queue types for fabric transports)  [passed]
    runtime    ...  16.039s
nvme/048 (Test queue count changes on reconnect)             [failed]
    runtime    ...  14.379s
    --- tests/nvme/048.out	2023-10-19 12:14:59.575898491 -0400
    +++ /home/blktests/results/nodev/nvme/048.out.bad	2023-10-20 02:28:03.588253065 -0400
    @@ -1,3 +1,11 @@
     Running nvme/048
    -NQN:blktests-subsystem-1 disconnected 1 controller(s)
    +grep: /sys/class/nvme-fabrics/ctl//state: No such file or directory
    +grep: /sys/class/nvme-fabrics/ctl//state: No such file or directory
    +grep: /sys/class/nvme-fabrics/ctl//state: No such file or directory
    +grep: /sys/class/nvme-fabrics/ctl//state: No such file or directory
    +grep: /sys/class/nvme-fabrics/ctl//state: No such file or directory
    ...
    (Run 'diff -u tests/nvme/048.out /home/blktests/results/nodev/nvme/048.out.bad' to see the entire diff)

@kawasaki (Collaborator)

For proof of concept, I tested SRP only :-), but after I saw that you are interested in nvme, I tested nvme too. I was using a ConnectX-6 Dx in RoCE mode, and I saw some failures with both test groups (please see below)

Thanks! The test looks like it is working :) It is good to see the failures; sooner or later, they must be addressed. If they are kernel bugs, it's great that we catch them. Or they might be test-side bugs. I think this PR can be applied to blktests before resolving the failures. Before that, let's do some more brushing up.

@kawasaki (Collaborator) commented Oct 23, 2023

With regard to the commit brush-up, please let me know what needs to be changed and I will work on fixing it.

Here I note the improvement points I found:

  1. When hardware RDMA is not available and use_hw_rdma is set, some tests fail (of course!), e.g. nvme/003, and the failure message does not say that a hardware RDMA set-up is required. This is confusing. I think we can add a check for the exit status of the "nvme connect" command in _nvme_connect_subsys() to suggest that blktests users check their RDMA hardware set-up:

       if ! nvme connect "${ARGS[@]}" 2> /dev/null; then
               if [[ -n "$use_hw_rdma" ]]; then
                       echo "Check RDMA hardware set up: use_hw_rdma is enabled and 'nvme connect' failed."
               fi
       fi

  2. Sagi mentioned that "btw currently the code falls-back to lo address, which is not something that will ever be set on an rdma nic...". Reading this, I guessed that the nvme-rdma tests would not work with real hardware when def_traddr is not set properly. However, your test runs are working without it, so now I guess it is fine to use the def_traddr 127.0.0.1 to test with real hardware. Could you share the kernel message log during an nvme/003 test case run? I expect the log will show me that the real RDMA hardware works with def_traddr. If so, there is no need to address Sagi's comment.

  3. Documentation/running-tests.md needs to describe the new config option, like:

These tests will use the siw (soft-iWARP) driver by default. The rdma_rxe
(soft-RoCE) driver and hardware RDMA drivers are also supported.
...
To use hardware RDMA drivers, set up the hardware RDMA beforehand:
use_hw_rdma=1 nvme_trtype=rdma ./check nvme/
use_hw_rdma=1 ./check srp/
The variables use_rxe and use_hw_rdma must not be enabled at the same time.

Also, it would be better to check that use_rxe and use_hw_rdma are not both set. Probably in start_rdma()? (See the sketch below.)

  4. The commit message will need some more description, like:

It is desired to run the nvme-rdma and SRP tests with real RDMA hardware so that
we can confirm there are no issues in a more realistic RDMA system set-up. Add
the config variable "use_hw_rdma" to support it.

I hope these help. If any action from me is needed, please let me know :)
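A minimal sketch of the check suggested in point 3 above, assuming it lives in the start_rdma() helper mentioned there; the surrounding function body is hypothetical, and only the two variables come from this discussion:

    start_rdma() {
            # Hypothetical guard: refuse to run when both soft-RDMA and
            # hardware RDMA are requested at the same time.
            if [[ -n "$use_rxe" && -n "$use_hw_rdma" ]]; then
                    echo "use_rxe and use_hw_rdma must not both be set" >&2
                    return 1
            fi
            # ... existing soft-RDMA or hardware set-up would continue here ...
    }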

blktests uses soft-RoCE (rdma_rxe) and soft-iWARP (siw) to run the
RDMA-related tests. This change adds support for running the nvme-rdma and
SRP tests with real RDMA hardware, which is needed to make sure that we
don't have issues when using real RDMA hardware.

Signed-off-by: Kamal Heib <kheib@redhat.com>
@Kamalheib (Author)

Hi @kawasaki,

With regard to the commit brush-up, please let me know what needs to be changed and I will work on fixing it.

Here I note the improvement points I found:

1. When hardware RDMA is not available and use_hw_rdma is set, some tests fail (of course!), e.g. nvme/003, and the failure message does not say that a hardware RDMA set-up is required. This is confusing. I think we can add a check for the exit status of the "nvme connect" command in _nvme_connect_subsys() to suggest that blktests users check their RDMA hardware set-up:

       if ! nvme connect "${ARGS[@]}" 2> /dev/null; then
               if [[ -n "$use_hw_rdma" ]]; then
                       echo "Check RDMA hardware set up: use_hw_rdma is enabled and 'nvme connect' failed."
               fi
       fi

Done, please review the changes.

2. Sagi mentioned that "btw currently the code falls-back to lo address, which is not something that will ever be set on an rdma nic...". Reading this, I guessed that the nvme-rdma tests would not work with real hardware when def_traddr is not set properly. However, your test runs are working without it, so now I guess it is fine to use the def_traddr 127.0.0.1 to test with real hardware. Could you share the kernel message log during an nvme/003 test case run? I expect the log will show me that the real RDMA hardware works with def_traddr. If so, there is no need to address Sagi's comment.

It seems like the tests are using the interface IP address "1.1.1.5" for loopback:

[339197.231226] run blktests nvme/003 at 2023-10-23 10:21:28
[339197.286830] loop0: detected capacity change from 0 to 2097152
[339197.299931] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[339197.315478] nvmet_rdma: enabling port 0 (1.1.1.5:4420)
[339197.391014] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[339197.407846] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 1.1.1.5:4420
[339208.637374] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

3. Documentation/running-tests.md needs to describe the new config option, like:

These tests will use the siw (soft-iWARP) driver by default. The rdma_rxe
(soft-RoCE) driver and hardware RDMA drivers are also supported.
...
To use hardware RDMA drivers, set up the hardware RDMA beforehand:
use_hw_rdma=1 nvme_trtype=rdma ./check nvme/
use_hw_rdma=1 ./check srp/
The variables use_rxe and use_hw_rdma must not be enabled at the same time.

Also, it would be better to check that use_rxe and use_hw_rdma are not both set. Probably in start_rdma()?

Done, please review the changes.

4. The commit message will need some more description, like:

It is desired to run the nvme-rdma and SRP tests with real RDMA hardware so that
we can confirm there are no issues in a more realistic RDMA system set-up. Add
the config variable "use_hw_rdma" to support it.

Done, please review the changes.

I hope these help. If any action from me is needed, please let me know :)

Thank you very much for your feedback.

@Kamalheib requested a review from @kawasaki on October 23, 2023 at 14:29
@kawasaki (Collaborator)

It seems like the tests are using the interface IP address "1.1.1.5" for loopback:

[339197.231226] run blktests nvme/003 at 2023-10-23 10:21:28
[339197.286830] loop0: detected capacity change from 0 to 2097152
[339197.299931] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[339197.315478] nvmet_rdma: enabling port 0 (1.1.1.5:4420)
[339197.391014] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[339197.407846] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 1.1.1.5:4420
[339208.637374] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

Thanks for sharing the kernel messages. I wonder where the address 1.1.1.5 comes from. Is it set up for the real hardware RDMA driver? In _create_nvmet_port, the def_traddr 127.0.0.1 is written to /sys/kernel/config/nvmet/ports/${port}/addr_traddr. If the hardware driver does not refer to it, it looks OK to see the address 1.1.1.5.
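For reference, a rough sketch of the configfs writes that a helper like _create_nvmet_port performs when setting up an RDMA port; the port number and address below are illustrative rather than taken from the test code:

    # Illustrative nvmet port set-up via configfs.
    port=1
    mkdir -p /sys/kernel/config/nvmet/ports/${port}
    echo ipv4    > /sys/kernel/config/nvmet/ports/${port}/addr_adrfam
    echo rdma    > /sys/kernel/config/nvmet/ports/${port}/addr_trtype
    echo 1.1.1.5 > /sys/kernel/config/nvmet/ports/${port}/addr_traddr
    echo 4420    > /sys/kernel/config/nvmet/ports/${port}/addr_trsvcid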

@kawasaki (Collaborator)

Also, it would be better to check that use_rxe and use_hw_rdma are not both set. Probably in start_rdma()?

Your change looks like it works. I ran with both use_rxe and use_hw_rdma enabled and saw the message printed in the failure log. But I have rethought this: now I think the check should be done in _nvme_requires(), because if both use_rxe and use_hw_rdma are set, the test cases should be skipped.

diff --git a/tests/nvme/rc b/tests/nvme/rc
index e22a73c..9ad3006 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -49,6 +49,9 @@ _nvme_requires() {
                elif [ -z "$use_hw_rdma" ]; then
                        _have_driver siw
                fi
+               if [[ -n "$use_rxe" && -n "$use_hw_rdma" ]]; then
+                       SKIP_REASONS+=("use_rxe and use_hw_rdma are both set")
+               fi
                ;;
        fc)
                _have_driver nvme-fc

Sorry for changing my mind. If you don't mind, I can make this change.
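A hedged usage sketch of how the _nvme_requires() variant above would surface to the user; the skip output is approximated from the result format shown earlier in this thread:

    # With the diff above applied, enabling both variables should skip the
    # test rather than fail it:
    use_rxe=1 use_hw_rdma=1 nvme_trtype=rdma ./check nvme/003
    # nvme/003 (test if we're sending keep-alives to a discovery controller) [not run]
    #     use_rxe and use_hw_rdma are both set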

@kawasaki (Collaborator)

Thanks @Kamalheib for updating the patch. Other than my comment above, your change looks good to me. As the next step, I posted a notice to the relevant mailing lists to get the attention of key kernel developers for this PR. Let's see what feedback we get from them.

@hreinecke (Contributor)

When using 'real' hardware you typically don't get to control the target; one calls 'nvme connect' and then has to live with whatever controller is configured on the other end.
Which means we will need two distinct functionalities:

a) make blktests run against a pre-configured controller:

  • allow specifying the controller/namespace via NQN and NSID
  • disable creation of internal targets

b) enable other transports (RDMA, RoCE, FC) to run on real hardware

This patch is just attempting b), but we really need to implement a) first for this to become useful.
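For illustration, attaching to such a pre-configured controller from the host side might look like the following; the address and NQN are placeholders rather than values from this PR:

    # Hypothetical example: connect to an externally managed NVMe-oF/RDMA
    # controller instead of creating an internal nvmet target.
    nvme connect --transport=rdma \
                 --traddr=1.1.1.5 --trsvcid=4420 \
                 --nqn=nqn.2014-08.org.example:pre-configured-subsys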

@kawasaki (Collaborator)

a) make blktests run against a pre-configured controller:

* allow specifying the controller/namespace via NQN and NSID

* disable creation of internal targets

b) enable other transports (RDMA, RoCE, FC) to run on real hardware

This patch is just attempting b), but we really need to implement a) first for this to become useful.

@hreinecke Thanks for the comment.

@Kamalheib What do you think? Your approach might be good for your use case, but it may not be good enough for other users.

I think Hannes' comment covers the nvme group. A similar discussion might apply to srp. I wonder which test group @Kamalheib is interested in: srp? nvme-rdma? Or both?

FYI, here's the link to the discussion on the list.

@kawasaki (Collaborator)

I suggest holding off on merging the commits of this PR at the moment.

As for nvme-rdma, @hreinecke suggested a different approach and opened PR #127 for it. Let's see how the new PR goes.

As for srp, the original author @bvanassche provided a comment. He pointed out that the srp test group was not designed to run with real RDMA hardware, and that it may be beyond the scope of blktests and increase the maintenance burden of blktests.

Regarding the scope and burden discussion, I do not yet have a clear answer. I think it depends on whether running blktests with real RDMA hardware can find kernel issues or not. @Kamalheib reported a few failures with real RDMA hardware, but they do not look like kernel issues.

If @Kamalheib or anyone else has found kernel issues with real RDMA hardware, we can revisit this PR.

@Kamalheib (Author)

It seems like the tests are using the interface IP address "1.1.1.5" for loopback:

[339197.231226] run blktests nvme/003 at 2023-10-23 10:21:28
[339197.286830] loop0: detected capacity change from 0 to 2097152
[339197.299931] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[339197.315478] nvmet_rdma: enabling port 0 (1.1.1.5:4420)
[339197.391014] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[339197.407846] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 1.1.1.5:4420
[339208.637374] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

Thanks for sharing the kernel messages. I wonder where the address 1.1.1.5 comes from. Is it set up for the real hardware RDMA driver? In _create_nvmet_port, the def_traddr 127.0.0.1 is written to /sys/kernel/config/nvmet/ports/${port}/addr_traddr. If the hardware driver does not refer to it, it looks OK to see the address 1.1.1.5.

1.1.1.5 is the IP address that I've configured on the network interface.
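For context, assigning that address to a RoCE-capable Ethernet port before a test run could look like this; the interface name is a placeholder:

    # Hypothetical set-up of the address seen in the logs above on the
    # RDMA-capable interface.
    ip addr add 1.1.1.5/24 dev ens1f0
    ip link set ens1f0 up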

@Kamalheib (Author)

I suggest holding off on merging the commits of this PR at the moment.

As for nvme-rdma, @hreinecke suggested a different approach and opened PR #127 for it. Let's see how the new PR goes.

As for srp, the original author @bvanassche provided a comment. He pointed out that the srp test group was not designed to run with real RDMA hardware, and that it may be beyond the scope of blktests and increase the maintenance burden of blktests.

Regarding the scope and burden discussion, I do not yet have a clear answer. I think it depends on whether running blktests with real RDMA hardware can find kernel issues or not. @Kamalheib reported a few failures with real RDMA hardware, but they do not look like kernel issues.

If @Kamalheib or anyone else has found kernel issues with real RDMA hardware, we can revisit this PR.

@kawasaki The idea behind this PR is the following:

1- Avoid the instability of soft-RoCE and soft-iWARP (soft-iWARP is more stable now).
2- Get the chance to find issues in the real hardware drivers.
3- Be able to run blktests over the available RDMA hardware in the RDMA cluster that we have here at Red Hat.

I totally understand the concerns and the feedback from the community, but most of the time people will use real hardware to leverage the features of RDMA.

Thanks,
Kamal

@bvanassche (Contributor)

1- Avoid the instability of soft-RoCE and soft-iWARP (soft-iWARP is more stable now).

I haven't encountered any stability issues with the soft-iWARP driver in the past few years. Did I perhaps miss something?

2- Get the chance to find issues in the real hardware drivers.
3- Be able to run blktests over the available RDMA hardware in the RDMA cluster that we have here at Red Hat.
I totally understand the concerns and the feedback from the community, but most of the time people will use real hardware to leverage the features of RDMA.

Let me repeat that testing RDMA adapters is out of scope for the SRP tests. The SRP tests have not been designed to test RDMA adapters. With the proposed changes the SRP tests only test a small subset of RDMA driver and adapter functionality, namely loopback support. These tests do not trigger any communication over the network ports of RDMA adapters.

@kawasaki (Collaborator) commented Nov 1, 2023

2- Get the chance to find issues in the real hardware drivers.
3- Be able to run blktests over the available RDMA hardware in the RDMA cluster that we have here at Red Hat.
I totally understand the concerns and the feedback from the community, but most of the time people will use real hardware to leverage the features of RDMA.

Let me repeat that testing RDMA adapters is out of scope for the SRP tests. The SRP tests have not been designed to test RDMA adapters. With the proposed changes the SRP tests only test a small subset of RDMA driver and adapter functionality, namely loopback support. These tests do not trigger any communication over the network ports of RDMA adapters.

From my point of view, blktests is the framework "for the Linux kernel block layer and storage stack", so RDMA adapters are out of blktests' scope. On the other hand, RDMA drivers could fall within the scope (Bart put them out of the design scope, though). The next question is this: is the blktests SRP group the best test framework for testing RDMA drivers? Maybe not, since it covers only the loopback part of the drivers, as Bart says. We need other tests to cover RDMA driver code thoroughly.

Having said that, I still think the blktests srp test group can work as an "integration test" to cover the integration between the RDMA drivers and the SRP driver. It does extend hardware driver coverage. Even though the extended coverage is small, I guess it still has a chance to find driver issues.

It does not sound like Kamal has found such issues yet, so we are guessing about the future. Will it find driver issues or not? (Which side would you bet on? :) Bart bears the SRP test group maintenance cost, and I know Bart's time is precious, so maybe I or Kamal should take on the additional cost. One idea is that I or Kamal keep a separate branch for some months so that Kamal can run the SRP tests with real RDMA adapters. If it catches any issues, we can prove the value of merging the code to master. Otherwise, maybe the integration part is stable enough. What do you think, Kamal?

(BTW, I'll take a one-week vacation starting tomorrow, so my responses will be slow)
