Adding RDMA support documentation for Intel VFs #482

Merged: 2 commits merged into k8snetworkplumbingwg:master on May 22, 2023

Conversation

Eoghan1232
Collaborator

published docker image available:
RHEL: https://hub.docker.com/r/intel/rdma_rhel_intel

Signed-off-by: Eoghan Russell <eoghan.russell@intel.com>
@coveralls
Collaborator

coveralls commented May 12, 2023

Pull Request Test Coverage Report for Build 5001228871

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 78.339%

Totals
  • Change from base Build 4925778559: 0.0%
  • Covered Lines: 1924
  • Relevant Lines: 2456

💛 - Coveralls

@SchSeba
Collaborator

SchSeba commented May 16, 2023

Hi @Eoghan1232,

Thanks for the PR!
Question: do we need any changes in the device plugin code?
Can you please run env inside the container? I would like to see the mounts we add for the Intel RDMA.
Can we use the device in that mode for DPDK applications, or does the VF still need to be bound to the VFIO driver to work with DPDK?

Last question: what is the use case for RDMA with Intel? For mlx we need this for DPDK, for example.

@Eoghan1232
Collaborator Author

Hi @SchSeba!
No code changes are needed to the device plugin; it works as is.
Sure, here is the output of env:

root@testpod1:/# env
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
HOSTNAME=testpod1
PWD=/
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE=0000:5e:01.1
HOME=/root
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
TERM=xterm
SHLVL=1
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env

This is for enablement of use cases such as kernel-based networking (as opposed to user space/DPDK).
The device would still need to be bound to VFIO in order to use DPDK.

The use case is consistent enablement of the hardware offload features available on Intel Network Adapters.
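
To make the VFIO point concrete, here is a minimal sketch (not part of this PR) of how the VF's driver binding could be checked and switched by hand on the host. The PCI address 0000:5e:01.1 is taken from the env output above; in practice the SR-IOV operator / device plugin configuration normally selects the driver rather than doing this manually:

# Check which driver the VF is currently bound to.
readlink /sys/bus/pci/devices/0000:5e:01.1/driver
#   .../drivers/iavf     -> kernel networking / RDMA
#   .../drivers/vfio-pci -> DPDK

# Generic sysfs way to rebind the VF to vfio-pci for DPDK (assumes the vfio-pci module is loaded).
echo 0000:5e:01.1 > /sys/bus/pci/devices/0000:5e:01.1/driver/unbind
echo vfio-pci     > /sys/bus/pci/devices/0000:5e:01.1/driver_override
echo 0000:5e:01.1 > /sys/bus/pci/drivers/vfio-pci/bind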

@SchSeba
Collaborator

SchSeba commented May 16, 2023

This is weird, what version of the device plugin are you using?
I don't see the new INFO environment variable with all the mounts.

@Eoghan1232
Collaborator Author

I am currently running an older version of the DP on my system, not the latest.
I can redeploy the latest DP and get the output of env for you.

@SchSeba
Collaborator

SchSeba commented May 16, 2023

That would be great, I would like to see the RDMA mount points that we mount into the pod.

@Eoghan1232
Collaborator Author

@SchSeba

root@testpod2:/# env
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
HOSTNAME=testpod2
PWD=/
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE=0000:5e:01.1
HOME=/root
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
TERM=xterm
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE_INFO={"0000:5e:01.1":{"generic":{"deviceID":"0000:5e:01.1"},"rdma":{"rdma_cm":"/dev/infiniband/rdma_cm","uverbs":"/dev/infiniband/uverbs3"}}}
SHLVL=1
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env
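
For illustration only (not claimed by the PR itself): inside the pod, the char devices listed in the *_INFO variable can be cross-checked against what was actually mounted, assuming jq is available in the image:

# Pull the RDMA mount info for the allocated VF out of the injected JSON.
echo "$PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE_INFO" | jq '.["0000:5e:01.1"].rdma'
#   {
#     "rdma_cm": "/dev/infiniband/rdma_cm",
#     "uverbs": "/dev/infiniband/uverbs3"
#   }

# Confirm the corresponding character devices are present in the pod.
ls -l /dev/infiniband/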


## RDMA modules:
* Mellanox ConnectX®-4 Lx, ConnectX®-5 Adapters: mlx5_core or mlx5_ib
* Intel E810-C Adapter: ice and iavf

## Privileges
The IPC_LOCK capability is required for RDMA applications to function properly in a Kubernetes Pod.
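
As a hedged illustration of the two sections above (not text from the PR), the module list can be verified on the host and the capability inside the pod roughly as follows; irdma is an assumption here, as the upstream RDMA driver that sits on top of ice for E810:

# On the host: confirm the relevant modules are loaded and an RDMA device exists.
lsmod | grep -E '^(ice|iavf|irdma)'   # irdma is an assumption, not listed in the doc
ls /sys/class/infiniband/

# Inside the pod: check that CAP_IPC_LOCK (capability number 14) is in the effective set.
grep CapEff /proc/self/status
capsh --decode=$(awk '/CapEff/ {print $2}' /proc/self/status)   # if capsh is installed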
Contributor

@adrianchiris adrianchiris May 17, 2023

Under the RDMA Mounts section, can you add a note:

Note: RDMA character devices mounted under /dev/infiniband may vary depending on the vendor and the loaded kernel modules.

And while at it, if you could also remove the line:

The digit after the file name is the index of the VF

it would be great, as these numbers are just a "char device index" which gets incremented as new devices are added; it is not correlated to the VF index.
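
To illustrate the reviewer's point (a sketch, not part of the docs change): instead of relying on the digit, each uverbs char device can be mapped back to its PCI function through sysfs:

# Map every uverbsN device to the PCI address it belongs to.
for u in /sys/class/infiniband_verbs/uverbs*; do
    echo "$(basename "$u") -> $(basename "$(readlink -f "$u/device")")"
done
# e.g. uverbs3 -> 0000:5e:01.1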

Collaborator Author

Thanks for the feedback @adrianchiris, I've addressed this now.

Contributor

@adrianchiris adrianchiris left a comment

A couple of nits, otherwise LGTM!

@SchSeba SchSeba merged commit 82a61f3 into k8snetworkplumbingwg:master May 22, 2023