Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to virtio-vDPA devices #2664

Merged
merged 3 commits into from
Jan 13, 2023
Merged

Add support to virtio-vDPA devices #2664

merged 3 commits into from
Jan 13, 2023

Conversation

amorenoz
Copy link
Contributor

- What this PR does and why is it needed

This PR works in combination with this PR in the SR-IOV Network Device Plugin to provide support for vDPA devices.

This functionality is based on OvS Hardware Offload and allows ovn-kubernetes to expose an open standard netdev such as a virtio-net to the pod.

Most of the heavy-lifting (detecting vdpa devices, ensuring they are bound the right driver, etc) is done by the SR-IOV Network Device Plugin. From a ovn-kubernetes perspective, it's just a matter of selecting the right netdev to move to the pod's namespace.

- Special notes for reviewers

I would like feedback on a number of topics:

  • Right now, this PR just detects if the PCI device has a virtio-vdpa device using govdpa and exposes the virtio-net netdev if so. It is my understanding that supported cards, even if they allow the creation of two simultaneous netdevs (the vendor one and the virtio-vdpa device), the vdpa netdev takes precedence if it exists. Therefore, this logic would be sane.

If that is not the case we would need to add support to the Device Information Spec so we can read the information coming from the Device Plugin and unambiguously determine the type of device we're handling

  • I've placed the code that returns the virtio-net netdev in pkg/util/sriovnet_linux.go. It might not be the perfect place as it seems more of a wrapper around https://github.com/Mellanox/sriovnet. So this is a question for @moshe010 and @adrianchiris: WDYT? Would you accept this logic into https://github.com/Mellanox/sriovnet?

- How to verify it

You need a working OvS Hardware Offload setup with a ConnectX6-DX NIC. Then, follow the instructions that are added in docs/ovs_offload.md as part of this PR to configure virtio-vDPA devices.

- Description for the changelog

Add support for virtio-vdpa devices

@amorenoz
Copy link
Contributor Author

/cc @zshi-redhat

@adrianchiris
Copy link
Contributor

ve placed the code that returns the virtio-net netdev in pkg/util/sriovnet_linux.go. It might not be the perfect place as it seems more of a wrapper around https://github.com/Mellanox/sriovnet. So this is a question for @moshe010 and @adrianchiris: WDYT? Would you accept this logic into https://github.com/Mellanox/sriovnet?

i think it makes sense. maybe govdpa should go to npwg.

@amorenoz
Copy link
Contributor Author

ve placed the code that returns the virtio-net netdev in pkg/util/sriovnet_linux.go. It might not be the perfect place as it seems more of a wrapper around https://github.com/Mellanox/sriovnet. So this is a question for @moshe010 and @adrianchiris: WDYT? Would you accept this logic into https://github.com/Mellanox/sriovnet?

i think it makes sense. maybe govdpa should go to npwg.

Agree, wasn't there an initiative to move some common stuff to a library in npwg that could be used by more projects?
I can propose govdpa to npwg in the next meeting

Should sriovnet also go to npwg too?
In any case I'll send a PR to add this part of vdpa code into sriovnet

@coveralls
Copy link

coveralls commented Nov 27, 2021

Coverage Status

Coverage decreased (-0.09%) to 53.176% when pulling 4b108c3 on amorenoz:vdpa into 1a2bdaf on ovn-org:master.

@amorenoz amorenoz changed the title RFC: Add support to virtio-vDPA devices Add support to virtio-vDPA devices Dec 14, 2021
@amorenoz
Copy link
Contributor Author

Hi @adrianchiris , as we discussed, having sriovnet depend on govdpa does not make much sense since the goal of vdpa is to (eventually) work with all possible vdpa parents (sriov is just one of them).

I've moved the govdpa-related stuff to an independent vdpa_linux.go wrapper so it can be mocked and added unit tests. PTAL

@amorenoz
Copy link
Contributor Author

Rebased on top of @girishmg 's latest change: 60eeb3a

@amorenoz
Copy link
Contributor Author

amorenoz commented Apr 7, 2022

@girishmg @dcbw @danwinship any comments on this?

@dcbw
Copy link
Contributor

dcbw commented Jul 12, 2022

@amorenoz could you rebase the PR? Looks OK to me, just needs the rebase.

@lmilleri
Copy link
Contributor

@dcbw I rebased and pushed the PR again, but got an error in the ovn-ci / e2e-dual-conversion. Hope this is an expected failure and not due to my changes?

@zshi-redhat
Copy link
Contributor

/retest

@ovn-robot
Copy link
Collaborator

Oops, something went wrong:

Unable to retry this workflow run because it was created over a month ago

@zshi-redhat
Copy link
Contributor

Oops, something went wrong:

Unable to retry this workflow run because it was created over a month ago

Interesting, retest doesn't work for long living PR, @dcbw do you know how to rerun the test w/o creating a new PR (the CI failure doesn't seem to be related to the vdpa change)?

@@ -497,6 +497,8 @@ github.com/juju/utils v0.0.0-20180808125547-9dfc6dbfb02b/go.mod h1:6/KLg8Wz/y2KV
github.com/juju/version v0.0.0-20161031051906-1f41e27e54f2/go.mod h1:kE8gK5X0CImdr7qpSKl3xB2PmpySSmfj7zVbkZFs81U=
github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w=
github.com/julienschmidt/httprouter v1.3.0/go.mod h1:JR6WtHb+2LUe8TCKY3cZOxFyyO8IZAc4RVcycCCAKdM=
github.com/k8snetworkplumbingwg/govdpa v0.1.3 h1:FZRhTMB1e3yWwSEy+l4eS73WioyMaL+vmFZ8JNwn+Uk=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmilleri I vaguely remembered that you mentioned we need latest changes in govdpa lib, but this is still using 0.1.3 version, is this expected?

@trozet
Copy link
Contributor

trozet commented Sep 23, 2022

Oops, something went wrong:

Unable to retry this workflow run because it was created over a month ago

Interesting, retest doesn't work for long living PR, @dcbw do you know how to rerun the test w/o creating a new PR (the CI failure doesn't seem to be related to the vdpa change)?

@zshi-redhat I can restart it, but i think it wont restart now because there are conflicting files. If this PR is still important can you rebase @amorenoz and we will review.

@zshi-redhat
Copy link
Contributor

@lmilleri could you please rebase?

@lmilleri
Copy link
Contributor

@zshi-redhat there are some changes in the govdpa library under review: k8snetworkplumbingwg/govdpa#15.
I think it would make sense to push the above PR first, create a new version and then rebase this PR afterwards

Add support to sriov-based virtio-vdpa devices.

When a sr-iov device has a vdpa device created, and it is bound to
virtio_vdpa driver, use the associated virtio-net device instead of the
vendor one.

In order to determine whether a device is vdpa or not as well as its
device path, use external
library github.com/k8snetworkplumbingwg/govdpa

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
vdpa is an addition on top of ovs_offload so it makes sense to document
it as part of it.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
@lmilleri
Copy link
Contributor

lmilleri commented Nov 4, 2022

Rebased from master and updated dependency to the newly govdpa v0.1.4 release.
Please review

@lmilleri
Copy link
Contributor

lmilleri commented Nov 8, 2022

Please consider this PR on hold, I found a potential issue in govdpa v0.1.4

Signed-off-by: Leonardo Milleri <lmilleri@redhat.com>
@lmilleri
Copy link
Contributor

Fixed integration issue with govdpa v.0.1.4.
Please review.

@zshi-redhat @dcbw

@zshi-redhat
Copy link
Contributor

/retest

@ovn-robot
Copy link
Collaborator

Oops, something went wrong:

Unable to retry this workflow run because it was created over a month ago

@lmilleri
Copy link
Contributor

lmilleri commented Dec 2, 2022

@trozet @dcbw do you think this PR can be merged?
If not, please advise on the way forward.
Apparently "/retest" is not working, any workaround?

@amorenoz
Copy link
Contributor Author

@trozet @zshi-redhat @dcbw Can you please take a look at this PR?

@dcbw
Copy link
Contributor

dcbw commented Jan 4, 2023

/lgtm

mostly just helper code

@lmilleri
Copy link
Contributor

lmilleri commented Jan 9, 2023

Yes, it is mostly helper code apart from go-controller/pkg/cni/cni.go, in which resides the logic for returning the virtio/vdpa device if vdpa device exists.

@lmilleri
Copy link
Contributor

@girishmg can you please review it?

@trozet trozet merged commit 57bcf26 into ovn-org:master Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants