added support for multiple resource endpoints #26
Conversation
@zshi-redhat @tomo @dougbtv @fepan Please review the multiple endpoint features. It is a really good implementation by @ahalim-intel
Thanks for the PR, Abdul!
Adding a few comments inline from reviewing the code; may come back and add more after testing.
images/sriovdp-daemonset.yaml
Outdated
readOnly: true
- name: log
why is this removed?
Pod specs were changed during testing. Will add back.
	return
}
default:
	_, err := os.Lstat(sockPath)
Shall we consider moving to kubelet plugin watcher mode instead of watching the sriov endpoint file?
Definitely! This will reduce the burden of watching kubelet. Let's see if we can merge your kubelet plugin watcher PR with this implementation. This was implemented long before Kubelet plugin watcher support.
pkg/resources/pool.go
Outdated
type resourcePool struct {
	config     *types.ResourceConfig
	devices    map[string]*pluginapi.Device
	bindScript string
How will this bindScript be used? I didn't find an example for existing pools such as vfio, uio, generic, netDevice; is it for binding an interface to userspace drivers, like the dpdk_bind script?
Initially, I was thinking of doing the driver binding/unbinding with the dpdk-devbind.py script if that was needed. But this may add additional complexities and issues. It's better to leave the driver binding part outside the scope of the device plugin; this way the device plugin can stay as simple as possible. I left the binding code here in case we need it in the future; otherwise it can be removed.
	return newVfioResourcePool(rc)
case "uio":
	return newUioResourcePool(rc)
case "netdevice":
nit: the name 'netdevice' reminds me of ovs 'netdev' datapath_type, but it's intended for the kernel net devices.
can you suggest any alternative?
perhaps ethdevice or kerneldevice.
func (rf *resourceFactory) GetResourcePool(rc *types.ResourceConfig) types.ResourcePool {
	glog.Infof("Resource pool type: %s", rc.DeviceType)
	switch rc.DeviceType {
	case "vfio":
With 'vfio' or 'uio' DeviceType specified in the config file, do we assume that the DPDK port has already been bound to a userspace driver before launching the SR-IOV device plugin? Does it imply that SR-IOV CNI will not support binding of the DPDK driver?
> With 'vfio' or 'uio' DeviceType specified in the config file, do we assume that the DPDK port has already been bound to a userspace driver before launching the SR-IOV device plugin? Does it imply that SR-IOV CNI will not support binding of the DPDK driver?

That is correct. The admin binds VFs to the desired drivers (kernel network driver, igb_uio, or vfio-pci), then defines the resource config accordingly for the device plugin to export the correct device files. If a VF is bound to a userspace driver, then SRIOV-CNI has no job to do (so we could just skip it for now). It is up to the application to configure the port.
func (rs *resourceServer) triggerUpdate() {
	rp := rs.resourcePool
	if rs.checkIntervals > 0 {
Is it possible to capture the notification of a PF link state change and then trigger the update accordingly? With checkIntervals, there might be a short period when the link state has changed but kubelet doesn't have the updated state and still allocates unhealthy devices to pods.
We could have a file watcher that watches the operstate. Let's open an issue on this later, once this PR is merged.
Yep, something like a watcher or a blocking wait to detect the notification immediately when the state changes.
Cool, please submit a PR with an implementation if you can; that would be great. Thanks!
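As a minimal sketch of the operstate idea discussed above: a helper that reads a PF's sysfs operstate and maps it to a device health value, which a watcher (or the existing interval poller) could call whenever the state changes. The function names are illustrative assumptions, not the plugin's actual API; the "Healthy"/"Unhealthy" strings match the kubelet device plugin convention.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// healthFromOperstate maps the contents of /sys/class/net/<ifname>/operstate
// to a device-plugin health string ("Healthy" / "Unhealthy").
func healthFromOperstate(state string) string {
	if strings.TrimSpace(state) == "up" {
		return "Healthy"
	}
	return "Unhealthy"
}

// readOperstate reads the kernel's operstate file for a given PF interface.
// A file watcher could invoke this on every change event instead of polling.
func readOperstate(ifname string) (string, error) {
	b, err := os.ReadFile("/sys/class/net/" + ifname + "/operstate")
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}

func main() {
	fmt.Println(healthFromOperstate("up"))   // Healthy
	fmt.Println(healthFromOperstate("down")) // Unhealthy

	// Best-effort demo against a real interface, if one exists on this host.
	if state, err := readOperstate("eth0"); err == nil {
		fmt.Println("eth0:", state, "=>", healthFromOperstate(state))
	}
}
```

On a state change the server would then mark all VFs of that PF with the new health and re-send the device list to kubelet, instead of waiting for the next checkIntervals tick.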
Thanks for the review @zshi-redhat! New changes on the way!
4902f09 → b592721
README.md
Outdated
This plugin was tested in the following environment:
- CentOS 7 - Kernel 4.13.7-1.el7.elrepo.x86_64
- Go - version go1.10.3 linux/amd64
- Kubernetes v1.12.0-alpha
Is there special consideration on which kube releases we will support or be compatible with going forward?
For some features like the kubelet plugin watcher, which is Beta in 1.12, switching to it in the DP might cause incompatibility with previous kube releases, or require some extra configuration (enabling feature gates) to work with previous releases.
It really depends on K8s device plugin feature support; our plugin should work with any k8s release that supports device plugins (without any dramatic changes, of course). The kubelet plugin watcher is still backward compatible. Later we should definitely switch to the plugin watcher mechanism.
Here we are simply mentioning the versions we have tested with this implementation.
The type extracts PCI IDs from the SRIOV-VF-PCI-ADDR environment variable set by the SR-IOV CNI plugin, and configures hostdev libvirt devices that correspond to the extracted IDs. The change adds host mounts for /sys and /dev to allow containerized libvirt/qemu to plug PCI devices using the vfio kernel interface. Note that at this moment, SR-IOV enabled VMIs run their virt-launcher pods privileged to allow qemu to open /dev/vfio/NN devices. We don't know the name of the device in advance until we create and start the pod, at which point the SR-IOV DP allocates a PCI ID to the pod that can be mapped to its IOMMU group number and hence its /dev/vfio/NN device. In the future, the SR-IOV DP will register the /dev/vfio/NN device with the device cgroup, at which point we will be able to drop the privileged mode. (Additional capabilities like SYS_RESOURCE and SYS_RAWIO are still needed.) This work is tracked in: k8snetworkplumbingwg/sriov-network-device-plugin#26
a8bd563 → 8326564
This change adds the following new features to sriov-device-plugin:
- Handles SRIOV capable/not-capable devices (NICs and accelerators alike)
- Supports devices with both kernel and userspace (UIO and VFIO) drivers
- Supports PFs bound to the DPDK driver to meet certain use cases
- Allows grouping multiple PCI devices together as one aggregated resource pool
- Can represent each PF as a separately addressable resource pool to K8s
- User-configurable resourceName
- Detects Kubelet restarts and auto-re-registers
- Detects link status (for Linux network devices) and updates associated VFs' health accordingly
- Extensible to support new device types with minimal effort, if not already supported

Change-Id: I314e1ea75e647e747bcfdb2f447e593adbcd14c5
Change-Id: I802e3d8bf2893f4975483d68da459f49c9f7bb80
Change-Id: I46fc648f1b400bb55f207d5a24cf91571dbd1a87
Updated devices were appended instead of creating a new list. This is corrected. Change-Id: Ia10c53209872cd691d4c83fe8cbfdc4e91650693
fbf9b83 → 6b5d1bc
for _, container := range rqt.ContainerRequests {
	containerResp := new(pluginapi.ContainerAllocateResponse)
	containerResp.Devices = rs.resourcePool.GetDeviceSpecs(rs.resourcePool.GetDeviceFiles(), container.DevicesIDs)
	containerResp.Envs = rs.resourcePool.GetEnvs(rs.resourcePool.GetResourceName(), container.DevicesIDs)
Previously the env var had the hard-coded name 'SRIOV-VF-PCI-ADDR', which an application in the Pod could use to find out which DeviceIDs had been allocated. With the above change, the application doesn't know which env name to query for the DeviceIDs, since the resourceName is unknown to the Pod. The Pod could try to get the resourceName by querying the network CRD attached to it, but I don't think this is a good option, as it basically requires each individual pod to access the system network CRD; plus, the resourceName in the env (sriov) only contains the suffix of the actual full resourceName (intel.com/sriov).
Perhaps we can keep the hardcoded env name as it is and make the resourceName+DeviceID a dict value of the env.
If the network attachment CRD is known to the Pod, then the associated resourceName is also attainable by looking at the CRD. Maybe it's something for a discussion.

> Perhaps we can keep the hardcoded Env name as it is and make the resourceName+DeviceID a dict value of Env.

Can you give an example of how it should look?
> If the network attachment CRD is known to the Pod, then the associated resourceName is also attainable by looking at the CRD. Maybe it's something for a discussion.

The resourceName returned in "containerResp.Envs" only contains the suffix (e.g. sriov); in multiple-endpoints mode, how does the Pod get to know which prefix is used (assuming multiple prefixes are advertised)?

> Perhaps we can keep the hardcoded Env name as it is and make the resourceName+DeviceID a dict value of Env.
> Can you give an example of how it should look?

For example:
Env['SRIOV-VF-PCI-ADDR'] = [{"resourceName": "intel.com/sriov", "deviceIDs": "00:01.0 00:01.1"}, {"resourceName": "example.com/sriov", "deviceIDs": "00:02.0 00:02.1"}]
I guess we need good documentation and conventions to follow in such a case.

> For example:
> Env['SRIOV-VF-PCI-ADDR'] = [{"resourceName": "intel.com/sriov", "deviceIDs": "00:01.0 00:01.1"}, {"resourceName": "example.com/sriov", "deviceIDs": "00:02.0 00:02.1"}]

It's good to have a known environment variable, but now we run into another problem: this env variable will be overwritten by the most recent device allocate request. Consider this: a Pod has two VF requests from two different resource pools, sriovA and sriovB.
Say the sriovA allocate request came in first, so the env variable is set to:
SRIOV-VF-PCI-ADDR=[{"resourceName": "intel.com/sriovA", "deviceIDs": "00:01.0"}]
Then the sriovB request came in, and we would expect to see:
SRIOV-VF-PCI-ADDR=[{"resourceName": "intel.com/sriovA", "deviceIDs": "00:01.0"},{"resourceName": "intel.com/sriovB", "deviceIDs": "02:00.1"}]
Right?
But what we will actually get is:
SRIOV-VF-PCI-ADDR=[{"resourceName": "intel.com/sriovB", "deviceIDs": "02:00.1"}]
That's because there is no way of knowing whether an env var already has a value or not; when we set it, we are simply setting a new value. That's why it makes sense to make the env variable name unique per resourceName.
@ahalim-intel you're right, I didn't think of that overwrite problem. Then the question remains: how can the application running in the Pod get the PCI address allocated to it? This is a real question from the kubevirt community, where they use the current env['SRIOV-VF-PCI-ADDR'] to get the PCI address and pass that through to the VM.
I'm not sure it's a good design for the Pod application to query the API and parse the network CRD in order to retrieve the resourceName information; it sounds like too much overhead for the application. Can you think of a better way for the application to get the resourceName?
I'm open to better alternatives.
@ahalim-intel at the very least, can we use full resource names (with the intel.com/ prefix) for the variable names? This will at least help us avoid assuming the prefix is intel.com/, or that it is present and needs to be dealt with at all. It will also allow other DPs to mimic the Intel DP behavior if they need to, without modifications to pod apps.
One thing we can do on the kubevirt side is to precalculate the interface-to-resourceName mapping when creating a pod and pass it out of band (an environment variable with a well-known name?) into the pod. The pod will then extract the mapping and use it to decide which SR-IOV DP managed environment variable to use to extract a PCI ID for each VM interface.
We may drop the intel.com/ prefix and construct the PCI ID variable name ourselves, but it would be slightly easier for us (and more compatible with any other solutions that may pop up for device management) if we keep the full resource name in the variables. I understand that environment variables can't have slashes, so perhaps we can replace them with a different character (underscore?).
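The underscore-substitution idea can be sketched as a small helper that turns a full extended resource name into a legal environment variable name. The exact convention (upper-casing, which characters get replaced) is an assumption here for illustration, not a settled decision in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarName converts a full extended resource name such as
// "intel.com/sriov" into a legal environment variable name.
// Slashes, dots, and dashes are not allowed in env var names,
// so they are replaced with underscores and the result is
// upper-cased. This scheme is a sketch, not the plugin's
// final naming convention.
func envVarName(resourceName string) string {
	r := strings.NewReplacer("/", "_", ".", "_", "-", "_")
	return strings.ToUpper(r.Replace(resourceName))
}

func main() {
	fmt.Println(envVarName("intel.com/sriov"))    // INTEL_COM_SRIOV
	fmt.Println(envVarName("example.com/sriovA")) // EXAMPLE_COM_SRIOVA
}
```

Because each resource pool gets its own variable name, allocations from sriovA and sriovB no longer overwrite each other, and an application can derive the variable to read from the full resource name in the network CRD without assuming any particular prefix.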
Alternatively, Multus could pass the network-to-resourceName mapping into the pod (todo: check if CNI API supports it). Then we would have the mapping defined in all pods, not just kubevirt virt-launcher pods.
I was looking at the CNI API description today, specifically at the Result format: https://github.com/containernetworking/cni/blob/master/SPEC.md#result and there seems to be no official back-channel from CNI to the pod to pass environment variables or anything like that, so Multus doesn't seem to be a good place for the mapping calculation as I originally suggested. On the other hand, the SR-IOV DP is not the right place either, since it doesn't deal with CRDs.
But I've heard there are plans to add a new mutating hook for SR-IOV DP that would fill in requests / limits based on linked network CRDs. The same hook could also calculate and pass the mapping, via an environment variable or something else.
Mapping each PCI device to its network sounds useful for other purposes, but I'm not sure why it is relevant here. A pod application can easily iterate over all
P.S. the choice of env var name should allow allocation of two devices of the same resourceName. A Pod may very well request access to vlan1 and vlan2 on sriovA.
@dankenigsberg @booxter Thanks for your comments. There are plans to add the allocated device ID to the network status as part of the Result, so this information will be available from the Pod status. But for applications to get this info, they will need to communicate with the API server; an env variable is a convenient way of getting this information. It would be useful to explore the possibility of utilizing the same hook to have this resourceName env translated to a well-known name. Then again, we run into the same problem discussed above with multiple resources. For now, I think we could include some common prefix such as:
I think that
Actually, we don't have the overwrite problem now as it is (current implementation). That is because we have a unique variable name (resourceName). For multiple resource requests with the same resourceName, the PCI addresses will be appended to the same variable's value. With the one you described we will get 2 variables: Also, most likely ':' and '.' are not allowed in var names, so those need to be taken care of too.
@dankenigsberg @ahalim-intel the issue with the <common_prefix>_ naming scheme is that you don't know which Multus network each PCI ID belongs to (and with heterogeneous nodes this becomes important). Having the resourceName encoded in variable names is definitely the way to go. In the KubeVirt context, we still miss the link between the Multus networkName and the resourceName. While we can calculate and pass it into the pod in KubeVirt, I was wondering if implementing it in a broader context for all pods would be more helpful. Sadly, the CNI API doesn't seem to provide a way to communicate such a mapping back to the pod. I think we could have a mutating hook that would calculate the mapping and pass it through a variable on the pod. I will write a PoC showing the solution for KubeVirt, and we can discuss from there whether something similar should be implemented in a more general context.
I understand why matching a network device to the network it connects to is interesting. But it is just as interesting for non-sriov networks. Hence I don't think it should delay this PR, or merit a complex solution such as calling the kAPI from inside each Pod. I don't understand the need for a kubevirt PoC. For kubevirt, you can match a PCI device based on the MAC address that sriov-cni set on it. But I'll wait and see.
@dankenigsberg I don't think we should delay the PR (except for extending per-pool variable names to the full resource name, with the intel.com/ prefix). Yes, the mapping between networks and resource names is useful beyond this particular DP. Which component would serve it to pods? Multus? @dankenigsberg please explain your solution for the KubeVirt-specific concern. How do we know the MAC address allocated by the CNI to an interface?
@rkamudhan AFAIK we still use a shortened resourceName (without the intel.com prefix) as the variable name. If we'd like to have it extendable to other potential solutions, it would be nice if the prefix could be part of the variable name, e.g. INTEL_COM_sriov_vfio=,,... instead of the current sriov_vfio=,,... Otherwise we are left to assume the intel.com/ prefix in annotations and cut it off before accessing the variable.
Change-Id: Ieda50d0eff71df7869dcec4533d3c16bd54fd275
e865f2c → aded17b
* added support for multiple resource endpoints
  This change adds the following new features to sriov-device-plugin:
  - Handles SRIOV capable/not-capable devices (NICs and accelerators alike)
  - Supports devices with both kernel and userspace (UIO and VFIO) drivers
  - Supports PFs bound to the DPDK driver to meet certain use cases
  - Allows grouping multiple PCI devices together as one aggregated resource pool
  - Can represent each PF as a separately addressable resource pool to K8s
  - User-configurable resourceName
  - Detects Kubelet restarts and auto-re-registers
  - Detects link status (for Linux network devices) and updates associated VFs' health accordingly
  - Extensible to support new device types with minimal effort, if not already supported
  Change-Id: I314e1ea75e647e747bcfdb2f447e593adbcd14c5
* changes based on review
  Change-Id: I802e3d8bf2893f4975483d68da459f49c9f7bb80
* corrected network annotation for pod-tc4.yaml
  Change-Id: I46fc648f1b400bb55f207d5a24cf91571dbd1a87
* fixed updated device list on health change
  Updated devices were appended instead of creating a new list. This is corrected.
  Change-Id: Ia10c53209872cd691d4c83fe8cbfdc4e91650693
* minor fixes and typos
  Change-Id: Ieda50d0eff71df7869dcec4533d3c16bd54fd275
TODO: extract resourceName from CRD. TODO: keep it backwards compatible with previous DP version. TODO: tests. This is to align kubevirt with support for multiple allocation pools in SR-IOV DP: k8snetworkplumbingwg/sriov-network-device-plugin#26
TODO: grant access to networks to virt-controller. TODO: keep it backwards compatible with previous DP version. TODO: handle errors from network client. TODO: fix tests. Consider: - cache networks. This is to align kubevirt with support for multiple allocation pools in SR-IOV DP: k8snetworkplumbingwg/sriov-network-device-plugin#26
This patch adopts changes from the latest SR-IOV device plugin master [1] that allow us to attach a VMI to multiple SR-IOV Multus networks. (Before the change, devices allocated for multiple networks could end up in different boot order than configured.) Among other things, this patch adds read access to network attachment definitions to virt-controller role. This is to allow the controller to fetch network CRD objects and determine their respective resource names. For this same reason, it extends KubevirtClient object to include NetworkClient(). We do *not* maintain backwards compatibility with older SR-IOV device plugin releases. This is as intended because SR-IOV support is still considered experimental, and there is no good reason for us to support the device plugin release that does not support multiple networks. Updated documentation to reflect the fact that kubevirt is now compatible with device plugin master (and *not* compatible with v1.0). todo: fix tests [1] k8snetworkplumbingwg/sriov-network-device-plugin#26
This patch adopts changes from the latest SR-IOV device plugin master [1] that allow us to attach a VMI to multiple SR-IOV Multus networks. (Before the change, devices allocated for multiple networks could end up in different boot order than configured.) Among other things, this patch adds read access to network attachment definitions to virt-controller role. This is to allow the controller to fetch network CRD objects and determine their respective resource names. For this same reason, it extends KubevirtClient object to include NetworkClient(). We do *not* maintain backwards compatibility with older SR-IOV device plugin releases. This is as intended because SR-IOV support is still considered experimental, and there is no good reason for us to support the device plugin release that does not support multiple networks. Updated documentation to reflect the fact that kubevirt is now compatible with device plugin master (and *not* compatible with v1.0). todo: get rid of vendor/* change [1] k8snetworkplumbingwg/sriov-network-device-plugin#26
This patch adopts changes from the latest SR-IOV device plugin master [1] that allow us to attach a VMI to multiple SR-IOV Multus networks. (Before the change, devices allocated for multiple networks could end up in different boot order than configured.) Among other things, this patch adds read access to network attachment definitions to virt-controller role. This is to allow the controller to fetch network CRD objects and determine their respective resource names. For this same reason, it extends KubevirtClient object to include NetworkClient(). We do *not* maintain backwards compatibility with older SR-IOV device plugin releases. This is as intended because SR-IOV support is still considered experimental, and there is no good reason for us to support the device plugin release that does not support multiple networks. Updated documentation to reflect the fact that kubevirt is now compatible with device plugin master (and *not* compatible with v1.0). todo: get rid of vendor/* change [1] k8snetworkplumbingwg/sriov-network-device-plugin#26
The type extracts PCI IDs from the SRIOV-VF-PCI-ADDR environment variable set by the SR-IOV CNI plugin and configures libvirt hostdev devices that correspond to the extracted IDs. The change mounts additional host mounts for /sys and /dev to allow containerized libvirt / qemu to plug PCI devices using the vfio kernel interface. Note that at this moment, SR-IOV enabled VMIs run their virt-launcher pods privileged to allow qemu to open /dev/vfio/NN devices. We don't know the name of the device in advance: only after we create and start the pod does the SR-IOV DP allocate a PCI ID to the pod, which can then be mapped to its IOMMU group number and hence its /dev/vfio/NN device. In the future, the SR-IOV DP will register the /dev/vfio/NN device with the device cgroup, at which point we will be able to drop the privileged mode. (Additional capabilities like SYS_RESOURCE and SYS_RAWIO are still needed.) This work is tracked in: k8snetworkplumbingwg/sriov-network-device-plugin#26
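The PCI-ID-to-/dev/vfio/NN mapping described above goes through sysfs: each PCI device directory carries an `iommu_group` symlink whose target ends in the group number. A small sketch of that last translation step (the symlink-target format is the standard sysfs layout; in the real flow the target would come from `os.Readlink` on `/sys/bus/pci/devices/<pci-addr>/iommu_group`):

```go
package main

import (
	"fmt"
	"path"
)

// vfioDevFromGroupLink maps the target of a PCI device's sysfs
// iommu_group symlink (e.g. "../../../kernel/iommu_groups/42") to the
// VFIO character device qemu needs to open ("/dev/vfio/42").
func vfioDevFromGroupLink(linkTarget string) string {
	// The last path component of the symlink target is the group number.
	group := path.Base(linkTarget)
	return "/dev/vfio/" + group
}

func main() {
	// Example symlink target as read from sysfs for an allocated VF.
	fmt.Println(vfioDevFromGroupLink("../../../kernel/iommu_groups/42"))
}
```

Until the device plugin registers this device with the device cgroup, the pod must be privileged for qemu to open it, as the commit notes.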
These changes will add the following new features to the SR-IOV device plugin
Change-Id: I314e1ea75e647e747bcfdb2f447e593adbcd14c5