[libvirt PATCH] RFC: Add support for vDPA network devices

Laine Stump laine at redhat.com
Thu Aug 20 22:56:48 UTC 2020


On 8/18/20 2:37 PM, Jonathon Jongsma wrote:
> vDPA network devices allow high-performance networking in a virtual
> machine by providing a wire-speed data path. These devices require a
> vendor-specific host driver but the data path follows the virtio
> specification.
>
> The support for vDPA devices was recently added to qemu. This allows
> libvirt to support these devices. It requires that the device is
> configured on the host with the appropriate vendor-specific driver.
> This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That
> chardev path can then be used to define a new interface with
> type='vdpa'.
> ---
>   docs/formatdomain.rst                         | 20 +++++++++
>   docs/schemas/domaincommon.rng                 | 15 +++++++
>   src/conf/domain_conf.c                        | 41 +++++++++++++++++++
>   src/conf/domain_conf.h                        |  4 ++
>   src/conf/netdev_bandwidth_conf.c              |  1 +
>   src/libxl/libxl_conf.c                        |  1 +
>   src/libxl/xen_common.c                        |  1 +
>   src/lxc/lxc_controller.c                      |  1 +
>   src/lxc/lxc_driver.c                          |  3 ++
>   src/lxc/lxc_process.c                         |  1 +
>   src/qemu/qemu_command.c                       | 29 ++++++++++++-
>   src/qemu/qemu_command.h                       |  3 +-
>   src/qemu/qemu_domain.c                        |  6 ++-
>   src/qemu/qemu_hotplug.c                       | 15 ++++---
>   src/qemu/qemu_interface.c                     | 25 +++++++++++
>   src/qemu/qemu_interface.h                     |  2 +
>   src/qemu/qemu_process.c                       |  1 +
>   src/qemu/qemu_validate.c                      |  1 +
>   src/vmx/vmx.c                                 |  1 +
>   .../net-vdpa.x86_64-latest.args               | 37 +++++++++++++++++
>   tests/qemuxml2argvdata/net-vdpa.xml           | 28 +++++++++++++
>   tests/qemuxml2argvmock.c                      | 11 ++++-
>   tests/qemuxml2argvtest.c                      |  1 +
>   tests/qemuxml2xmloutdata/net-vdpa.xml         | 34 +++++++++++++++
>   tests/qemuxml2xmltest.c                       |  1 +
>   tools/virsh-domain.c                          |  1 +
>   26 files changed, 274 insertions(+), 10 deletions(-)
>   create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
>   create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
>   create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml


I would have had fewer excuses to procrastinate in looking at this if it
were broken up into smaller patches: at least one patch for the change
to the XML schema/parser/formatter, the xml2xml test case, and a bit of
docs in formatdomain, then another adding the qemu support for that bit
of config.


> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index 8365fc8bbb..1356485504 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst
> @@ -4632,6 +4632,26 @@ or stopping the guest.
>      </devices>
>      ...
>   
> +:anchor:`<a id="elementsNICSVDPA"/>`
> +
> +vDPA devices
> +^^^^^^^^^^^^
> +
> +A vDPA device can be used to provide wire speed network performance within a
> +domain. The host device must already be configured with the appropriate
> +device-specific vDPA driver. This creates a vDPA char device (e.g.
> +/dev/vhost-vdpa-0) that can be used to assign the device to a libvirt domain.


Maybe at least mention here that this only works with certain models of
SR-IOV NICs, and that each guest vDPA device uses up one SR-IOV VF on
the host. Otherwise we'll get people seeing the "wire speed network
performance" part, then trying to figure out how to set it up using
their ISA bus NE2000 NIC or something :-)


> +
> +::
> +
> +   ...
> +   <devices>
> +     <interface type='vdpa'>
> +       <source dev='/dev/vhost-vdpa-0'/>


(As I just learned from you on IRC, the above device is created by
unbinding a VF from its NIC driver on the host and re-binding it to a
special vDPA VF driver.)
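
For anyone following along, the manual rebind goes through the generic
sysfs bind/unbind files for PCI drivers. A minimal standalone sketch,
assuming those generic files are all that's needed: the function names
are made up for illustration, and the vDPA driver name is left as a
parameter because it depends on the hardware (I'm not naming a real one
here).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path used to detach a PCI device (e.g. a VF) from
 * whatever driver it is currently bound to.
 * Returns 0 on success, -1 on empty input or if buf is too small. */
int
sketchPciUnbindPath(char *buf, size_t buflen, const char *pciAddr)
{
    int n;

    assert(buf != NULL && pciAddr != NULL);
    if (strlen(pciAddr) == 0)
        return -1;
    n = snprintf(buf, buflen, "/sys/bus/pci/devices/%s/driver/unbind", pciAddr);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Build the sysfs path used to attach a device to a named driver.
 * "driver" is hardware-specific and purely a placeholder here. */
int
sketchPciBindPath(char *buf, size_t buflen, const char *driver)
{
    int n;

    assert(buf != NULL && driver != NULL);
    if (strlen(driver) == 0)
        return -1;
    n = snprintf(buf, buflen, "/sys/bus/pci/drivers/%s/bind", driver);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Write the PCI address into a bind or unbind file.
 * Returns 0 on success, -1 if the file can't be opened or written. */
int
sketchWriteAddr(const char *path, const char *pciAddr)
{
    FILE *fp;
    int ret = 0;

    assert(path != NULL && pciAddr != NULL);
    if (!(fp = fopen(path, "w")))
        return -1;
    if (fputs(pciAddr, fp) == EOF)
        ret = -1;
    if (fclose(fp) == EOF)
        ret = -1;
    return ret;
}
```

The sequence would be: write the VF's PCI address to the unbind path,
then write the same address to the bind path of the vDPA driver. Both
writes need root and can fail if the device is busy.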


As we were just discussing online, on one hand it could be nice if 
libvirt could automatically handle rebinding the VF to the vdpa host 
driver (given the PCI address of the VF), to make it easier to use 
(because no advance setup would be needed), similar to what's already 
done with hostdev devices (and <interface type='hostdev'>) when 
managed='yes' (which is the default setting).


On the other hand, it is exactly that managed='yes' functionality that
has created more "libvirt-but-not-really-libvirt" bug reports than any
other aspect of vfio device assignment. The process of unbinding and
rebinding drivers is timing-sensitive, and it causes code that's usually
run only once at host boot time to be run hundreds of times, making it
more likely to expose infrequently-hit bugs.


I just bring this up in advance of someone suggesting the addition of 
managed='yes' here to put in my vote for *not* doing it, and instead 
using that same effort to provide some sort of API in the node-device 
driver for easily creating one or more VDPA devices from VFs, which 
could be done once at host boot time, and thus avoid the level of 
"libvirt-not-libvirt" bug reports for VDPA. (and after that maybe even 
an API to allocate a device from that pool to be used by a guest). But 
that's for later.


> +     </interface>
> +   </devices>
> +   ...
> +
>   :anchor:`<a id="elementsTeaming"/>`
>   
>   Teaming a virtio/hostdev NIC pair
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 0d0dcbc5ce..17f74490f4 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -3108,6 +3108,21 @@
>               <ref name="interface-options"/>
>             </interleave>
>           </group>
> +
> +        <group>
> +          <attribute name="type">
> +            <value>vdpa</value>
> +          </attribute>
> +          <interleave>
> +            <element name="source">
> +              <attribute name="dev">
> +                <ref name="deviceName"/>
> +              </attribute>
> +            </element>
> +            <ref name="interface-options"/>
> +          </interleave>
> +        </group>
> +
>         </choice>
>         <optional>
>           <attribute name="trustGuestRxFilters">
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 8e7981bf25..74f2c2f3e3 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -549,6 +549,7 @@ VIR_ENUM_IMPL(virDomainNet,
>                 "direct",
>                 "hostdev",
>                 "udp",
> +              "vdpa",
>   );
>   
>   VIR_ENUM_IMPL(virDomainNetModel,
> @@ -2495,6 +2496,10 @@ virDomainNetDefClear(virDomainNetDefPtr def)
>           def->data.vhostuser = NULL;
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        VIR_FREE(def->data.vdpa.devicepath);
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_SERVER:
>       case VIR_DOMAIN_NET_TYPE_CLIENT:
>       case VIR_DOMAIN_NET_TYPE_MCAST:
> @@ -6489,6 +6494,15 @@ virDomainNetDefValidate(const virDomainNetDef *net)
>           return -1;
>       }
>   
> +    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
> +        net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) {
> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> +                           _("invalid model for interface of type '%s': '%s'"),
> +                           virDomainNetTypeToString(net->type),
> +                           virDomainNetModelTypeToString(net->model));
> +            return -1;
> +    }
> +


I see that in qemuDomainDeviceNetDefPostParse() you set this to virtio
if it isn't specified. It seems a bit odd to set the default in the
qemu-specific post-parse, but check that the default actually *is*
virtio in the generic domain validate. Since device models tend to be
hypervisor-specific, I'm thinking maybe we should set an unspecified
model to virtio where you currently have that
(qemuDomainDeviceNetDefPostParse()) but move the above validation check
from here over to qemuValidateDomainDeviceDefNetwork().
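
Roughly the split I have in mind, as a standalone sketch rather than
real libvirt code. The enums, the struct and both function names below
are stand-ins invented for illustration (the real types are
virDomainNetType, virDomainNetModelType and virDomainNetDef); the point
is only that both the defaulting and the check live on the qemu side.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; not the libvirt definitions. */
typedef enum {
    SKETCH_NET_TYPE_BRIDGE,
    SKETCH_NET_TYPE_VDPA,
} sketchNetType;

typedef enum {
    SKETCH_NET_MODEL_UNKNOWN,   /* no <model> given in the XML */
    SKETCH_NET_MODEL_E1000,
    SKETCH_NET_MODEL_VIRTIO,
} sketchNetModel;

typedef struct {
    sketchNetType type;
    sketchNetModel model;
} sketchNetDef;

/* qemu-specific post-parse step: an unspecified model on a vdpa
 * interface defaults to virtio; everything else is left alone. */
void
sketchQemuNetPostParse(sketchNetDef *net)
{
    assert(net != NULL);
    if (net->type == SKETCH_NET_TYPE_VDPA &&
        net->model == SKETCH_NET_MODEL_UNKNOWN)
        net->model = SKETCH_NET_MODEL_VIRTIO;
}

/* qemu-specific validate step: a vdpa interface must use the virtio
 * model. Returns 0 if the definition is acceptable, -1 otherwise. */
int
sketchQemuNetValidate(const sketchNetDef *net)
{
    assert(net != NULL);
    if (net->type == SKETCH_NET_TYPE_VDPA &&
        net->model != SKETCH_NET_MODEL_VIRTIO)
        return -1;
    return 0;
}
```

With that arrangement a vdpa interface with no model passes validation
after post-parse, an explicit non-virtio model is rejected, and the
generic virDomainNetDefValidate() stays free of model assumptions.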


(Wow. This whole thing of having 4 (and even more in the case of NetDef, 
since there is a difference between define-time and runtime validation) 
separate places to check settings makes it really complicated to decide 
on the correct place to put one tiny check. It's tough to even remember 
where they are and what they're called - I have to do a chain of cscope 
searches every single time I get into the subject!)


(P.S. I just noticed that vhost-user, which also uses the virtio-net
backend, just has a check directly at the end of
virDomainNetDefParseXML() that checks if model = virtio was specified,
and if not it logs an error and fails. Which points out *yet another*
place that inputs are validated. Sigh.)


>       return 0;
>   }
>   
> @@ -11982,6 +11996,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>       g_autofree char *vhost_path = NULL;
>       g_autofree char *teamingType = NULL;
>       g_autofree char *teamingPersistent = NULL;
> +    g_autofree char *vdpa_dev = NULL;
>       const char *prefix = xmlopt ? xmlopt->config.netPrefix : NULL;
>   
>       if (!(def = virDomainNetDefNew(xmlopt)))
> @@ -12075,6 +12090,10 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>                   if (virDomainChrSourceReconnectDefParseXML(&reconnect, cur, ctxt) < 0)
>                       goto error;
>   
> +            } else if (!vdpa_dev
> +                       && def->type == VIR_DOMAIN_NET_TYPE_VDPA
> +                       && virXMLNodeNameEqual(cur, "source")) {
> +                vdpa_dev = virXMLPropString(cur, "dev");


(it's always kind of bugged me that in so many places we just ignore 
multiple definitions of the same element in our parsing, rather than 
logging an error. But this pattern has so much precedent that I'm not 
going to say anything about it. Oops, already did. Forget I said that.)


>               } else if (!def->virtPortProfile
>                          && virXMLNodeNameEqual(cur, "virtualport")) {
>                   if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
> @@ -12332,6 +12351,16 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>           }
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        if (vdpa_dev == NULL) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("No <source> 'dev' attribute "
> +                             "specified with <interface type='vdpa'/>"));
> +            goto error;
> +        }
> +        def->data.vdpa.devicepath = g_steal_pointer(&vdpa_dev);
> +        break;
> +


Yeah, this is the place I was talking about before. It used to be that
this was the place to check for anything that *must* be there no matter
what the hypervisor. I still don't get exactly what the status of these
checks at the end of the parse functions is; do we want to deprecate
them? Or should we still add more stuff as long as it's okay to log an
error even when we're reading existing XML from disk? Should someone be
moving the entire switch statement containing this chunk into
virDomainNetDefValidate()?


>       case VIR_DOMAIN_NET_TYPE_BRIDGE:
>           if (bridge == NULL) {
>               virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> @@ -12727,6 +12756,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>           case VIR_DOMAIN_NET_TYPE_DIRECT:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               break;
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           default:
> @@ -26737,6 +26767,14 @@ virDomainNetDefFormat(virBufferPtr buf,
>               }
>               break;
>   
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
> +           if (def->data.vdpa.devicepath) {
> +               virBufferEscapeString(buf, "<source dev='%s'",
> +                                     def->data.vdpa.devicepath);
> +               sourceLines++;
> +           }
> +            break;
> +
>           case VIR_DOMAIN_NET_TYPE_USER:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               break;
> @@ -30902,6 +30940,7 @@ virDomainNetGetActualVirtPortProfile(const virDomainNetDef *iface)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>       default:
>           return NULL;
> @@ -31718,6 +31757,7 @@ virDomainNetTypeSharesHostView(const virDomainNetDef *net)
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> @@ -31982,6 +32022,7 @@ virDomainNetDefActualToNetworkPort(virDomainDefPtr dom,
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_USER:
>       case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                          _("Unexpected network port type %s"),
>                          virDomainNetTypeToString(virDomainNetGetActualType(iface)));
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 68be32614c..4f63a3eef4 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -872,6 +872,7 @@ typedef enum {
>       VIR_DOMAIN_NET_TYPE_DIRECT,
>       VIR_DOMAIN_NET_TYPE_HOSTDEV,
>       VIR_DOMAIN_NET_TYPE_UDP,
> +    VIR_DOMAIN_NET_TYPE_VDPA,
>   
>       VIR_DOMAIN_NET_TYPE_LAST
>   } virDomainNetType;
> @@ -1045,6 +1046,9 @@ struct _virDomainNetDef {
>                */
>               virDomainActualNetDefPtr actual;
>           } network;
> +        struct {
> +            char *devicepath;
> +        } vdpa;
>           struct {
>               char *brname;
>           } bridge;
> diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c
> index 396ac62019..4eb12e2951 100644
> --- a/src/conf/netdev_bandwidth_conf.c
> +++ b/src/conf/netdev_bandwidth_conf.c
> @@ -315,6 +315,7 @@ bool virNetDevSupportsBandwidth(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
> index 7c2c015015..709cdc8719 100644
> --- a/src/libxl/libxl_conf.c
> +++ b/src/libxl/libxl_conf.c
> @@ -1371,6 +1371,7 @@ libxlMakeNic(virDomainDefPtr def,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_DIRECT:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                       _("unsupported interface type %s"),
> diff --git a/src/libxl/xen_common.c b/src/libxl/xen_common.c
> index 75fe7e0644..b1ec34bf11 100644
> --- a/src/libxl/xen_common.c
> +++ b/src/libxl/xen_common.c
> @@ -1776,6 +1776,7 @@ xenFormatNet(virConnectPtr conn,
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_USER:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
>                          virDomainNetTypeToString(net->type));
>           return -1;
> diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
> index ae6b737b60..cb573d6c01 100644
> --- a/src/lxc/lxc_controller.c
> +++ b/src/lxc/lxc_controller.c
> @@ -422,6 +422,7 @@ static int virLXCControllerGetNICIndexes(virLXCControllerPtr ctrl)
>           case VIR_DOMAIN_NET_TYPE_UDP:
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                              _("Unsupported net type %s"),
>                              virDomainNetTypeToString(actualType));
> diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
> index 1cdd6ee455..a36f83a588 100644
> --- a/src/lxc/lxc_driver.c
> +++ b/src/lxc/lxc_driver.c
> @@ -3503,6 +3503,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>                          _("Network device type is not supported"));
>           goto cleanup;
> @@ -3557,6 +3558,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           default:
>               /* no-op */
> @@ -3998,6 +4000,7 @@ lxcDomainDetachDeviceNetLive(virDomainObjPtr vm,
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>                          _("Only bridged veth devices can be detached"));
>           goto cleanup;
> diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
> index fc59c2e5af..90e9790cea 100644
> --- a/src/lxc/lxc_process.c
> +++ b/src/lxc/lxc_process.c
> @@ -606,6 +606,7 @@ virLXCProcessSetupInterfaces(virLXCDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               virReportError(VIR_ERR_INTERNAL_ERROR,
>                              _("Unsupported network type %s"),
>                              virDomainNetTypeToString(type));
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 01812cd39b..9c5265ccdf 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3552,7 +3552,8 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
>                       size_t tapfdSize,
>                       char **vhostfd,
>                       size_t vhostfdSize,
> -                    const char *slirpfd)
> +                    const char *slirpfd,
> +                    const char *vdpafd)
>   {
>       bool is_tap = false;
>       virDomainNetType netType = virDomainNetGetActualType(net);
> @@ -3690,6 +3691,13 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
>               return NULL;
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        /* Caller will pass the fd to qemu with add-fd */
> +        if (virJSONValueObjectCreate(&netprops, "s:type", "vhost-vdpa", NULL) < 0 ||
> +            virJSONValueObjectAppendString(netprops, "vhostdev", vdpafd) < 0)
> +            return NULL;
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           /* Should have been handled earlier via PCI/USB hotplug code. */
>       case VIR_DOMAIN_NET_TYPE_LAST:
> @@ -8013,6 +8021,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>       char **tapfdName = NULL;
>       char **vhostfdName = NULL;
>       g_autofree char *slirpfdName = NULL;
> +    g_autofree char *vdpafdName = NULL;
> +    int vdpafd = -1;
>       virDomainNetType actualType = virDomainNetGetActualType(net);
>       const virNetDevBandwidth *actualBandwidth;
>       bool requireNicdev = false;
> @@ -8098,6 +8108,11 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>   
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0)
> +            goto cleanup;
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_USER:
>       case VIR_DOMAIN_NET_TYPE_SERVER:
>       case VIR_DOMAIN_NET_TYPE_CLIENT:
> @@ -8140,6 +8155,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>          /* These types don't use a network device on the host, but
>           * instead use some other type of connection to the emulated
> @@ -8219,13 +8235,22 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>           vhostfd[i] = -1;
>       }
>   
> +    if (vdpafd > 0) {
> +        virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
> +        g_autofree char *fdset = qemuVirCommandGetFDSet(cmd, vdpafd);
> +        if (!fdset)
> +            goto cleanup;
> +        virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
> +        vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd);
> +    }
> +
>       if (chardev)
>           virCommandAddArgList(cmd, "-chardev", chardev, NULL);
>   
>       if (!(hostnetprops = qemuBuildHostNetStr(net,
>                                                tapfdName, tapfdSize,
>                                                vhostfdName, vhostfdSize,
> -                                             slirpfdName)))
> +                                             slirpfdName, vdpafdName)))
>           goto cleanup;
>   
>       if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops,
> diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
> index 89d99b111f..e8b4f4785a 100644
> --- a/src/qemu/qemu_command.h
> +++ b/src/qemu/qemu_command.h
> @@ -99,7 +99,8 @@ virJSONValuePtr qemuBuildHostNetStr(virDomainNetDefPtr net,
>                                       size_t tapfdSize,
>                                       char **vhostfd,
>                                       size_t vhostfdSize,
> -                                    const char *slirpfd);
> +                                    const char *slirpfd,
> +                                    const char *vdpafd);
>   
>   /* Current, best practice */
>   char *qemuBuildNicDevStr(virDomainDefPtr def,
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index c440c79e1d..daae5a1b03 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -5027,7 +5027,10 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net,
>                                   const virDomainDef *def,
>                                   virQEMUCapsPtr qemuCaps)
>   {
> -    if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
> +    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
> +        !virDomainNetGetModelString(net))
> +        net->model = VIR_DOMAIN_NET_MODEL_VIRTIO;
> +    else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
>           !virDomainNetGetModelString(net) &&
>           virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV)
>           net->model = qemuDomainDefaultNetModel(def, qemuCaps);
> @@ -9201,6 +9204,7 @@ qemuDomainNetSupportsMTU(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_DIRECT:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> index 2c6c30ce03..23ae2310a2 100644
> --- a/src/qemu/qemu_hotplug.c
> +++ b/src/qemu/qemu_hotplug.c
> @@ -1340,6 +1340,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("hotplug of interface type of %s is not implemented yet"),
> @@ -1388,7 +1389,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
>       if (!(netprops = qemuBuildHostNetStr(net,
>                                            tapfdName, tapfdSize,
>                                            vhostfdName, vhostfdSize,
> -                                         slirpfdName)))
> +                                         slirpfdName, NULL)))
>           goto cleanup;
>   
>       qemuDomainObjEnterMonitor(driver, vm);
> @@ -3390,6 +3391,7 @@ qemuDomainChangeNetFilter(virDomainObjPtr vm,
>       case VIR_DOMAIN_NET_TYPE_DIRECT:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                          _("filters not supported on interfaces of type %s"),
>                          virDomainNetTypeToString(virDomainNetGetActualType(newdev)));
> @@ -3483,8 +3485,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>       olddev = *devslot;
>   
>       oldType = virDomainNetGetActualType(olddev);
> -    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> -        /* no changes are possible to a type='hostdev' interface */
> +    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
> +        oldType == VIR_DOMAIN_NET_TYPE_VDPA) {
> +        /* no changes are possible to a type='hostdev' or type='vdpa' interface */
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("cannot change config of '%s' network type"),
>                          virDomainNetTypeToString(oldType));
> @@ -3671,8 +3674,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>   
>       newType = virDomainNetGetActualType(newdev);
>   
> -    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> -        /* can't turn it into a type='hostdev' interface */
> +    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
> +        newType == VIR_DOMAIN_NET_TYPE_VDPA) {
> +        /* can't turn it into a type='hostdev' or type='vdpa' interface */
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("cannot change network interface type to '%s'"),
>                          virDomainNetTypeToString(newType));
> @@ -3726,6 +3730,7 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>               break;
>   
>           case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>               virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                              _("unable to change config on '%s' network type"),
> diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c
> index ffec992596..676648ebab 100644
> --- a/src/qemu/qemu_interface.c
> +++ b/src/qemu/qemu_interface.c
> @@ -118,6 +118,7 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           /* these types all require no action */
>           break;
> @@ -203,6 +204,7 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           /* these types all require no action */
>           break;
> @@ -630,6 +632,29 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def,
>   }
>   
>   
> +/* qemuInterfaceVDPAConnect:
> + * @net: pointer to the VM's interface description
> + *
> + * returns: file descriptor of the vdpa device
> + *
> + * Called *only* called if actualType is VIR_DOMAIN_NET_TYPE_VDPA
> + */
> +int
> +qemuInterfaceVDPAConnect(virDomainNetDefPtr net)
> +{
> +    int fd;
> +
> +    if ((fd = open(net->data.vdpa.devicepath, O_RDWR)) < 0) {
> +        virReportSystemError(errno,
> +                             _("Unable to open '%s' for vdpa device"),
> +                             net->data.vdpa.devicepath);
> +        return -1;
> +    }
> +
> +    return fd;
> +}
> +
> +
>   qemuSlirpPtr
>   qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
>                             virDomainNetDefPtr net)
> diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h
> index 3dcefc6a12..1ba24f0a6f 100644
> --- a/src/qemu/qemu_interface.h
> +++ b/src/qemu/qemu_interface.h
> @@ -58,3 +58,5 @@ int qemuInterfaceOpenVhostNet(virDomainDefPtr def,
>   
>   qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
>                                          virDomainNetDefPtr net);
> +
> +int qemuInterfaceVDPAConnect(virDomainNetDefPtr net) G_GNUC_NO_INLINE;
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 126fabf5ef..70c3b9b46d 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -7517,6 +7517,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               /* No special cleanup procedure for these types. */
>               break;
> diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
> index 488f258d00..623f998463 100644
> --- a/src/qemu/qemu_validate.c
> +++ b/src/qemu/qemu_validate.c
> @@ -1130,6 +1130,7 @@ qemuValidateNetSupportsCoalesce(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c
> index a123a8807c..f6f6efb322 100644
> --- a/src/vmx/vmx.c
> +++ b/src/vmx/vmx.c
> @@ -3833,6 +3833,7 @@ virVMXFormatEthernet(virDomainNetDefPtr def, int controller,
>         case VIR_DOMAIN_NET_TYPE_DIRECT:
>         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>         case VIR_DOMAIN_NET_TYPE_UDP:
> +      case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
>                          virDomainNetTypeToString(def->type));
>           return -1;
> diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> new file mode 100644
> index 0000000000..8e76ac7794
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> @@ -0,0 +1,37 @@
> +LC_ALL=C \
> +PATH=/bin \
> +HOME=/tmp/lib/domain--1-QEMUGuest1 \
> +USER=test \
> +LOGNAME=test \
> +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
> +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
> +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
> +QEMU_AUDIO_DRV=none \
> +/usr/bin/qemu-system-i386 \
> +-name guest=QEMUGuest1,debug-threads=on \
> +-S \
> +-object secret,id=masterKey0,format=raw,\
> +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
> +-machine pc,accel=tcg,usb=off,dump-guest-core=off \
> +-cpu qemu64 \
> +-m 214 \
> +-overcommit mem-lock=off \
> +-smp 1,sockets=1,cores=1,threads=1 \
> +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
> +-display none \
> +-no-user-config \
> +-nodefaults \
> +-chardev socket,id=charmonitor,fd=1729,server,nowait \
> +-mon chardev=charmonitor,id=monitor,mode=control \
> +-rtc base=utc \
> +-no-shutdown \
> +-no-acpi \
> +-boot strict=on \
> +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
> +-add-fd set=0,fd=1732 \
> +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \


Okay, I'm feeling too lazy to parse through the code above and see how 
you arrived at "vhostdev=/dev/fdset/0", but that doesn't look right. 
Shouldn't you be ending up with "-netdev vhost-vdpa,fd=NN,..."? The 
document I have shows that syntax is supported, so there shouldn't be 
any need to do the add-fd stuff in this case.
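
To make the two alternatives concrete, here is a minimal sketch that only formats the argument strings being compared. The function names are hypothetical and not part of libvirt. The first form is the one in the expected .args output quoted above; the second is the form proposed in this review, and whether qemu accepts "fd=" for vhost-vdpa is an assumption that has not been checked here.

```c
#include <stdio.h>

/* Hypothetical helper: the form emitted by the patch. The fd is added
 * to an fd set and the netdev refers to it via /dev/fdset/<set>. */
static int
format_fdset_args(char *buf, size_t len, int fd, int set)
{
    return snprintf(buf, len,
                    "-add-fd set=%d,fd=%d "
                    "-netdev vhost-vdpa,vhostdev=/dev/fdset/%d,id=hostnet0",
                    set, fd, set);
}

/* Hypothetical helper: the form suggested in the review. ASSUMPTION:
 * the netdev takes the fd directly, so no -add-fd is needed. */
static int
format_direct_fd_args(char *buf, size_t len, int fd)
{
    return snprintf(buf, len,
                    "-netdev vhost-vdpa,fd=%d,id=hostnet0", fd);
}
```

Both helpers return what snprintf() returns, so a caller can detect truncation by comparing the result against the buffer size.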


I think the next step should be to find some hardware and give this a 
smoke test! :-)


> +-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:95:db:c0,bus=pci.0,\
> +addr=0x2 \
> +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\
> +resourcecontrol=deny \
> +-msg timestamp=on
> diff --git a/tests/qemuxml2argvdata/net-vdpa.xml b/tests/qemuxml2argvdata/net-vdpa.xml
> new file mode 100644
> index 0000000000..30cca7eb6e
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/net-vdpa.xml
> @@ -0,0 +1,28 @@
> +<domain type='qemu'>
> +  <name>QEMUGuest1</name>
> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
> +  <memory unit='KiB'>219136</memory>
> +  <currentMemory unit='KiB'>219136</currentMemory>
> +  <vcpu placement='static'>1</vcpu>
> +  <os>
> +    <type arch='i686' machine='pc'>hvm</type>
> +    <boot dev='hd'/>
> +  </os>
> +  <clock offset='utc'/>
> +  <on_poweroff>destroy</on_poweroff>
> +  <on_reboot>restart</on_reboot>
> +  <on_crash>destroy</on_crash>
> +  <devices>
> +    <emulator>/usr/bin/qemu-system-i386</emulator>
> +    <controller type='usb' index='0'/>
> +    <controller type='ide' index='0'/>
> +    <controller type='pci' index='0' model='pci-root'/>
> +    <interface type='vdpa'>
> +      <mac address='52:54:00:95:db:c0'/>
> +      <source dev='/dev/vhost-vdpa-0'/>
> +    </interface>
> +    <input type='mouse' bus='ps2'/>
> +    <input type='keyboard' bus='ps2'/>
> +    <memballoon model='none'/>
> +  </devices>
> +</domain>
> diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
> index e5841bc8e3..516776697f 100644
> --- a/tests/qemuxml2argvmock.c
> +++ b/tests/qemuxml2argvmock.c
> @@ -205,7 +205,7 @@ virHostGetDRMRenderNode(void)
>   
>   static void (*real_virCommandPassFD)(virCommandPtr cmd, int fd, unsigned int flags);
>   
> -static const int testCommandPassSafeFDs[] = { 1730, 1731 };
> +static const int testCommandPassSafeFDs[] = { 1730, 1731, 1732 };
>   
>   void
>   virCommandPassFD(virCommandPtr cmd,
> @@ -283,3 +283,12 @@ qemuBuildTPMOpenBackendFDs(const char *tpmdev G_GNUC_UNUSED,
>       *cancelfd = 1731;
>       return 0;
>   }
> +
> +
> +int
> +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED)
> +{
> +    if (fcntl(1732, F_GETFD) != -1)
> +        abort();
> +    return 1732;
> +}
> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
> index 01839cb88c..9587e1f2f2 100644
> --- a/tests/qemuxml2argvtest.c
> +++ b/tests/qemuxml2argvtest.c
> @@ -1446,6 +1446,7 @@ mymain(void)
>               QEMU_CAPS_DEVICE_VFIO_PCI);
>       DO_TEST_FAILURE("net-hostdev-fail",
>                       QEMU_CAPS_DEVICE_VFIO_PCI);
> +    DO_TEST_CAPS_LATEST("net-vdpa");
>   
>       DO_TEST("hostdev-pci-multifunction",
>               QEMU_CAPS_KVM,
> diff --git a/tests/qemuxml2xmloutdata/net-vdpa.xml b/tests/qemuxml2xmloutdata/net-vdpa.xml
> new file mode 100644
> index 0000000000..b362405c14
> --- /dev/null
> +++ b/tests/qemuxml2xmloutdata/net-vdpa.xml
> @@ -0,0 +1,34 @@
> +<domain type='qemu'>
> +  <name>QEMUGuest1</name>
> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
> +  <memory unit='KiB'>219136</memory>
> +  <currentMemory unit='KiB'>219136</currentMemory>
> +  <vcpu placement='static'>1</vcpu>
> +  <os>
> +    <type arch='i686' machine='pc'>hvm</type>
> +    <boot dev='hd'/>
> +  </os>
> +  <clock offset='utc'/>
> +  <on_poweroff>destroy</on_poweroff>
> +  <on_reboot>restart</on_reboot>
> +  <on_crash>destroy</on_crash>
> +  <devices>
> +    <emulator>/usr/bin/qemu-system-i386</emulator>
> +    <controller type='usb' index='0'>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
> +    </controller>
> +    <controller type='ide' index='0'>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
> +    </controller>
> +    <controller type='pci' index='0' model='pci-root'/>
> +    <interface type='vdpa'>
> +      <mac address='52:54:00:95:db:c0'/>
> +      <source dev='/dev/vhost-vdpa-0'/>
> +      <model type='virtio'/>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> +    </interface>
> +    <input type='mouse' bus='ps2'/>
> +    <input type='keyboard' bus='ps2'/>
> +    <memballoon model='none'/>
> +  </devices>
> +</domain>
> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
> index a07e2b7553..978babb110 100644
> --- a/tests/qemuxml2xmltest.c
> +++ b/tests/qemuxml2xmltest.c
> @@ -494,6 +494,7 @@ mymain(void)
>       DO_TEST("net-mtu", NONE);
>       DO_TEST("net-coalesce", NONE);
>       DO_TEST("net-many-models", NONE);
> +    DO_TEST("net-vdpa", NONE);
>   
>       DO_TEST("serial-tcp-tlsx509-chardev", NONE);
>       DO_TEST("serial-tcp-tlsx509-chardev-notls", NONE);
> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
> index 286cf79671..10b396bcf0 100644
> --- a/tools/virsh-domain.c
> +++ b/tools/virsh-domain.c
> @@ -1007,6 +1007,7 @@ cmdAttachInterface(vshControl *ctl, const vshCmd *cmd)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           vshError(ctl, _("No support for %s in command 'attach-interface'"),
>                    type);




