[libvirt] [PATCH 05/15] Forward Mode Hostdev network driver Implementation
Laine Stump
laine at laine.org
Tue Aug 14 06:08:27 UTC 2012
On 08/10/2012 12:24 PM, Shradha Shah wrote:
> This patch updates the network driver to properly utilize the new
> attributes/elements that are now in virNetworkDef
Some minor nits, nothing major, but still needs one more iteration.
I'll try to look at the hostdev-hybrid patches in the morning...
>
> Signed-off-by: Shradha Shah <sshah at solarflare.com>
> ---
> docs/formatnetwork.html.in | 62 +++++++++++
> src/network/bridge_driver.c | 237 ++++++++++++++++++++++++++++++++++--------
> 2 files changed, 254 insertions(+), 45 deletions(-)
>
> diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in
> index 7e8e991..96b9eb2 100644
> --- a/docs/formatnetwork.html.in
> +++ b/docs/formatnetwork.html.in
> @@ -210,6 +210,37 @@
> (usually either a domain start, or a hotplug interface
> attach to a domain).<span class="since">Since 0.9.4</span>
> </dd>
> + <dt><code>hostdev</code></dt>
> + <dd>
> + This network facilitates PCI Passthrough of a network device.
> + A network device is chosen from the interface pool and
> + directly assigned to the guest using generic device
> + passthrough, after first optionally setting the device's MAC
> + address to the configured value, and associating the device with
You should probably say "optionally associating" here too, to make sure
nobody misunderstands and thinks the 802.1Qbh part is mandatory.
> + an 802.1Qbh capable switch using an optionally specified
> + <code><virtualport></code> element.
> + Note that - due to limitations in standard single-port PCI
> + ethernet card driver design - only SR-IOV (Single Root I/O
> + Virtualization) virtual function (VF) devices can be assigned
> + in this manner; to assign a standard single-port PCI or PCIe
> + ethernet card to a guest, use the traditional <code><
> + hostdev></code> device definition and <span class="since">
s/definition and/definition./
> + Since 0.9.12</span>
All of the 0.9.12's need to be changed to 0.10.0.
> +
> + <p>Note that this "intelligent passthrough" of network devices is
> + very similar to the functionality of a standard <code><
> + hostdev></code> device, the difference being that this
> + method allows specifying a MAC address and <code><virtualport
> + ></code> for the passed-through device. If these capabilities
> + are not required, if you have a standard single-port PCI, PCIe,
> + or USB network card that doesn't support SR-IOV (and hence would
> + anyway lose the configured MAC address during reset after being
> + assigned to the guest domain), or if you are using a version of
> + libvirt older than 0.9.12, you should use standard
s/use/use a/
> + <code><hostdev></code> to assign the device to the
s/to assign/definition to assign/
> + guest instead of <code><forward mode='hostdev'/></code>.
> + </p>
> + </dd>
> </dl>
> As mentioned above, a <code><forward></code> element can
> have multiple <code><interface></code> subelements, each
> @@ -249,6 +280,37 @@
> particular, 'passthrough' mode, and 'private' mode when using
> 802.1Qbh), libvirt will choose an unused physical interface
> or, if it can't find an unused interface, fail the operation.</p>
> +
> + <span class="since">since 0.9.12</span> and when using forward mode
> + 'hostdev' we specify the interface pool by using the
> + <code><address></code> element and <code><
> + type></code> <code><domain></code> <code><bus></code>
> + <code><slot></code> and <code><function></code>
> + sub-elements.
> +
> + <pre>
> +...
> + <forward mode='hostdev' managed='yes'>
> + <address type='pci' domain='0' bus='4' slot='0' function='1'/>
> + <address type='pci' domain='0' bus='4' slot='0' function='2'/>
> + <address type='pci' domain='0' bus='4' slot='0' function='3'/>
> + </forward>
> +...
> + </pre>
> +
> + Alternatively the interface pool can also be mentioned using a
s/mentioned/defined/
> + single physical function <code><pf></code> subelement to
> + call out the corresponding physical interface associated with
> + multiple virtual interfaces (similar to the passthrough mode):
s/the passthrough/passthrough/
> +
> + <pre>
> +...
> + <forward mode='hostdev' managed='yes'>
> + <pf dev='eth0'/>
> + </forward>
> +...
> + </pre>
> +
> </dd>
> </dl>
> <h5><a name="elementQoS">Quality of service</a></h5>
> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
> index 602e17d..33bc09e 100644
> --- a/src/network/bridge_driver.c
> +++ b/src/network/bridge_driver.c
> @@ -1934,7 +1934,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED,
> virNetworkObjPtr network ATTRIBUTE_UNUSED)
> {
> /* put anything here that needs to be done each time a network of
> - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On
> + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On
> * failure, undo anything you've done, and return -1. On success
> * return 0.
> */
> @@ -1945,7 +1945,7 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT
> virNetworkObjPtr network ATTRIBUTE_UNUSED)
> {
> /* put anything here that needs to be done each time a network of
> - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is shutdown. On
> + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is shutdown. On
> * failure, undo anything you've done, and return -1. On success
> * return 0.
> */
> @@ -1976,6 +1976,7 @@ networkStartNetwork(struct network_driver *driver,
> case VIR_NETWORK_FORWARD_PRIVATE:
> case VIR_NETWORK_FORWARD_VEPA:
> case VIR_NETWORK_FORWARD_PASSTHROUGH:
> + case VIR_NETWORK_FORWARD_HOSTDEV:
> ret = networkStartNetworkExternal(driver, network);
> break;
> }
> @@ -2035,6 +2036,7 @@ static int networkShutdownNetwork(struct network_driver *driver,
> case VIR_NETWORK_FORWARD_PRIVATE:
> case VIR_NETWORK_FORWARD_VEPA:
> case VIR_NETWORK_FORWARD_PASSTHROUGH:
> + case VIR_NETWORK_FORWARD_HOSTDEV:
> ret = networkShutdownNetworkExternal(driver, network);
> break;
> }
> @@ -2778,7 +2780,7 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) {
> goto finish;
> }
> }
> - else if (netdef->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) {
> + if (netdef->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) {
> netdef->forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */
> netdef->forwardIfs[ii].device.pci.domain = virt_fns[ii]->domain;
> netdef->forwardIfs[ii].device.pci.bus = virt_fns[ii]->bus;
> @@ -2817,6 +2819,8 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
> virNetworkObjPtr network;
> virNetworkDefPtr netdef;
> virPortGroupDefPtr portgroup;
> + virNetDevVPortProfilePtr virtport = NULL;
> + virNetworkForwardIfDefPtr dev = NULL;
> int ii;
> int ret = -1;
>
> @@ -2869,6 +2873,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
> */
> if (iface->data.network.actual)
> iface->data.network.actual->type = VIR_DOMAIN_NET_TYPE_NETWORK;
> +
Extraneous whitespace change ^^^
> } else if ((netdef->forwardType == VIR_NETWORK_FORWARD_BRIDGE) &&
> netdef->bridge) {
>
> @@ -2889,11 +2894,74 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
> goto cleanup;
> }
>
> + } else if (netdef->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) {
> + if (!iface->data.network.actual
> + && (VIR_ALLOC(iface->data.network.actual) < 0)) {
> + virReportOOMError();
> + goto cleanup;
> + }
> +
> + iface->data.network.actual->type = VIR_DOMAIN_NET_TYPE_HOSTDEV;
> + if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
> + if(networkCreateInterfacePool(netdef) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Could not create Interface Pool from PF"));
networkCreateInterfacePool() logs its own error. Don't log another one.
> + goto cleanup;
> + }
> + }
> + /* pick first dev with 0 usageCount */
> +
> + for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> + if (netdef->forwardIfs[ii].usageCount == 0) {
> + dev = &netdef->forwardIfs[ii];
> + break;
> + }
> + }
> + if (!dev) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("network '%s' requires exclusive access to interfaces, but none are available"),
> + netdef->name);
> + goto cleanup;
> + }
> + iface->data.network.actual->data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NET;
> + iface->data.network.actual->data.hostdev.def.parent.data.net = iface;
> + iface->data.network.actual->data.hostdev.def.info = &iface->info;
> + iface->data.network.actual->data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS;
> + iface->data.network.actual->data.hostdev.def.managed = netdef->managed;
> + iface->data.network.actual->data.hostdev.def.source.subsys.type = dev->type;
> + iface->data.network.actual->data.hostdev.def.source.subsys.u.pci = dev->device.pci;
> +
> + if (iface->data.network.virtPortProfile) {
> + virtport = iface->data.network.virtPortProfile;
> + } else {
> + if (portgroup)
> + virtport = portgroup->virtPortProfile;
> + else
> + virtport = netdef->virtPortProfile;
> + }
> + if (virtport) {
> + if (VIR_ALLOC(iface->data.network.actual->data.hostdev.virtPortProfile) < 0) {
> + virReportOOMError();
> + goto cleanup;
> + }
> + /* There are no pointers in a virtualPortProfile, so a shallow copy
> + * is sufficient
> + */
> + *iface->data.network.actual->data.direct.virtPortProfile = *virtport;
> + }
All the above code will need to be removed when my virtualport merging
patch is pushed - it's all handled in common code now.
> +
> + dev->usageCount++;
> + VIR_DEBUG("Using physical device with domain=%d bus=%d slot=%d function=%d, usageCount %d",
> + dev->device.pci.domain,
> + dev->device.pci.bus,
> + dev->device.pci.slot,
> + dev->device.pci.function,
> + dev->usageCount);
You could combine some of those args to save lines...
> +
> } else if ((netdef->forwardType == VIR_NETWORK_FORWARD_BRIDGE) ||
> (netdef->forwardType == VIR_NETWORK_FORWARD_PRIVATE) ||
> (netdef->forwardType == VIR_NETWORK_FORWARD_VEPA) ||
> (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH)) {
> - virNetDevVPortProfilePtr virtport = NULL;
>
> /* <forward type='bridge|private|vepa|passthrough'> are all
> * VIR_DOMAIN_NET_TYPE_DIRECT.
> @@ -2952,7 +3020,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
> netdef->name);
> goto cleanup;
> } else {
> - virNetworkForwardIfDefPtr dev = NULL;
>
> /* pick an interface from the pool */
>
> @@ -3045,14 +3112,16 @@ networkNotifyActualDevice(virDomainNetDefPtr iface)
> struct network_driver *driver = driverState;
> virNetworkObjPtr network;
> virNetworkDefPtr netdef;
> - const char *actualDev;
> + const char *actualDev = NULL;
> + virDomainHostdevDefPtr def = NULL;
> int ret = -1;
>
> if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK)
> return 0;
>
> if (!iface->data.network.actual ||
> - (virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_DIRECT)) {
> + ((virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_DIRECT) &&
> + (virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_HOSTDEV))) {
> VIR_DEBUG("Nothing to claim from network %s", iface->data.network.name);
> return 0;
> }
> @@ -3067,19 +3136,28 @@ networkNotifyActualDevice(virDomainNetDefPtr iface)
> goto cleanup;
> }
>
> - actualDev = virDomainNetGetActualDirectDev(iface);
> - if (!actualDev) {
> - virReportError(VIR_ERR_INTERNAL_ERROR,
> - "%s", _("the interface uses a direct "
> - "mode, but has no source dev"));
> - goto cleanup;
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + actualDev = virDomainNetGetActualDirectDev(iface);
> + if (!actualDev) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + "%s", _("the interface uses a direct mode, but has no source dev"));
> + goto cleanup;
> + }
> + }
... else ...
> +
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + def = virDomainNetGetActualHostdev(iface);
> + if (!def) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + "%s", _("the interface uses a hostdev mode, but has no hostdev"));
> + goto cleanup;
> + }
> }
>
> netdef = network->def;
> - if (netdef->nForwardIfs == 0) {
> + if ((netdef->nForwardIfs == 0) && (netdef->nForwardPfs == 0)) {
> virReportError(VIR_ERR_INTERNAL_ERROR,
> - _("network '%s' uses a direct mode, but "
> - "has no forward dev and no interface pool"),
> + _("network '%s' uses a direct/hostdev mode, but has no forward dev and no interface pool"),
> netdef->name);
> goto cleanup;
> } else {
> @@ -3087,27 +3165,49 @@ networkNotifyActualDevice(virDomainNetDefPtr iface)
> virNetworkForwardIfDefPtr dev = NULL;
>
> /* find the matching interface in the pool and increment its usageCount */
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> + if (STREQ(actualDev, netdef->forwardIfs[ii].device.dev)) {
> + dev = &netdef->forwardIfs[ii];
> + break;
> + }
> + }
> + }
>
> - for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> - if (STREQ(actualDev, netdef->forwardIfs[ii].device.dev)) {
> - dev = &netdef->forwardIfs[ii];
> - break;
again "... else ..."
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
> + if(networkCreateInterfacePool(netdef) < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Could not create Interface Pool from PF"));
> + goto cleanup;
> + }
> + }
> + for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> + if((def->source.subsys.u.pci.domain == netdef->forwardIfs[ii].device.pci.domain) &&
> + (def->source.subsys.u.pci.bus == netdef->forwardIfs[ii].device.pci.bus) &&
> + (def->source.subsys.u.pci.slot == netdef->forwardIfs[ii].device.pci.slot) &&
> + (def->source.subsys.u.pci.function == netdef->forwardIfs[ii].device.pci.function)) {
Isn't there a virDevicePCIAddressEqual() function?
> + dev = &netdef->forwardIfs[ii];
> + break;
> + }
> }
> }
> +
> /* dev points at the physical device we want to use */
> if (!dev) {
> virReportError(VIR_ERR_INTERNAL_ERROR,
> - _("network '%s' doesn't have dev='%s' in use by domain"),
> - netdef->name, actualDev);
> + _("network '%s' doesn't have dev in use by domain"),
> + netdef->name);
It would probably be very useful to print the netdev name if we were
looking for a netdev, and print out the PCI device info if we were
looking for a PCI device.
> goto cleanup;
> }
>
> - /* PASSTHROUGH mode, and PRIVATE Mode + 802.1Qbh both require
> + /* PASSTHROUGH mode, HOSTDEV mode and PRIVATE Mode + 802.1Qbh both require
> * exclusive access to a device, so current usageCount must be
> * 0 in those cases.
> */
> if ((dev->usageCount > 0) &&
> ((netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) ||
> + (netdef->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) ||
> ((netdef->forwardType == VIR_NETWORK_FORWARD_PRIVATE) &&
> iface->data.network.actual->data.direct.virtPortProfile &&
> (iface->data.network.actual->data.direct.virtPortProfile->virtPortType
> @@ -3119,8 +3219,18 @@ networkNotifyActualDevice(virDomainNetDefPtr iface)
> }
> /* we are now assured of success, so mark the allocation */
> dev->usageCount++;
> - VIR_DEBUG("Using physical device %s, usageCount %d",
> - dev->device.dev, dev->usageCount);
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + VIR_DEBUG("Using physical device %s, usageCount %d",
> + dev->device.dev, dev->usageCount);
> + }
... else ...
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + VIR_DEBUG("Using physical device with domain=%d bus=%d slot=%d function=%d, usageCount %d",
> + dev->device.pci.domain,
> + dev->device.pci.bus,
> + dev->device.pci.slot,
> + dev->device.pci.function,
> + dev->usageCount);
> + }
> }
>
> ret = 0;
> @@ -3147,14 +3257,16 @@ networkReleaseActualDevice(virDomainNetDefPtr iface)
> struct network_driver *driver = driverState;
> virNetworkObjPtr network = NULL;
> virNetworkDefPtr netdef;
> - const char *actualDev;
> + const char *actualDev = NULL;
> + virDomainHostdevDefPtr def = NULL;
> int ret = -1;
>
> if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK)
> return 0;
>
> if (!iface->data.network.actual ||
> - (virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_DIRECT)) {
You know, as often as you're calling virDomainNetGetActualType(iface),
maybe you should define a variable
"enum virDomainNetType actualType = virDomainNetGetActualType(iface)" at
the top of the function.
> + ((virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_DIRECT) &&
> + (virDomainNetGetActualType(iface) != VIR_DOMAIN_NET_TYPE_HOSTDEV))) {
> VIR_DEBUG("Nothing to release to network %s", iface->data.network.name);
> ret = 0;
> goto cleanup;
> @@ -3170,44 +3282,79 @@ networkReleaseActualDevice(virDomainNetDefPtr iface)
> goto cleanup;
> }
>
> - actualDev = virDomainNetGetActualDirectDev(iface);
> - if (!actualDev) {
> - virReportError(VIR_ERR_INTERNAL_ERROR,
> - "%s", _("the interface uses a direct "
> - "mode, but has no source dev"));
> - goto cleanup;
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + actualDev = virDomainNetGetActualDirectDev(iface);
> + if (!actualDev) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + "%s", _("the interface uses a direct mode, but has no source dev"));
> + goto cleanup;
> + }
> + }
> +
... else ...
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + def = virDomainNetGetActualHostdev(iface);
> + if (!def) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + "%s", _("the interface uses a hostdev mode, but has no hostdev"));
> + goto cleanup;
> + }
> }
>
> netdef = network->def;
> - if (netdef->nForwardIfs == 0) {
> + if ((netdef->nForwardIfs == 0) && (netdef->nForwardPfs == 0)) {
> virReportError(VIR_ERR_INTERNAL_ERROR,
> - _("network '%s' uses a direct mode, but "
> + _("network '%s' uses a direct/hostdev mode, but "
> "has no forward dev and no interface pool"),
> netdef->name);
> goto cleanup;
> } else {
> int ii;
> virNetworkForwardIfDefPtr dev = NULL;
> -
> - for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> - if (STREQ(actualDev, netdef->forwardIfs[ii].device.dev)) {
> - dev = &netdef->forwardIfs[ii];
> - break;
> +
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> + if (STREQ(actualDev, netdef->forwardIfs[ii].device.dev)) {
> + dev = &netdef->forwardIfs[ii];
> + break;
> + }
> + }
> + }
> +
... else ...
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + for (ii = 0; ii < netdef->nForwardIfs; ii++) {
> + if((def->source.subsys.u.pci.domain == netdef->forwardIfs[ii].device.pci.domain) &&
> + (def->source.subsys.u.pci.bus == netdef->forwardIfs[ii].device.pci.bus) &&
> + (def->source.subsys.u.pci.slot == netdef->forwardIfs[ii].device.pci.slot) &&
> + (def->source.subsys.u.pci.function == netdef->forwardIfs[ii].device.pci.function)) {
Again, some kind of virDevicePCIAddressEqual() function would be nice.
> + dev = &netdef->forwardIfs[ii];
> + break;
> + }
> }
> }
> +
> /* dev points at the physical device we've been using */
> if (!dev) {
> virReportError(VIR_ERR_INTERNAL_ERROR,
> - _("network '%s' doesn't have dev='%s' in use by domain"),
> - netdef->name, actualDev);
> + _("network '%s' doesn't have dev in use by domain"),
> + netdef->name);
> goto cleanup;
> }
>
> dev->usageCount--;
> - VIR_DEBUG("Releasing physical device %s, usageCount %d",
> - dev->device.dev, dev->usageCount);
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_DIRECT) {
> + VIR_DEBUG("Releasing physical device %s, usageCount %d",
> + dev->device.dev, dev->usageCount);
> + }
> + if (virDomainNetGetActualType(iface) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + VIR_DEBUG("Releasing physical device with domain=%d bus=%d slot=%d function=%d, usageCount %d",
> + dev->device.pci.domain,
> + dev->device.pci.bus,
> + dev->device.pci.slot,
> + dev->device.pci.function,
> + dev->usageCount);
> + }
> }
> -
> +
> ret = 0;
> cleanup:
> if (network)
More information about the libvir-list
mailing list