[libvirt] [RFC][PATCH v2 0/2] Dynamic backend setup for macvtap interfaces

Tue Mar 9 13:27:46 UTC 2010

libvir-list-bounces at redhat.com wrote on 03/08/2010 07:29:42 PM:

> Using a bridge to connect a qemu NIC to a host interface offers a fair
> amount of flexibility to reconfigure the host without restarting the VM.
> For example, if the bridge connects host interface eth0 to the qemu tap0
> interface, eth0 can be hot-removed and hot-plugged without affecting the
> VM.  Similarly, if the bridge connects host VLAN interface vlan0 to the
> qemu tap0 interface, the admin can easily replace vlan0 with vlan1
> without the VM noticing.
> 
> Using the macvtap driver instead of a kernel bridge, the host interface
> is much more tightly tied to the VM.  Qemu communicates with the macvtap
> interface through a file descriptor, and the macvtap interface is bound
> permanently to a specific host interface when it is first created.
> What's more, if the underlying host interface disappears, the macvtap
> interface vanishes along with it, leaving the VM holding a file
> descriptor for a deleted file.
> 
> To avoid race conditions during system startup, I would like libvirt to

The problem, I guess, is that the underlying interface can disappear at
any time due to hotplug, which leaves you with a race condition at other
times as well, until a watcher thread detects the change and can act, no?

> allow starting up the VM with a NIC even if the underlying host
> interface doesn't yet exist, deferring creation of the macvtap interface
> (analogous to starting up the VM with a tap interface bound to an orphan
> bridge).  To support adding and removing a host interface without

What do you pass on the Qemu command line in that case? Currently we pass
a file descriptor of the tap interface...

> restarting the VM, I would like libvirt to react to the (re)appearance
> of the underlying host interface, creating a new macvtap interface and
> passing the new fd to qemu to reconnect to the NIC.

How do you handle the macvtap description in the VM configuration? 
Currently we have the 'direct' interface description where the link device 
is written into the domain description:

  <devices>
    <interface type='direct'/>
    ...
    <interface type='direct'>
      <source dev='eth0' mode='vepa'/>
    </interface>
  </devices>

http://libvirt.org/formatdomain.html#elementsNICSDirect

Would you also update 'eth0' in the domain description if the hotplugged
device were to reappear under another name?
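To make the question concrete, here is a minimal Python sketch (not
libvirt code) of what "reflecting the change" could mean: find the link
device of each 'direct' interface in a domain description and rewrite it
when the host device shows up under a new name. The function names and
the sample domain XML are hypothetical; only the <interface
type='direct'> / <source dev=... mode=...> layout is taken from the
description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain description, reduced to the parts that matter here.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='direct'>
      <source dev='eth0' mode='vepa'/>
    </interface>
    <interface type='bridge'>
      <source bridge='br0'/>
    </interface>
  </devices>
</domain>
"""


def direct_link_devices(domain_xml):
    """Return (dev, mode) for every 'direct' interface that names a link device."""
    root = ET.fromstring(domain_xml)
    result = []
    for iface in root.findall("./devices/interface[@type='direct']"):
        source = iface.find("source")
        if source is not None and source.get("dev"):
            result.append((source.get("dev"), source.get("mode")))
    return result


def rename_link_device(domain_xml, old_dev, new_dev):
    """Return a copy of the description with old_dev replaced by new_dev."""
    root = ET.fromstring(domain_xml)
    for source in root.findall("./devices/interface[@type='direct']/source"):
        if source.get("dev") == old_dev:
            source.set("dev", new_dev)
    return ET.tostring(root, encoding="unicode")
```

Calling rename_link_device(DOMAIN_XML, "eth0", "eth1") leaves the bridge
interface alone and changes only the 'direct' one. Whether the stored
configuration should be rewritten at all, or only the live state, is
exactly the open question.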

> 
> (It would also be nice if libvirt allowed the user to change which
> underlying host interface the qemu NIC is connected to.  I'm ignoring
> this issue for now, except to note that implementing the above features
> should make this easier.)
> 
> The libvirt API already supports domainAttachDevice and
> domainDetachDevice to add or remove an interface while the VM is
> running.  In the qemu implementation, these commands add or remove the
> VM NIC device as well as reconfiguring the host side.  This works only
> if the OS and application running in the VM can handle PCI hotplug and
> dynamically reconfigure its network.  I would like to isolate the VM
> from changes to the host network setup, whether you use macvtap or a
> bridge.

So currently you would have to attach and detach the direct device. Now,
in your implementation, a host unplug would automatically cause the
macvtap to get unplugged, and if a host interface appears, libvirt would
automatically recreate a macvtap linked to this host interface? Under
what conditions does this work? Does the new interface have to have the
same name? I am wondering because some scripts, I believe, check the MAC
address of the device, and if it doesn't match the one expected for eth0,
the device may appear as eth1. How are cases handled where I would like
it to reconnect to vlan 100 of the newly connected host interface, but I
probably have to run some command first to create that vlan 100 interface?
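One way around the naming problem would be to identify the host
interface by MAC address instead of by name. A minimal sketch in Python,
assuming the Linux sysfs layout /sys/class/net/<name>/address; the
function name and the sysfs_net parameter (there so the lookup can be
pointed at a test directory) are hypothetical and not part of libvirt:

```python
import os


def find_interface_by_mac(mac, sysfs_net="/sys/class/net"):
    """Return the name of an interface whose MAC equals `mac`, or None.

    Several interfaces can share one MAC (a VLAN device usually inherits
    its parent's), so this returns the first match in sorted name order.
    """
    wanted = mac.strip().lower()
    try:
        names = sorted(os.listdir(sysfs_net))
    except FileNotFoundError:
        return None
    for name in names:
        path = os.path.join(sysfs_net, name, "address")
        try:
            with open(path) as f:
                if f.read().strip().lower() == wanted:
                    return name
        except OSError:
            # The interface vanished between listdir() and open(),
            # or it has no address file; skip it.
            continue
    return None
```

This does not answer the vlan 100 case: the VLAN interface still has to
be created on top of the new device before anything can attach to it.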

> 
> The changes I think are needed to implement this include:
> 
> 1. Refactor qemudDomainAttachNetDevice/qemudDomainDetachNetDevice, which
> currently handle both backend (host) setup and adding/removing the VM
> NIC device; move the backend setup code into separate functions that can
> be called separately without affecting VM devices.
> 
> 2. Implement a thread or task that watches for changes to the underlying
> host interface for each configured macvtap interface, and reacts by
> invoking the appropriate backend setup code.

I suppose the backend setup code is provided by libvirt itself, and is
not some external script that the user could run to, for example, have
the vlan 100 interface created on host hotplug.
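For discussion, the watcher in item 2 could be sketched roughly as
follows. This is Python, not libvirt code, and it polls rather than
using udev or hal. The class name is hypothetical, and the connect and
disconnect callbacks are stand-ins for whatever backend setup code (or
user script) ends up being invoked:

```python
import os
import time


def list_interfaces(sysfs_net="/sys/class/net"):
    """Return the set of interface names currently present on the host."""
    try:
        return set(os.listdir(sysfs_net))
    except FileNotFoundError:
        return set()


class BackendWatcher:
    """Track the host interfaces we care about and react to changes."""

    def __init__(self, wanted, connect, disconnect):
        self.wanted = set(wanted)
        self.connect = connect        # called with a name when it appears
        self.disconnect = disconnect  # called with a name when it vanishes
        self.present = set()

    def poll(self, current):
        """Compare a new snapshot with the last one and fire callbacks."""
        current = set(current) & self.wanted
        appeared = current - self.present
        vanished = self.present - current
        for dev in sorted(vanished):
            self.disconnect(dev)
        for dev in sorted(appeared):
            self.connect(dev)
        self.present = current
        return appeared, vanished


def run(watcher, interval=1.0, rounds=None, list_fn=list_interfaces):
    """Poll every `interval` seconds; `rounds=None` means run forever."""
    done = 0
    while rounds is None or done < rounds:
        watcher.poll(list_fn())
        time.sleep(interval)
        done += 1
```

Because the callbacks fire only on a state change, the steady state does
not produce a failed reconnection attempt every second. The window
between an unplug and the next poll is still there, of course.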

   Stefan

> 
> 3. Change qemudBuildCommandLine to defer backend setup if qemu supports
> the necessary features for doing it later (e.g. the host_net_add monitor
> command).
> 
> 4. Implement appropriate error handling and reporting, and any necessary
> changes to the configuration schema.
> 
> The following patches are a partial implementation of the above as a
> proof of concept.
> 
> Patch 1 implements change (1) above, moving the backend setup code to
> new functions
> qemudDomainConnectNetBackend/qemudDomainDisconnectNetBackend, and
> calling these functions from the existing
> qemudDomainAttachNetDevice/qemudDomainDetachNetDevice.  I think this
> change is useful on its own: it breaks up two monster functions into
> more manageable pieces, and eliminates some code duplication (e.g. the
> try_remove clause at the end of qemudDomainAttachNetDevice).
> 
> Patch 2 is a godawful hack roughly implementing changes (2) and (3)
> above (did I mention that this is a proof of concept?).  It spawns a
> thread that simply tries reconnecting the backend of each macvtap
> interface once a second.  As long as the interface is already up, the
> reconnection fails. If the macvtap interface goes away because the
> underlying host interface disappears, the reconnection fails until the
> host interface reappears.
> 
> I ran into two major issues while implementing (2) and (3):
> 
> - Can we use the existing virEvent functions to invoke the reconnection
> process, triggered either by a timer or by an event from the host?  It
> seems like this ought to work, but it appears that communication between
> libvirt and the qemu monitor relies on an event, and since all events
> run in the same thread, there's no way for an event to call the monitor.
> 
> - Should the reconnection process use udev or hal to get notifications,
> or leverage the node device code which itself uses udev or hal?
> Currently there doesn't appear to be a way to get notifications of
> changes to node devices; if there were, we'd still need to address the
> threading issue.  If we use node devices, what changes to the
> configuration schema would be needed to associate a macvtap interface
> with the underlying node device?
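On the threading issue in the first point: a common way out is to have
the event callback only queue the work and let a separate worker thread
talk to the monitor, so the event thread stays free to deliver the
monitor's replies. A minimal Python sketch of that pattern, with
hypothetical names throughout; reconnect stands in for the backend
reconnection code, and none of this is the virEvent API:

```python
import queue
import threading


def start_worker():
    """Start a worker thread that runs queued jobs; None tells it to stop."""
    jobs = queue.Queue()

    def loop():
        while True:
            job = jobs.get()
            if job is None:
                break
            job()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return jobs, thread


def on_timer_event(jobs, reconnect, dev):
    """Runs on the event-loop thread: enqueue only, never wait on the monitor."""
    jobs.put(lambda: reconnect(dev))
```

The event handler returns immediately, and the potentially blocking
monitor round trip happens on the worker thread.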
> 
> I'd appreciate input on item (4) as well (e.g. does it always make sense
> to ignore the missing host interface on the assumption that it could
> show up later?).
> 
> --Ed