[libvirt] Re: Supporting vhost-net and macvtap in libvirt for QEMU

Thu Dec 17 18:00:15 UTC 2009

On Thursday 17 December 2009, Anthony Liguori wrote:

> There are two modes worth supporting for vhost-net in libvirt.  The 
> first mode is where vhost-net backs to a tun/tap device.  This
> behaves in very much the same way that -net tap behaves in qemu today.
> Basically, the difference is that the virtio backend is in the kernel 
> instead of in qemu so there should be some performance improvement.
> 
> Currently, libvirt invokes qemu with -net tap,fd=X where X is an already
> open fd to a tun/tap device.  I suspect that after we merge vhost-net, 
> libvirt could support vhost-net in this mode by just doing -net 
> vhost,fd=X.  I think the only real question for libvirt is whether to 
> provide a user visible switch to use vhost or to just always use vhost 
> when it's available and it makes sense.  Personally, I think the latter
> makes sense.

I think it should be treated like any other option where we have kernel
support to make something "go fast", e.g. kvm, or the in-kernel interrupt
processing. If we don't enable it by default when it's there, I would
prefer to have an '--enable-vhost' option rather than replacing the '-net tap'
option with '-net vhost', because that would be easier to integrate
with existing scripts.
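
To make the difference between the two integration styles concrete, here is
a minimal Python sketch of how a management layer might build the network
arguments. Both vhost spellings are assumptions: '-net vhost,fd=X' is only
the syntax guessed at in the quoted mail, '--enable-vhost' is the option
suggested above, and neither is an existing qemu flag. The function name
net_args is made up for illustration.

```python
def net_args(tap_fd, use_vhost, separate_flag=True):
    """Build the qemu network arguments for an already-open tap fd.

    Both vhost spellings are hypothetical: '-net vhost,fd=X' is the guess
    from the quoted mail, '--enable-vhost' is the alternative suggested above.
    """
    if not use_vhost:
        return ["-net", f"tap,fd={tap_fd}"]
    if separate_flag:
        # The existing '-net tap' argument stays untouched; one flag is appended.
        return ["-net", f"tap,fd={tap_fd}", "--enable-vhost"]
    # The backend name itself changes, so callers must rewrite the argument.
    return ["-net", f"vhost,fd={tap_fd}"]
```

With a separate flag, a script that already emits '-net tap,fd=X' only has
to append one argument, which is the integration benefit argued for above.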

> The more interesting invocation of vhost-net though is one where the 
> vhost-net device backs directly to a physical network card.  In this 
> mode, vhost should get considerably better performance than the current 
> implementation.  I don't know the syntax yet, but I think it's 
> reasonable to assume that it will look something like -net 
> tap,dev=eth0.   The effect will be that eth0 is dedicated to the guest.
> 
> On most modern systems, there is a small number of network devices so 
> this model is not all that useful except when dealing with SR-IOV 
> adapters.  In that case, each physical device can be exposed as many 
> virtual devices (VFs).  There are a few restrictions here though.  The 
> biggest is that currently, you can only change the number of VFs by 
> reloading a kernel module so it's really a parameter that must be set at 
> startup time.

I like to think of this way of using SR-IOV (VMDq actually) as a way to
do macvlan with hardware support. Unfortunately, it does not work like
this yet, but the way I would like to do it is:

* Use 'ip link ... type macvlan' as the configuration frontend for this mode.
* As long as there are VFs, PFs or queue pairs available in hardware, use them.
* When you run out of PCI functions, register additional unicast MAC addresses
  with the hardware, as macvlan does today.
* As the final fallback, when we run out of unicast MAC addresses in the
  NIC, put it into promiscuous mode. Again, macvlan handles this fine today.
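
The fallback order in the list above can be sketched as a small Python
function. This is only an illustration of the intended policy, not existing
kernel behaviour: the two counters and the returned labels are hypothetical,
and a real implementation would have to query the NIC driver for them.

```python
def pick_macvlan_backing(free_hw_functions, free_unicast_filters):
    """Return how the next macvlan port would be backed.

    Both counters are hypothetical inputs standing in for what the
    NIC driver would report.
    """
    if free_hw_functions > 0:
        return "hw-function"      # a VF, PF or queue pair is still free
    if free_unicast_filters > 0:
        return "unicast-filter"   # register one more unicast MAC address
    return "promiscuous"          # final fallback: filter in software
```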

Right now, if you want to use a VF, you have to set up a raw socket,
because you can't add a tun/tap device to the interface without a bridge,
which would defeat the whole purpose of doing this.

Macvtap should eventually solve this, but only after VMDq is integrated
with macvlan.

> I think there are a few ways libvirt could support vhost-net in this 
> second mode.  The simplest would be to introduce a new tag similar to 
> <source network='br0'>.  In fact, if you probed the device type for the 
> network parameter, you could probably do something like <source 
> network='eth0'> and have it Just Work.

Right. The first option (<source network='br0'>) is not ideal because it
assumes that you run a bridge, which you typically don't want in this
mode, because those devices have the bridge in hardware (or in
macvlan for the software case).
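
Probing the device type, as suggested in the quoted mail, could look roughly
like the Python sketch below. It relies on the fact that Linux bridge
devices expose a 'bridge' directory under /sys/class/net/<dev>/. The
function names and the 'bridge'/'direct' labels are made up for illustration
and are not libvirt terminology.

```python
import os

def is_bridge(dev, sysfs_root="/sys/class/net"):
    """True if 'dev' is a Linux bridge (bridges have a 'bridge' dir in sysfs)."""
    return os.path.isdir(os.path.join(sysfs_root, dev, "bridge"))

def source_mode(dev, sysfs_root="/sys/class/net"):
    # 'bridge' and 'direct' are illustrative labels only.
    return "bridge" if is_bridge(dev, sysfs_root) else "direct"
```

The sysfs_root parameter exists only so the check can be exercised against
a fake directory tree instead of the live system.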

	Arnd