[libvirt] Re: Supporting vhost-net and macvtap in libvirt for QEMU

Thu Dec 17 09:19:04 UTC 2009

On Wed, Dec 16, 2009 at 07:48:08PM -0600, Anthony Liguori wrote:
> Disclaimer: I am neither an SR-IOV nor a vhost-net expert, but I've CC'd 
> people that are who can throw tomatoes at me for getting bits wrong :-)
> 
> I wanted to start a discussion about supporting vhost-net in libvirt.  
> vhost-net has not yet been merged into qemu but I expect it will be soon 
> so it's a good time to start this discussion.
> 
> There are two modes worth supporting for vhost-net in libvirt.  The 
> first mode is where vhost-net backs to a tun/tap device.  This 
> behaves in very much the same way that -net tap behaves in qemu today.  
> Basically, the difference is that the virtio backend is in the kernel 
> instead of in qemu so there should be some performance improvement.
> 
> Currently, libvirt invokes qemu with -net tap,fd=X where X is an already 
> open fd to a tun/tap device.  I suspect that after we merge vhost-net, 
> libvirt could support vhost-net in this mode by just doing -net 
> vhost,fd=X.  I think the only real question for libvirt is whether to 
> provide a user visible switch to use vhost or to just always use vhost 
> when it's available and it makes sense.  Personally, I think the latter 
> makes sense.

I tend to agree. I don't see any compelling reason to expose 'vhost' as
a config option, since it does not change any functionality, merely
the internal implementation. I don't think apps would be in any position
to decide whether it should be on or off. Thus we just need to figure out
how to detect that it is supported in kernel+QEMU and, if supported,
enable it.
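
To illustrate the detection, here is a minimal sketch in Python (not
libvirt code). It assumes the kernel side would show up as a character
device at /dev/vhost-net and that the QEMU side could be probed by
looking for "vhost" in the -help output. Both are assumptions, as is
the -net vhost,fd=X syntax, which is only Anthony's guess above. All
function names are hypothetical.

```python
import os
import stat


def kernel_has_vhost(dev_path="/dev/vhost-net"):
    """Return True if dev_path exists and is a character device.

    The device path is an assumption about how the kernel would
    expose vhost-net.
    """
    try:
        return stat.S_ISCHR(os.stat(dev_path).st_mode)
    except OSError:
        return False


def qemu_has_vhost(help_text):
    """Return True if the QEMU -help text mentions vhost (assumed probe)."""
    return "vhost" in help_text


def build_net_arg(tap_fd, kernel_ok, qemu_ok):
    """Pick the -net argument for an already-open tap fd.

    Falls back to the existing "-net tap,fd=X" form unless both the
    kernel and QEMU report vhost support. "-net vhost,fd=X" is the
    guessed syntax, not a confirmed one.
    """
    backend = "vhost" if (kernel_ok and qemu_ok) else "tap"
    return ["-net", "%s,fd=%d" % (backend, tap_fd)]
```

The point of the fallback is that apps never see the difference: the
same tap fd is handed over either way, and only the backend name in
the argument changes.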

> The more interesting invocation of vhost-net though is one where the 
> vhost-net device backs directly to a physical network card.  In this 
> mode, vhost should get considerably better performance than the current 
> implementation.  I don't know the syntax yet, but I think it's 
> reasonable to assume that it will look something like -net 
> tap,dev=eth0.   The effect will be that eth0 is dedicated to the guest.

Ok, so in this model you have to create a dedicated ethXX device for
every guest, with no sharing?

> On most modern systems, there is a small number of network devices so 
> this model is not all that useful except when dealing with SR-IOV 
> adapters.  In that case, each physical device can be exposed as many 
> virtual devices (VFs).  There are a few restrictions here though.  The 
> biggest is that currently, you can only change the number of VFs by 
> reloading a kernel module so it's really a parameter that must be set at 
> startup time.

Yes, since the hardware doesn't allow for any usable configurability of
the number of VFs, we'll just assume that they have already been set up.
Likely the kernel can just enable the max # of VFs at all times.
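
For illustration, checking which VFs are already set up could look
roughly like this. It's a sketch in Python, not libvirt code; list_vfs
and its sysfs_root parameter are made-up names, and it assumes the PF
driver has already enabled its VFs so that the kernel has created the
virtfnN symlinks under the PF's PCI device directory in sysfs.

```python
import glob
import os


def list_vfs(pf_name, sysfs_root="/sys"):
    """Return the PCI addresses of the VFs already enabled on pf_name.

    Follows the virtfnN symlinks in the PF's device directory and
    returns the addresses ordered by VF index. Returns [] if the
    interface has no VFs (or does not exist).
    """
    pattern = os.path.join(sysfs_root, "class", "net", pf_name,
                           "device", "virtfn*")
    vfs = []
    for link in glob.glob(pattern):
        # "virtfn3" -> 3, so that virtfn10 sorts after virtfn2
        index = int(os.path.basename(link)[len("virtfn"):])
        # The symlink target's directory name is the VF's PCI address
        address = os.path.basename(os.path.realpath(link))
        vfs.append((index, address))
    return [address for _, address in sorted(vfs)]
```

Calling list_vfs("eth0") on a host without SR-IOV simply gives an
empty list, so the caller can treat "no VFs" and "not an SR-IOV card"
the same way.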

> I think there are a few ways libvirt could support vhost-net in this 
> second mode.  The simplest would be to introduce a new tag similar to 
> <source network='br0'>.  In fact, if you probed the device type for the 
> network parameter, you could probably do something like <source 
> network='eth0'> and have it Just Work.
> 
> Another model would be to have libvirt see an SR-IOV adapter as a 
> network pool where it handled all of the VF management.  Considering 
> how inflexible SR-IOV is today, I'm not sure whether this is the best model.

Agreed, given the hardware limitations I don't see that it is worth the
bother. 

This new mode is not really what we'd call 'bridging' in the libvirt
network XML format, so I think we'll want to define a new type of network
config for it in libvirt. Perhaps:

  <network type='physical'>
    <source dev='eth0'/>
  </network>

Or type='passthru'

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|