[libvirt] Re: Supporting vhost-net and macvtap in libvirt for QEMU
Chris Wright
chrisw at redhat.com
Thu Dec 17 21:16:48 UTC 2009
* Anthony Liguori (aliguori at linux.vnet.ibm.com) wrote:
> There are two modes worth supporting for vhost-net in libvirt. The
> first mode is where vhost-net backs to a tun/tap device. This
> behaves in very much the same way that -net tap behaves in qemu
> today. Basically, the difference is that the virtio backend is in
> the kernel instead of in qemu so there should be some performance
> improvement.
>
> Currently, libvirt invokes qemu with -net tap,fd=X, where X is an
> already open fd to a tun/tap device. I suspect that after we merge
> vhost-net, libvirt could support vhost-net in this mode by just
> doing -net vhost,fd=X. I think the only real question for libvirt
> is whether to provide a user visible switch to use vhost or to just
> always use vhost when it's available and it makes sense.
> Personally, I think the latter makes sense.
Doesn't sound useful as a user-visible knob. At a low level, sure, it's
worth being able to turn things on and off for testing/debugging, but
it's probably not something a user should be burdened with in libvirt.
But I don't understand your -net vhost,fd=X; wouldn't that still be -net
tap,fd=X? IOW, vhost is an internal qemu impl. detail of the virtio
backend (or if you get your wish, $nic_backend).
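To make the distinction concrete, a hypothetical invocation might look
like this (speculative syntax -- the final flag names weren't settled at
the time of this thread, and fd 42 is just an illustrative descriptor):

```shell
# Today libvirt opens the tap fd itself and passes it down:
#   qemu -net nic,model=virtio -net tap,fd=42
# If vhost stays an implementation detail of the tap backend, a plausible
# shape is an option on the existing backend rather than a new one:
qemu -net nic,model=virtio -net tap,fd=42,vhost=on
```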
> The more interesting invocation of vhost-net though is one where the
> vhost-net device backs directly to a physical network card. In this
> mode, vhost should get considerably better performance than the
> current implementation. I don't know the syntax yet, but I think
> it's reasonable to assume that it will look something like -net
> tap,dev=eth0. The effect will be that eth0 is dedicated to the
> guest.
tap? we'd want either macvtap or raw socket here.
> On most modern systems, there is a small number of network devices
> so this model is not all that useful except when dealing with SR-IOV
> adapters. In that case, each physical device can be exposed as many
> virtual devices (VFs). There are a few restrictions here though.
> The biggest is that currently, you can only change the number of VFs
> by reloading a kernel module so it's really a parameter that must be
> set at startup time.
>
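For reference, with drivers of that era the VF count was a module
parameter, so changing it meant a full driver reload. A sketch using the
Intel ixgbe driver (the parameter name is driver-specific, and reloading
drops connectivity on every port served by the module):

```shell
# Tear down and reload the PF driver with 7 VFs per port (ixgbe example;
# other drivers use different parameter names).
modprobe -r ixgbe
modprobe ixgbe max_vfs=7
```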
> I think there are a few ways libvirt could support vhost-net in this
> second mode. The simplest would be to introduce a new tag similar
> to <source network='br0'>. In fact, if you probed the device type
> for the network parameter, you could probably do something like
> <source network='eth0'> and have it Just Work.
We'll need to keep track of more than just the other end, though; the
device type changes which backend actually gets used.
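For illustration, the probed-device-type idea might look like this on
the guest side (hypothetical sketch; only the first form exists in
libvirt today, and the overloaded meaning of the second is an
assumption):

```xml
<!-- existing style: 'br0' is a bridge, so libvirt attaches a tap to it -->
<interface type='network'>
  <source network='br0'/>
</interface>

<!-- probed style: 'eth0' is a physical NIC (or a VF), so libvirt could
     back it with vhost-net/macvtap instead -- same element, new meaning -->
<interface type='network'>
  <source network='eth0'/>
</interface>
```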
> Another model would be to have libvirt see an SR-IOV adapter as a
> network pool, with libvirt handling all of the VF management.
> Considering how inflexible SR-IOV is today, I'm not sure whether
> this is the best model.
We already need to know the VF<->PF relationship. For example, we don't
want to assign a VF to one guest and then the PF to another guest, for basic
sanity reasons. As we get better ability to manage the embedded switch
in an SR-IOV NIC we will need to manage them as well. So we do need
to have some concept of managing an SR-IOV adapter.
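The VF<->PF relationship is already visible in sysfs, which is one place
a management layer could read it from (paths follow the kernel's SR-IOV
sysfs layout; the interface name and PCI address below are examples):

```shell
# VFs of a PF show up as virtfn* symlinks under the PF's PCI device dir
ls -l /sys/class/net/eth0/device/virtfn*
# and each VF points back at its PF via the physfn symlink
readlink /sys/bus/pci/devices/0000:01:10.0/physfn
```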
So I think we want to maintain a concept of the qemu backend (virtio,
e1000, etc), the fd that connects the qemu backend to the host (tap,
socket, macvtap, etc), and the bridge. The bridge bit gets a little
complicated. We have the following bridge cases:
- sw bridge
  - normal existing setup, w/ Linux bridging code
  - macvlan
- hw bridge
  - on SR-IOV card
    - configured to simply fwd to external hw bridge (like VEPA mode)
    - configured as a bridge w/ policies (QoS, ACL, port mirroring,
      etc.) that allows inter-guest traffic and looks a bit like the
      sw switch above
- external
  - need to possibly inform switch of incoming vport
And, we can have a hybrid. E.g., no reason one VF can't be shared by a
few guests.
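One way those bridge cases could eventually surface in the interface XML
(hypothetical sketch -- libvirt has no such interface type today, and the
mode names here simply mirror the macvlan modes):

```xml
<!-- sw bridge via macvlan, or hw bridge on an SR-IOV card: a 'direct'
     attachment to a host device, with a mode selecting the behavior -->
<interface type='direct'>
  <source dev='eth0' mode='vepa'/>   <!-- fwd to external hw bridge -->
</interface>
<interface type='direct'>
  <source dev='eth0' mode='bridge'/> <!-- allow inter-guest traffic -->
</interface>
```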