[Libvir] A whole tonne of networking fixes / enhancements
Daniel Veillard
veillard at redhat.com
Tue Mar 13 14:15:04 UTC 2007
On Tue, Mar 13, 2007 at 04:28:16AM +0000, Daniel P. Berrange wrote:
> I've been testing the networking support and found various bugs / missing
> features that I thought we really need to have in the release - some of
> them impact the XML so we have to get this right now.
>
> - Build up in memory linked-list of network devices & disk devices in
> same order as they are listed in XML. Currently they are built up
> in reversed order, which makes the XML not be idempotent, and also
> means that if you have multiple NICs, what you think is eth0 ends
> up being eth4 and what you think is eth4 is eth0. This patch fixes
> the ordering to match XML.
>
> - Set the 'vlan' attribute in command line args to QEMU. This ensures
> that separate network devices are in fact separated. Previously if
> you had multiple NICs, QEMU connected them all to the same VLAN so
> any traffic on one NIC got sent to all NICs. Most definitely not
> what you want in any usual scenario, and created a traffic storm
> from the resultant network loops !
>
> - Added support for networking of type='bridge'. This gives parity
> with equivalent Xen networking, eg
>
> <interface type='bridge'>
> <source dev='xenbr0'/>
> <target dev='vnet3'/>
> </interface>
>
> Will create a tap device called vnet3 and bridge it into xenbr0
>
> - Added support for networking of type='ethernet'. This gives parity
> with equivalent Xen networking, eg
>
> <interface type='ethernet'>
> <script path='/etc/qemu-ifup'/>
> <target dev='vnet5'/>
> </interface>
>
> Will create a tap device called vnet5 and run '/etc/qemu-ifup' to set up
> its configuration. Think of the various non-bridge Xen networking configs.
>
> - Added support for 'client', 'server', 'mcast' networking types. These
> are QEMU specific types, allowing unprivileged (or privileged) users
> to create virtual networks without TAP devices.
>
> eg two machines, one with
>
> <interface type='server'>
> <source address="127.0.0.1" port="5555"/>
> </interface>
>
> And the other with
>
> <interface type='client'>
> <source address="127.0.0.1" port="5555"/>
> </interface>
>
> Or both using multicast:
>
> <interface type='mcast'>
> <source address="230.0.0.1" port="5558"/>
> </interface>
>
> Both these options allow QEMU instances on different physical machines
> to talk to each other. The multicast protocol is also compatible with
> the UserModeLinux multicast protocol.
>
> - Fix the 'type=network' config to use the <target dev='vnet3'> element
> instead of a custom tapifname=vnet3 attribute - this gives a consistent
> way to name tap devices that - most importantly - matches the
> Xen XML format for specifying vifname, eg
>
> <interface type='network'>
> <source network='default'/>
> <target dev='vnet2'/>
> </interface>
>
> Will create a tap device called vnet2 and connect it to the bridge
> device associated with the network 'default'.
>
> - Removed references to 'vde' - we're not using this explicitly - at
> some time in the future, we'll perhaps use VDE for doing virtual
> networking for unprivileged users where bridge devices are not
> available
>
> - Removed references to 'tap' network type - this is basically handled
> by the 'ethernet' network type to give XML compatibility with the
> same functionality in the Xen backend.
>
> - The virtual network configuration currently always adds a whole bunch
> of IPTables rules to the FORWARD/POSTROUTING chains which allow traffic
> from the virtual network to be masqueraded out through any active
> physical interface. This may be the correct thing to do for the default
> network, but we also need the ability to create totally isolated
> networks (no forwarding at all), or directed networks (eg forwarding
> to an explicit physical device).
>
> To deal with this scenario I introduce a new element in the network
> XML called '<forward>'. If this is not present, then no forwarding
> rules are added at all. If it is present, but with no attributes
> then a generic rule allowing forwarding to any interface is added.
> If it is present and the 'dev' attribute is specified then forwarding
> is only allowed to that named interface. The default network XML thus
> now includes
>
> <forward/>
>
> So that by default we have a virtual network connected to all physical
> devices.
>
> - MAC addresses were not being autogenerated inside libvirt_qemud. If you
> don't provide a MAC address, QEMU (insanely) uses a hardcoded default.
> So all NICs end up with an identical MAC. We now always autogenerate
> a MAC address if not explicitly listed in XML.
>
>
> One final thing to be aware of - the Fedora Core 6 Xen kernel currently has
> totally fubar TCP checksum offload. So if you try to bridge a Xen guest into
> the libvirt virtual networking, it'll fail to get a DHCP address from dnsmasq.
> Even if you fix that by turning off TX checksums in Dom0, you'll get checksum
> failures for the actual TCP data transmission too. The only solution is to
> either upgrade the Dom0 kernel to a RHEL-5 vintage, or to also turn off
> checksumming in the guest. A new FC6 xen kernel is in the works which should
> hopefully fix this for real.
>
> With a fixed kernel, I can easily set up virtual networks connecting both
> Xen PV, Xen FV and QEMU instances together.
>
> Anyway, the upshot of all this, is that we can now trivially create really
> complicated & fun networking layouts across QEMU & Xen, using a mixture of
> bridging, NAT, isolated LANs, and tunnelled VLANS :-) We really ought to
> document it a little though
Yup, but in general the libvirt documentation really needs a revamp; I guess
that will be my main task after the upcoming release.
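
As a starting point for that documentation, the three <forward> cases
described in the quoted text could be sketched as below. This is only an
illustration based on the description in the patch mail: the network name,
bridge name, addresses and the physical device eth0 are hypothetical, and the
<uuid> element is left out for brevity.

```xml
<!-- Sketch only: name, bridge and addresses are hypothetical values;
     eth0 is an assumed physical interface on the host. -->
<network>
  <name>directed</name>
  <bridge name="virbr1" />
  <!-- Omit <forward> entirely : isolated network, no forwarding rules.   -->
  <!-- <forward/>              : forwarding allowed to any interface.     -->
  <!-- <forward dev="eth0"/>   : forwarding allowed only to eth0 (below). -->
  <forward dev="eth0"/>
  <ip address="192.168.123.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.123.2" end="192.168.123.254" />
    </dhcp>
  </ip>
</network>
```
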
> Since Mark is off on vacation for a while, I'd appreciate people taking a
> close look at this / actually giving it a try if you can.
>
> It is possible to create a totally isolated network using
>
> <network>
>   <name>private</name>
>   <uuid>d237ce44-8efa-452c-b8e6-1ae9cf53aeb1</uuid>
>   <bridge name="virbr0" />
>   <ip address="192.168.122.1" netmask="255.255.255.0">
>     <dhcp>
>       <range start="192.168.122.2" end="192.168.122.254" />
>     </dhcp>
>   </ip>
> </network>
>
> And a QEMU guest with 5 (yes, 5) network cards
>
> <domain type='qemu'>
>   <name>QEMUFirewall</name>
>   <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>   <memory>219200</memory>
>   <currentMemory>219200</currentMemory>
>   <vcpu>1</vcpu>
>   <os>
>     <type arch='i686' machine='pc'>hvm</type>
>     <boot dev='hd'/>
>   </os>
>   <devices>
>     <emulator>/usr/bin/qemu</emulator>
>     <disk type='block' device='disk'>
>       <source dev='/dev/HostVG/QEMUGuest1'/>
>       <target dev='hda'/>
>     </disk>
>     <interface type='network'>
>       <source network='private'/>
>       <target dev='vnet1'/>
>     </interface>
>     <interface type='bridge'>
>       <source dev='xenbr0'/>
>       <target dev='vnet2'/>
>     </interface>
>     <interface type='ethernet'>
>       <script path='/etc/dan-test-ifup'/>
>       <target dev='vnet3'/>
>     </interface>
>     <interface type='server'>
>       <source address="127.0.0.1" port="5555"/>
>     </interface>
>     <interface type='mcast'>
>       <source address="230.0.0.1" port="5558"/>
>     </interface>
>     <graphics type='vnc' port='-1'/>
>   </devices>
> </domain>
>
>
> In this XML, only eth1 is connected to the host's public-facing network.
> The other NICs are all on various private networks. So this QEMU guest
> is in essence a router/firewall box. eg, you could connect various
> other guests to the 'private' virtual network, and the only way they
> could reach the outside world is via this QEMU instance doing routing.
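
For illustration, a guest sitting behind that router would presumably need
only a single interface on the 'private' network, in the same form as the
type='network' example earlier in the mail. This is a sketch rather than a
tested configuration, and the tap device name vnet6 is hypothetical.

```xml
<!-- Sketch only: 'vnet6' is a hypothetical tap device name. -->
<interface type='network'>
  <source network='private'/>
  <target dev='vnet6'/>
</interface>
```
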
This all sounds good. I really appreciate the effort to unify the various
syntaxes as much as possible. I reviewed the patch and really didn't find
anything to point out. Unfortunately I don't yet have an up-to-date rawhide
system to test on.
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard at redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/