[Libvir] A whole tonne of networking fixes / enhancements

Daniel Veillard veillard at redhat.com
Tue Mar 13 14:15:04 UTC 2007


On Tue, Mar 13, 2007 at 04:28:16AM +0000, Daniel P. Berrange wrote:
> I've been testing the networking support and found various bugs / missing
> features that I thought we really need to have in the release - some of
> them impact the XML so we have to get this right now.
> 
>  - Build up in memory linked-list of network devices & disk devices in
>    same order as they are listed in XML. Currently they are built up
>    in reversed order, which makes the XML not be idempotent, and also
>    means that if you have multiple NICs, what you think is eth0 ends
>    up being eth4 and what you think is eth4 is eth0. This patch fixes
>    the ordering to match XML.
> 
>  - Set the 'vlan' attribute in command line args to QEMU. This ensures
>    that separate network devices are in fact separated. Previously if
>    you had multiple NICs, QEMU connected them all to the same VLAN so
>    any traffic on one NIC got sent to all NICs. Most definitely not
>    what you want in any usual scenario, and created a traffic storm
>    from the resultant network loops !
> 
>  - Added support for networking of  type='bridge'. This gives parity
>    with equivalent Xen networking, eg
> 
>       <interface type='bridge'>
>         <source dev='xenbr0'/>
>         <target dev='vnet3'/>
>       </interface>
> 
>    Will create a tap device called vnet3 and bridge it into xenbr0
> 
>  - Added support for networking of type='ethernet'. This gives parity
>    with equivalent Xen networking, eg
> 
>      <interface type='ethernet'>
>        <script path='/etc/qemu-ifup'/>
>        <target dev='vnet5'/>
>      </interface>
> 
>    Will create a tap device called vnet5 and run 'qemu-ifup' to set up
>    its configuration. Think of the various non-bridge Xen networking configs.
> 
>  - Added support for 'client', 'server', 'mcast' networking types. These
>    are QEMU specific types, allowing unprivileged (or privileged) users
>    to create virtual networks without TAP devices.
> 
>    eg two machines, one with
> 
>       <interface type='server'>
>         <source address="127.0.0.1" port="5555"/>
>       </interface>
> 
>    And the other with
> 
>       <interface type='client'>
>         <source address="127.0.0.1" port="5555"/>
>       </interface>
> 
>    Or both using multicast:
> 
>       <interface type='mcast'>
>         <source address="230.0.0.1" port="5558"/>
>       </interface>
> 
>    Both these options allow QEMU instances on different physical machines
>    to talk to each other. The multicast protocol is also compatible with
>    the UserModeLinux multicast protocol.
> 
>  - Fix the 'type=network' config to use the <target dev='vnet3'> element
>    instead of a custom tapifname=vnet3 attribute - this gives a consistent
>    way to name tap devices that - most importantly - matches the
>    Xen XML format for specifying vifname, eg
> 
>        <interface type='network'>
>          <source network='default'/>
>          <target dev='vnet2'/>
>        </interface>
> 
>    Will create a tap device called vnet2 and connect it to the bridge
>    device associated with the network 'default'.
> 
>  - Removed references to 'vde' - we're not using this explicitly - at
>    some time in the future, we'll perhaps use VDE for doing virtual
>    networking for unprivileged users where bridge devices are not
>    available
> 
>  - Removed references to 'tap' network type - this is basically handled
>    by the 'ethernet' network type to give XML compatibility with the
>    same functionality in the Xen backend.
> 
>  - The virtual network configuration currently always adds a whole bunch
>    of IPTables rules to the FORWARD/POSTROUTING chains which allow traffic
>    from the virtual network to be masqueraded out through any active
>    physical interface.  This may be the correct thing to do for the default
>    network, but we also need the ability to create totally isolated
>    networks (no forwarding at all), or directed networks (eg forwarding
>    to an explicit physical device). 
> 
>    To deal with this scenario I introduce a new element in the network
>    XML called '<forward>'. If this is not present, then no forwarding
>    rules are added at all. If it is present, but with no attributes
>    then a generic rule allowing forwarding to any interface is added.
>    If it is present and the 'dev' attribute is specified then forwarding
>    is only allowed to that named interface. The default network XML thus
>    now includes
> 
>         <forward/>
> 
>    So that by default we have a virtual network connected to all physical
>    devices.
> 
>  - MAC addresses were not being autogenerated inside libvirt_qemud. If you
>    don't provide a MAC address, QEMU (insanely) uses a hardcoded default.
>    So all NICs end up with an identical MAC. We now always autogenerate
>    a MAC address if not explicitly listed in XML.
> 
> 
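
To make the three <forward> cases described above concrete, here is a sketch of a network definition using the 'dev' form, with the other two variants noted in a comment. The network name 'directed', the bridge name 'virbr1', the 192.168.100.x range and the device name 'eth0' are all assumptions chosen for illustration; they are not taken from the patch. The <uuid> is left out for brevity, so add one as in the 'private' example further down if your setup needs it.

```xml
<network>
  <name>directed</name>                      <!-- hypothetical name -->
  <bridge name="virbr1" />                   <!-- hypothetical bridge -->
  <!-- Variants, per the description above:
         (element omitted)      no forwarding rules: isolated network
         <forward/>             forwarding allowed to any interface
         <forward dev='eth0'/>  forwarding allowed only to eth0 -->
  <forward dev='eth0'/>                      <!-- 'eth0' is an assumed device -->
  <ip address="192.168.100.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.100.2" end="192.168.100.254" />
    </dhcp>
  </ip>
</network>
```
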
> One final thing to be aware of - the Fedora Core 6 Xen kernel currently has
> totally fubar TCP checksum offload. So if you try to bridge a Xen guest into
> the libvirt virtual networking, it'll fail to get a DHCP address from dnsmasq.
> Even if you fix that by turning off TX checksums in Dom0, you'll get checksum
> failures for the actual TCP data transmission too. The only solution is to
> either upgrade the Dom0 kernel to a RHEL-5 vintage, or to also turn off
> checksumming in the guest. A new FC6 xen kernel is in the works which should
> hopefully fix this for real.
> 
> With a fixed kernel, I can easily set up virtual networks connecting both
> Xen PV, Xen FV and QEMU instances together.
> 
> Anyway, the upshot of all this, is that we can now trivially create really
> complicated & fun networking layouts across QEMU & Xen, using a mixture of
> bridging, NAT, isolated LANs, and tunnelled VLANS :-)  We really ought to
> document it a little though

  Yup, but in general the libvirt documentation really needs a revamp; I guess
that will be my main task after the upcoming release.

> Since Mark is off on vacation for a while, I'd appreciate people taking a
> close look at this / actually giving it a try if you can.
> 
> It is possible to create a totally isolated network using
> 
>   <network>
>     <name>private</name>
>     <uuid>d237ce44-8efa-452c-b8e6-1ae9cf53aeb1</uuid>
>     <bridge name="virbr0" />
>     <ip address="192.168.122.1" netmask="255.255.255.0">
>       <dhcp>
>         <range start="192.168.122.2" end="192.168.122.254" />
>       </dhcp>
>     </ip>
>   </network>
> 
> And a QEMU guest with 5 (yes, 5)  network cards
> 
> <domain type='qemu'>
>   <name>QEMUFirewall</name>
>   <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>   <memory>219200</memory>
>   <currentMemory>219200</currentMemory>
>   <vcpu>1</vcpu>
>   <os>
>     <type arch='i686' machine='pc'>hvm</type>
>     <boot dev='hd'/>
>   </os>
>   <devices>
>     <emulator>/usr/bin/qemu</emulator>
>     <disk type='block' device='disk'>
>       <source dev='/dev/HostVG/QEMUGuest1'/>
>       <target dev='hda'/>
>     </disk>
>     <interface type='network'>
>       <source network='private'/>
>       <target dev='vnet1'/>
>     </interface>
>     <interface type='bridge'>
>       <source dev='xenbr0'/>
>       <target dev='vnet2'/>
>     </interface>
>     <interface type='ethernet'>
>       <script path='/etc/dan-test-ifup'/>
>       <target dev='vnet3'/>
>     </interface>
>     <interface type='server'>
>       <source address="127.0.0.1" port="5555"/>
>     </interface>
>     <interface type='mcast'>
>       <source address="230.0.0.1" port="5558"/>
>     </interface>
>     <graphics type='vnc' port='-1'/>
>   </devices>
> </domain>
> 
> 
> In this XML, only eth1 is connected to the host's public-facing network.
> The other NICs are all on various private networks. So this QEMU guest
> is in essence a router/firewall box. eg, you could connect various
> other guests to the 'private' virtual network, and the only way they
> could reach the outside world is via this QEMU instance doing routing.
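
As a sketch of that last point, another guest could join the 'private' network with a single interface of type='network'. The target device name 'vnet6' is an assumed, otherwise unused tap name; the rest follows the examples above.

```xml
<interface type='network'>
  <source network='private'/>
  <target dev='vnet6'/>   <!-- 'vnet6' is a hypothetical tap device name -->
</interface>
```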

  This all sounds good; I really appreciate the effort to unify the various
syntaxes as much as possible. I reviewed the patch and really didn't find
anything to point out. Unfortunately I don't yet have an up-to-date rawhide
system to test on.

Daniel



-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard at redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/



