[libvirt] [PATCHv2 0/9] Let libvirt manage a bridge's FDB

Laine Stump laine at laine.org
Tue Dec 2 17:08:30 UTC 2014


(Everyone - see the request for opinions/ideas towards the bottom)

The idea behind these patches is the following:

1) Most virtual machines only have a single MAC address behind each
interface, and that MAC address is known by libvirt.

2) If we (i.e. libvirt) manually add an entry to the bridge's
forwarding database (fdb) for the MAC address associated with a port
on the bridge, we can turn off learning and unicast_flood for that
port.

3) Kernels starting with 3.15 (working correctly only from 3.17 on)
will notice when all of a bridge's ports have flood and learning
turned off, and in that case will turn off promiscuous mode on all
ports. If all but one of the ports have flood/learning turned off,
then promiscuous mode will be turned off on that one remaining port
(and left on for all the other ports).

4) When (3) can be done, there is a measurable performance
advantage. It can also *kind of* help security, as it will prevent a
guest from doing anything useful if it changes its MAC address (but
won't prevent the guest from *sending* packets with a spoofed MAC
address).
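
To make the rule in (3) concrete, here is a small Python model of it.
It encodes only the behavior as described above, not the kernel's
actual implementation; the function name and the port names are made
up for illustration.

```python
def ports_with_promisc_off(ports):
    """Return the set of bridge ports whose promiscuous mode would be off.

    `ports` maps a port name to True if flood and/or learning is still
    enabled on that port, and to False if both have been turned off.
    This models only the rule described in the text, not kernel code.
    """
    still_learning = [name for name, enabled in ports.items() if enabled]
    if not still_learning:
        # Every port has flood and learning off: promisc goes off everywhere.
        return set(ports)
    if len(still_learning) == 1:
        # Exactly one port still floods/learns: only that port leaves
        # promiscuous mode; the others stay promiscuous.
        return set(still_learning)
    # Two or more ports still flood/learn: nothing changes.
    return set()


if __name__ == "__main__":
    # Hypothetical bridge: one physical uplink plus two guest tap devices.
    print(ports_with_promisc_off({"eth0": True, "vnet0": False, "vnet1": False}))
```

In the typical setup, where the guest tap devices are managed and the
physical uplink is left alone, only the uplink ends up with promiscuous
mode turned off.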

NB: These only work with a fixed MAC address, and no vlan tags set in
the guest. Support for both of those will be coming.

HERE IS THE REQUEST FOR OPINIONS/IDEAS:

This V2 of the patchset addresses several issues brought up by jferlan
on the original series, and changes the name of the attribute from:

   promiscLinks='yes|no'

to

   fdb='learningWithFlood|managed'

I'm somewhat happier with this new naming than with the previous one,
but I'm still looking for better ideas. It is closer to describing what
the new code really does, but "learningWithFlood" seems a bit long and
awkward, while I have been told that "fdb" is too short and
unrecognizable. I will point out that 1) "fdb" is the same name used
by iproute2's "bridge" command, and 2) another bridge option, "stp", is
also a three-letter acronym that will only be recognized by those
familiar with configuring an L2 bridge device or watching NASCAR on
Saturday afternoons (or whenever it's on - not a fan myself :-)).

Here is a full list of every idea that either I or someone else has
come up with since I started thinking about this:

 promiscLinks='yes|no'

   After initially going with this for the v1 of the patchset, I later
   decided against it, because it doesn't describe what libvirt is
   doing, but only a *possible* side effect on *some* of the ports
   connected to the bridge (in practice, it only happens to the physdev
   port).

 fdb='auto|managed'

   I like "fdb" as the name of the attribute, because I think it really
   gets at what libvirt is doing - it is taking over management of the
   bridge's fdb (Forwarding Database), which ends up providing better
   performance in several ways.

   The problem with this proposal is that the two values are kind of
   ambiguous - it's not clear which one is using the bridge module's
   built-in management (I had figured this would be "auto"), and which
   is telling libvirt to manage it ("managed"). (On the other hand,
   the first value means "ignore the issue, let the underlying system
   handle it" and the second means "libvirt should manage it", so
   maybe it *is* a reasonable choice.)

   Also, see the comment above about the perceived terseness and obscurity
   of "fdb".

 fdb='learningWithFlood|managed'

   This alternate name was suggested by Michael Tsirkin as a way to
   unambiguously indicate what was being done in the mode where libvirt
   isn't involved in the fdb management. There was some criticism on
   IRC that the name is *too* verbose, especially when contrasted with
   "fdb".

 fdb='learning|managed'

   A suggested shortening of "learningWithFlood".

 forwardingDatabase='blah'

   A way to get around criticism of "fdb". I think this is too verbose,
   but maybe I'm biased :-)

 [specify each minor item separately]

   In order to manage the fdb by itself, libvirt disables "learning"
   and "unicast_flood" for each tap device, enables "vlan_filtering"
   for the bridge itself, then adds an fdb entry for each tap to the
   bridge. There was one suggestion that, rather than trying to come
   up with a single option that says "do all of these things", we
   should instead make each of them separately configured. The problem
   with this is that it makes it too easy to screw up the
   configuration such that it causes sub-par performance, or simply
   doesn't work at all. Part of libvirt's job is making it difficult
   to screw up (or at least easier to succeed); for example, libvirt's
   virtual networks do a lot of things automatically - create a
   bridge, add iptables rules for filtering and NAT, run an instance
   of dnsmasq - over time we've offered the option of tweaking more
   and more of the details of this setup, but the initial aim was to
   provide something that worked with as few required (or even
   permitted) tweaks as possible.

   I guess what I'm getting at is that I think it would be a mistake
   to require turning on several different knobs (which individually
   make little/no sense) in order to get the bridge into this higher
   performing mode.
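
For anyone who wants to see what "all of these things" amounts to, the
steps listed above can be sketched as roughly equivalent iproute2
commands. This is only an approximation for illustration: the patches
do this inside libvirt rather than by running these commands, and the
function name, device names, and MAC address below are hypothetical.

```python
def managed_fdb_commands(bridge, taps):
    """Build iproute2 command lines approximating fdb='managed'.

    `taps` maps each tap device name to the guest MAC address behind it.
    The commands are returned as argv lists and are NOT executed here.
    """
    commands = []
    for tap in taps:
        # Stop the bridge from learning source MACs on this port ...
        commands.append(["bridge", "link", "set", "dev", tap, "learning", "off"])
        # ... and from flooding unknown unicast traffic to it.
        commands.append(["bridge", "link", "set", "dev", tap, "flood", "off"])
    # Enable vlan_filtering on the bridge itself.
    commands.append(["ip", "link", "set", "dev", bridge,
                     "type", "bridge", "vlan_filtering", "1"])
    for tap, mac in taps.items():
        # Add a static entry for the guest's MAC to the bridge's fdb.
        commands.append(["bridge", "fdb", "add", mac, "dev", tap,
                         "master", "static"])
    return commands


if __name__ == "__main__":
    for argv in managed_fdb_commands("br0", {"vnet0": "52:54:00:12:34:56"}):
        print(" ".join(argv))
```

Getting any one of these wrong (for example, turning off learning
without adding the fdb entry) is exactly the kind of mistake a single
option is meant to rule out.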

So - does anyone have an opinion on any of the options offered above,
or any ideas for alternatives?

In the meantime, note that while the default is currently
"learningWithFlood" (meaning that that name is never actually directly
used/required anywhere, but is just in the RNG and the enum
definition), the intent of the people who developed this functionality
in the kernel is that eventually it will work so well that libvirt
management of the fdb can silently become the default with no visible
change in behavior.
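
As a rough illustration of how the default behaves from a consumer's
point of view, here is a minimal Python sketch that reads the attribute
from a network definition. That the attribute sits on the <bridge>
element follows from the patch title ("new network bridge device
attribute fdb"); the rest of the sample XML, the function name, and the
constants are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Values proposed in this version of the series; the first is the default.
FDB_DEFAULT = "learningWithFlood"
FDB_VALUES = {"learningWithFlood", "managed"}


def bridge_fdb_mode(network_xml):
    """Return the fdb mode of a network definition, applying the default."""
    bridge = ET.fromstring(network_xml).find("bridge")
    if bridge is None:
        return FDB_DEFAULT
    # A missing attribute means the bridge module manages its own fdb.
    mode = bridge.get("fdb", FDB_DEFAULT)
    if mode not in FDB_VALUES:
        raise ValueError(f"unsupported fdb value: {mode!r}")
    return mode


if __name__ == "__main__":
    # Hypothetical network definition, trimmed to the relevant element.
    sample = "<network><name>example</name><bridge name='br0' fdb='managed'/></network>"
    print(bridge_fdb_mode(sample))
```

Because an absent attribute and an explicit "learningWithFlood" are
treated the same way, switching the default later would only change the
fallback value.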

NOTE: If you want to actually try out these patches, you'll need to
apply the following patch which I haven't yet pushed:

  https://www.redhat.com/archives/libvir-list/2014-November/msg00948.html

Also, while the description of V1 stated that patches 08 and 09 were
not intended to be pushed yet, due to a problem they caused when
restarting libvirtd after an update, that problem has been solved, so
I now intend to push patches 08 and 09 along with the rest.

Laine Stump (9):
  util: new functions for setting bridge and bridge port attributes
  util: functions to manage bridge fdb (forwarding database)
  conf: new network bridge device attribute fdb
  network: save bridge name in ActualNetDef when actualType==network too
  network: store network fdb setting in NetDef actual object
  network: setup bridge devices for fdb='managed'
  qemu: setup tap devices for fdb='managed'
  qemu: always use virDomainNetGetActualBridgeName to get interface's
    bridge
  lxc: always use virDomainNetGetActualBridgeName to get interface's
    bridge

 docs/formatnetwork.html.in                         |  42 ++-
 docs/schemas/network.rng                           |   9 +
 src/conf/domain_conf.c                             | 129 ++++---
 src/conf/domain_conf.h                             |   2 +
 src/conf/network_conf.c                            |  50 ++-
 src/conf/network_conf.h                            |  11 +
 src/libvirt_private.syms                           |  11 +
 src/lxc/lxc_driver.c                               |  32 +-
 src/lxc/lxc_process.c                              |  32 +-
 src/network/bridge_driver.c                        |  74 ++++
 src/qemu/qemu_command.c                            |  53 ++-
 src/qemu/qemu_hotplug.c                            |  60 +---
 src/util/virnetdevbridge.c                         | 382 ++++++++++++++++++++-
 src/util/virnetdevbridge.h                         |  44 ++-
 tests/networkxml2xmlin/host-bridge-no-flood.xml    |   6 +
 .../nat-network-explicit-flood.xml                 |  21 ++
 tests/networkxml2xmlout/host-bridge-no-flood.xml   |   6 +
 .../nat-network-explicit-flood.xml                 |  23 ++
 tests/networkxml2xmltest.c                         |   2 +
 19 files changed, 778 insertions(+), 211 deletions(-)
 create mode 100644 tests/networkxml2xmlin/host-bridge-no-flood.xml
 create mode 100644 tests/networkxml2xmlin/nat-network-explicit-flood.xml
 create mode 100644 tests/networkxml2xmlout/host-bridge-no-flood.xml
 create mode 100644 tests/networkxml2xmlout/nat-network-explicit-flood.xml

-- 
1.9.3



