[libvirt] [PATCHv2 0/9] Let libvirt manage a bridge's FDB

John Ferlan jferlan at redhat.com
Tue Dec 2 20:57:00 UTC 2014



On 12/02/2014 12:08 PM, Laine Stump wrote:
> (Everyone - see the request for opinions/ideas towards the bottom)

Time wise - I'll take the detour through .0 now in order to provide some
feedback since I've now commented on .3 where the names are defined..

> 
> The idea behind these patches is the following:
> 
> 1) most virtual machines only have a single MAC address behind each
> interface, and that MAC address is known by libvirt.
> 
> 2) If we (i.e. libvirt) manually add an entry to the bridge's
> forwarding database (fdb) for the MAC address associated with a port
> on the bridge, we can turn off learning and unicast_flooding for that
> port.
> 
> 3) kernels starting with 3.15 (and actually working correctly starting
> in kernel 3.17) will notice that all of a bridge's ports have flood
> and learning turned off, and in that case will turn off promiscuous
> mode on all ports. If all but one of the ports have flood/learning
> turned off, then promiscuous will be turned off on that port (and left
> on for all the other ports)
> 
> 4) When (4) can be done, there is a measurable performance
> advantage. It can also *kind of* help security, as it will prevent a
> guest from doing anything useful if it changes its MAC address (but
> won't prevent the guest from *sending* packets with a spoofed MAC
> address).
> 
> NB: These only work with a fixed MAC address, and no vlan tags set in
> the guest. Support for both of those will be coming.
> 
> HERE IS THE REQUEST FOR OPINIONS/IDEAS:
> 
> This V2 of the patchset addresses several issues brought up by jferlan
> on the original series, and changes the name of the attribute from:
> 
>    promiscLinks='yes|no'
> 
> to
> 
>    fdb='learningWithFlood|managed'
> 
> I'm somewhat more happy with this new naming than the previous but
> still looking for better ideas. It is closer to describing what the
> new code really does, but "learningWithFlood" seems a bit long and
> awkward, while I have been told that "fdb" is too short and
> unrecognizeable (I will point out that 1) "fdb" is the same name used
> by iproute2's "bridge" command, and 2) another bridge option, "stp" is
> also a three letter acronym that will only be recognized by those
> familiar with configuring an L2 bridge device or watching NASCAR on
> Saturday afternoons (or whenever it's on - not a fan myself :-))

A google search found the following with an explanation of fdb:

http://blog.michaelfmcnamara.com/2008/02/what-are-the-arp-and-fdb-tables/

Since "fdb" is just another TLA used to describe a feature, perhaps
keeping it as the name is just as good as 'vlan' or 'vepa' FLA's.
Someone that knows what it does may appreciate the name matches rather
than having to figure out what some other named attribute does or (gasp)
actually read the friendly manual...

I suppose the one issue with managed is what happens in 1 year from now
when some new option is desired (although with L2 we are somewhat
limited).

> 
> Here is a full list of every idea that either I or someone else has
> come up with since I started thinking about this:
> 
>  promiscLinks='yes|no'
> 
>    After initially going with this for the v1 of the patchset, I later
>    decided against it, because it doesn't describe what libvirt is
>    doing, but only a *possible* side effect on *some* of the ports
>    connected to the bridge (in practice, it only happens to the physdev
>    port).
> 
>  fdb='auto|managed'
> 
>    I like "fdb" as the name of the attribute, because I think it really
>    gets at what libvirt is doing - it is taking over management of the
>    bridge's fdb (Forwarding Database), which ends up providing better
>    performance in several ways.
> 
>    The problem with this proposal is that the two values are kind of
>    ambiguous - it's not clear which one is using the bridge module's
>    built-in management (I had figured this would be "auto"), and which
>    is telling libvirt to manage it ("managed"). (on the other hand,
>    the first option is "ignore the issue, let the underlying system
>    handle it", vs. "libvirt should manage it", so maybe it *is* a
>    reasonable choice).
> 
>    Also, see the comment above about the perceived terseness and obscurity
>    of "fdb".
> 
>  fdb='learningWithFlood|managed'
> 
>    This alternate name was suggested by Michael Tsirkin as a way to
>    unambiguously indicate what was being done in the mode where libvirt
>    isn't involved in the fdb management. There was some criticism on
>    IRC that the name is *too* verbose, especially when contrasted with
>    "fdb".
> 
>  fdb='learning|managed'
> 
>    A suggested shortening of "learningWithFlood".
> 

For 'learningWithFlood' and 'learning' - I'm not quite sure it
completely conveys that we're not doing anything other than allowing
flooding to occur.  We're not learning which MAC's to not flood, so
we're allowing the flood.

Going *just* by the name, I'd partially assume that we're learning which
MAC's we don't want to flood all the ports. I see learningWithFlood as
an option to describe the feature you're trying to add; whereas, managed
could be considered too broad.

Thus, the option seems to be more "fdb='none|learningWithFlood', where
it's described that 'learningWithFlood' will do all the things necessary
to learn MAC's so as to avoid flooding each of the ports. It can also be
described that "vlan_filtering" will be enabled because ???...
Describing the promiscuous setting side effect is not easy either and it
seems while in general it could be seen as a "feature" to disable
promiscuous, there is some unintended side effect on some networks usage
of vlan tagging & multicast.

The problem with using "fdb='foo'" is that the future could desire not
only foo being set, but bar also - so how does that work
"fdb='foo'|'bar'|'foobar'" ?? (e.g., just foo, just bar, or both).
Unless you want to pull apart some list ("fdb='foo,bar'") - too much
work IMO.

Since using the fdb is an implementation detail of whatever attribute
name is chosen now, perhaps this feature could be described using
"filterMAC='yes|no'". This leaves it open to add other features in the
future that could use fdb to manage something, but it doesn't describe
the vlan and promiscuous side effect all that well.

Side question - does "vlan_filtering" have any use outside of when this
flooding option is chosen?  If so, then having a "filterVLAN='yes|no'"
makes sense. As it relates to the current feature, if filterVLAN was not
defined and filterMAC='yes', then the libvirt will "quietly" set
filterVLAN='yes'. Documenting that setting "filterVLAN='no'" and
"filterMAC='yes' causes isses - allows the user to decide if they want
those issues. Yes, it gets messy.

This still leaves the promiscuous issue, which I'm not sure how you
could solve since it seems as though by setting the fdb to learn MAC's
could result in promiscuous being disabled which interferes with some
other features' capability.

John

>  forwardingDatabase='blah'
> 
>    A way to get around criticism of "fdb". I think this is too verbose,
>    but maybe I'm biased :-)
> 
>  [specify each minor item that separately]
> 
>    In order to manage the fdb by itself, libvirt disables "learning"
>    and "unicast_flood" for each tap device, enables "vlan_filtering"
>    for the bridge itself, then adds an fdb entry for each tap to the
>    bridge. There was one suggestion that, rather than trying to come
>    up with a single option that says "do all of these things", we
>    should instead make each of them separately configured. The problem
>    with this is that it makes it too easy to screw up the
>    configuration such that it causes sub-par performance, or simply
>    doesn't work at all. Part of libvirt's job is making it difficult
>    to screw up (or at least easier to succeed); for example, libvirt's
>    virtual networks do a lot of things automatically - create a
>    bridge, add iptables rules for filtering and NAT, run an instance
>    of dnsmasq - over time we've offered the option of tweaking more
>    and more of the details of this setup, but the initial aim was to
>    provide something that worked with as few required (or even
>    permitted) tweaks as possible.
> 
>    I guess what I'm getting at is that I think it would be a mistake
>    to require turning on several different knobs (which individually
>    make little/no sense) in order to get the bridge into this higher
>    performing mode.
> 
> So - does anyone have an opinion of any of the options offered above,
> or any ideas for alternates?
> 
> In the meantime, note that while the default is currently
> "learningWithFlood" (meaning that that name is never actually directly
> used/required anywhere, but is just in the RNG and the enum
> definition), the intent of the people who developed this functionality
> in the kernel is that eventually it will work so well that libvirt
> management of the fdb can silently become the default with no visible
> change in behavior.
> 
> NOTE: If you want to actually try out these patches, you'll need to
> apply the following patch which I haven't yet pushed:
> 
>   https://www.redhat.com/archives/libvir-list/2014-November/msg00948.html
> 
> Also, while the description of V1 stated that patches 08 and 09 were
> not intended to be pushed yet, due to a problem they caused when
> restarting libvirtd after an update, that problem has been solved, so
> I now intend to push patches 08 and 09 along with the rest.
> 
> Laine Stump (9):
>   util: new functions for setting bridge and bridge port attributes
>   util: functions to manage bridge fdb (forwarding database)
>   conf: new network bridge device attribute fdb
>   network: save bridge name in ActualNetDef when actualType==network too
>   network: store network fdb setting in NetDef actual object
>   network: setup bridge devices for fdb='managed'
>   qemu: setup tap devices for fdb='managed'
>   qemu: always use virDomainNetGetActualBridgeName to get interface's
>     bridge
>   lxc: always use virDomainNetGetActualBridgeName to get interface's
>     bridge
> 
>  docs/formatnetwork.html.in                         |  42 ++-
>  docs/schemas/network.rng                           |   9 +
>  src/conf/domain_conf.c                             | 129 ++++---
>  src/conf/domain_conf.h                             |   2 +
>  src/conf/network_conf.c                            |  50 ++-
>  src/conf/network_conf.h                            |  11 +
>  src/libvirt_private.syms                           |  11 +
>  src/lxc/lxc_driver.c                               |  32 +-
>  src/lxc/lxc_process.c                              |  32 +-
>  src/network/bridge_driver.c                        |  74 ++++
>  src/qemu/qemu_command.c                            |  53 ++-
>  src/qemu/qemu_hotplug.c                            |  60 +---
>  src/util/virnetdevbridge.c                         | 382 ++++++++++++++++++++-
>  src/util/virnetdevbridge.h                         |  44 ++-
>  tests/networkxml2xmlin/host-bridge-no-flood.xml    |   6 +
>  .../nat-network-explicit-flood.xml                 |  21 ++
>  tests/networkxml2xmlout/host-bridge-no-flood.xml   |   6 +
>  .../nat-network-explicit-flood.xml                 |  23 ++
>  tests/networkxml2xmltest.c                         |   2 +
>  19 files changed, 778 insertions(+), 211 deletions(-)
>  create mode 100644 tests/networkxml2xmlin/host-bridge-no-flood.xml
>  create mode 100644 tests/networkxml2xmlin/nat-network-explicit-flood.xml
>  create mode 100644 tests/networkxml2xmlout/host-bridge-no-flood.xml
>  create mode 100644 tests/networkxml2xmlout/nat-network-explicit-flood.xml
> 




More information about the libvir-list mailing list