[libvirt] [PATCHv2 0/9] Let libvirt manage a bridge's FDB

Daniel P. Berrange berrange at redhat.com
Thu Dec 4 09:56:05 UTC 2014


On Thu, Dec 04, 2014 at 03:56:53AM -0500, Laine Stump wrote:
> On 12/03/2014 12:02 PM, Daniel P. Berrange wrote:
> > On Tue, Dec 02, 2014 at 12:08:30PM -0500, Laine Stump wrote:
> >> (Everyone - see the request for opinions/ideas towards the bottom)
> >>
> >> The idea behind these patches is the following:
> >>
> >> 1) most virtual machines only have a single MAC address behind each
> >> interface, and that MAC address is known by libvirt.
> >>
> >> 2) If we (i.e. libvirt) manually add an entry to the bridge's
> >> forwarding database (fdb) for the MAC address associated with a port
> >> on the bridge, we can turn off learning and unicast_flooding for that
> >> port.
> >>
> >> 3) kernels starting with 3.15 (and actually working correctly starting
> >> in kernel 3.17) will notice that all of a bridge's ports have flood
> >> and learning turned off, and in that case will turn off promiscuous
> >> mode on all ports. If all but one of the ports have flood/learning
> >> turned off, then promiscuous will be turned off on that port (and left
> >> on for all the other ports)
> >>
> >> 4) When (4) can be done, there is a measurable performance
> >> advantage. It can also *kind of* help security, as it will prevent a
> >> guest from doing anything useful if it changes its MAC address (but
> >> won't prevent the guest from *sending* packets with a spoofed MAC
> >> address).
> >>
> >> NB: These only work with a fixed MAC address, and no vlan tags set in
> >> the guest. Support for both of those will be coming.
> > So just to be clear on the functional differences between the
> > impls
> >
> > With the current impl
> >
> >  1. The bridge has a table mapping MAC <-> TAP devices
> >     which is initially empty
> >  2. Whenever a TAP device sends a packet, it causes an
> >     entry to be added to the MAC <-> TAP table
> >  3. When the bridge needs to forward a packet with a MAC
> >      - If MAC is unknown, it is sent to all TAP devices
> >      - Else it is send to the specific TAP device
> >  4. The physical device is always promiscuous to see all
> >     traffic on the LAN
> >
> > An result of point 2, is that if the guest changes its
> > MAC address on the fly, it merely needs to send a packet
> > and the bridge will learn its new MAC address. That said
> > our firewall rules can potentially block this dynamic
> > change.
> >
> >
> > With the new impl
> >
> >  1. The bridge has a table mapping MAC <-> TAP devices
> >     which is initially empty
> >  2. Libvirt tells the bridge what MAC address is associated
> >     with each TAP device added
> >  3. The bridge always forwards the packets to the correct
> >     TAP device
> >  4. The physical device is never promiscuous, it is
> >     programmed with MACs of all TAP devices
> >
> > In both cases the guest can send packets with spoofed
> > MAC addresses. In the old code these packets would make
> > it onto the LAN and it can receive replies back. In the
> > new code these packets would make it onto the LAN but
> > it will never receive replies back.
> >
> >
> > At a high level the conceptual difference is whether the table
> > of MAC addresses & TAP devices is statically defined by libvirt
> > or dynamically defined on the fly.
> >
> >> HERE IS THE REQUEST FOR OPINIONS/IDEAS:
> >>
> >> This V2 of the patchset addresses several issues brought up by jferlan
> >> on the original series, and changes the name of the attribute from:
> >>
> >>    promiscLinks='yes|no'
> >>
> >> to
> >>
> >>    fdb='learningWithFlood|managed'
> >>
> >> I'm somewhat more happy with this new naming than the previous but
> >> still looking for better ideas. It is closer to describing what the
> >> new code really does, but "learningWithFlood" seems a bit long and
> >> awkward, while I have been told that "fdb" is too short and
> >> unrecognizeable (I will point out that 1) "fdb" is the same name used
> >> by iproute2's "bridge" command, and 2) another bridge option, "stp" is
> >> also a three letter acronym that will only be recognized by those
> >> familiar with configuring an L2 bridge device or watching NASCAR on
> >> Saturday afternoons (or whenever it's on - not a fan myself :-))
> > One thing to remember is that libvirt is supposed to be providing a
> > higher level configuration language that is independant of specific
> > implementations.  'fdb' is terminology specific to the Linux
> > bridge implementation, so I don't think that's appropriate to use
> > in the XML configuration.
> 
> Good point. I'd been seeing this as a generic term, but I guess it isn't.
> 
> > There is a user visible change in behaviour because this new solution,
> > as implemented in this patch series, will no longer allow a guest to
> > change its MAC address on the fly.
> >
> > Of course if QEMU can provide a notification to libvirt when the guest
> > changes its MAC address, then libvirt can update the MAC table and so
> > re-gain functional equivalance with the old solution. I think we would
> > want to be able to turn that on or off though.
> 
> For macvtap interfaces, we can already do that using
> trustGuestRxFilters='yes' (a name I don't really like in a location I
> don't really like (attribute of toplevel <interface> element), but
> couldn't come up with anything better and didn't get any other
> suggestions at the time). This has been in libvirt since last month's
> release. My thought was that if trustGuestRxFilters is yes and the
> interface is on a bridge, we can then honor any changes to mac address.
> 
> (This is where my dislike of my own choice for name of that option comes
> in - I can now see a need to allow the guest to change its vlan table or
> multicast table and honor those changes, while not allowing it to change
> its MAC address, or vice versa; it probably would have been better to
> add a new element called <rxFilter> or something, and have a separate
> attribute for trusting each piece. It has now been in one release,
> though, so I guess we're stuck with it.)
> 
> >
> > This all leads me to suggest that we should use the following syntax
> > that is indepedant of the particular implementation choice, and instead
> > represents the logical behaviour of the feature.
> >
> >   macTable=static|dynamic
> >
> > The mode of 'static' means that the MAC address table will always
> > match MACs recorded in libvirt guest XML. This is the new mode
> > that your patches implement
> >
> > The mode of 'dynamic' means that the MAC address table will be
> > able to dynamically update itself as the guest changes MAC addresses.
> > In the near term, this will use the traditional bridge learning
> > mode. In the long term, if we can get MAC change notifications from
> > QEMU, we can switch this over to use the new bridge features.
> 
> I do think that eventually libvirt should make self-management of the
> mac table the default, but even then I think we need to retain the
> ability to disable it and let the kernel do as it may, just in case
> there are compatibility problems if for no other reason.
> 
> Also, I think that enabling/disabling guest-initiated MAC address
> changes should be possible at the per-interface level, not (just)
> per-network. This can be done with the above-mentioned
> trustGuestRxFilters attribute (which can be set in a single interface,
> or for a network or network portgroup).
> 
> So I like naming the attribute "macTable", as that is more generic, but
> I'm not sure about the values "static" and "dynamic", because "dynamic"
> could describe two different things (dynamically changed by the kernel
> via learning and flood, or dynamically changed via libvirt being an
> omniscient hypervisor god that knows what to plug into the table), and
> "static" is something that can already be described via setting
> trustGuestRxFilters='no' combined with a "let libvirt control the mac
> table" option.
> 
> Maybe the problem is that you're providing a good name based entirely on
> the differences in behavior the guests can see, but what I'm looking for
> is a way to control how that behavior is provided, so more of a backend
> thing. (Note that if the "backend" is the bridge module, "static" isn't
> possible now and never will be, while if the backend is explicit
> management by libvirt, then "dynamic" isn't currently available, but
> will be in the future).

So perhaps more along the lines of  macTableManager="kernel|libvirt"

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list