Could you please help with questions about the net failover feature

Wed Jul 8 17:53:41 UTC 2020

On 7/8/20 10:02 AM, Ken Cox wrote:
> 
> On 7/8/20 1:30 AM, Stefan Assmann wrote:
>> On 2020-07-06 10:01, Laine Stump wrote:
>>> On 7/6/20 5:10 AM, Yalan Zhang wrote:
>>>> Hi Laine,
>>>>
>>>> For the earlier feature testing, I only tested the Linux bridge setup
>>>> described in 2), and it works.
>>>> Now I tried 1), using a macvtap in bridge mode connected to the PF, and it
>>>> does not work: the hostdev interface cannot get a DHCP IP address in the
>>>> guest.
>>>> On the host, /var/log/messages and dmesg both say:
>>>>
>>>> "Jul  6 04:54:45 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1
>>>> Spoofed packets detected
>>>> ......
>>>> Jul  6 04:56:17 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1
>>>> Spoofed packets detected
>>>> Jul  6 04:56:54 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1
>>>> Spoofed packets detected
>>>> "
>>>> (enp130s0f1 is the PF's interface name, and 0000:82:00.1 is the PF's PCI
>>>> address)
>>>> # rpm -q kernel
>>>> kernel-4.18.0-193.4.1.el8_2.x86_64
>>>>
>>>> Could you please help to confirm if this is a kernel bug?  Thank you
>>>> very much!
>>> Interesting. I'm not sure if this is expected behavior, or if it's improper
>>> behavior and it just hasn't been tested before (obviously based on my
>>> earlier recommendation, I think it *should* be able to work like this, and
>>> I *thought* I had tried it, but maybe I just imagined it :-/).
>>>
>>> I'm Cc'ing Stefan Assmann to see if he has an opinion on whether or not
>>> this should work. For his convenience, here is a summary of the config: The
>>> setup is that there is a bridge-mode macvtap interface on the PF, and one of
>>> the VFs has been given the same MAC address as the macvtap. The macvtap
>>> interface is connected to an emulated NIC in the guest, and the VF is
>>> assigned to the guest with VFIO.
> 
> IIUC, the problem is using the same MAC address for macvtap and for a
> VF.  This is what's causing the spoofed packets.

How is it a spoof? Because one interface detects a packet not 
originating from itself that has its own MAC address?

Is this check done by the kernel, or by the firmware on the card?

Both interfaces are specifically and consciously configured with the 
same MAC address (since that is a requirement for the simplified bonding 
provided by the virtio-net "failover" feature).

If DHCP isn't working, then I guess the guest is sending a DHCP discover 
packet out through the VF. How is this packet triggering anti-spoof 
protection, since it is the legitimate MAC address of that interface?

Or am I misinterpreting what's going on? (the log message just says a 
spoofed packet was detected, so it could be some other packet triggering 
the log, and this is all just a red herring...)
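
One rough way to check whether the log messages line up with the DHCP attempts is to total them per device. The following is a minimal Python sketch, not anything from the driver or libvirt: the regular expression is modelled only on the ixgbe lines quoted in this thread, the helper name `count_spoof_events` is made up for illustration, and it assumes the number before "Spoofed packets detected" is a packet count.

```python
import re
from collections import Counter

# Pattern modelled on the ixgbe lines quoted in this thread; other drivers or
# kernel versions may word the message differently (assumption).
SPOOF_RE = re.compile(
    r"kernel: (?P<driver>\S+) "
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?P<ifname>\S+): (?P<count>\d+) Spoofed packets detected"
)


def count_spoof_events(lines):
    """Sum the reported spoofed-packet counts per (PCI address, interface)."""
    totals = Counter()
    for line in lines:
        match = SPOOF_RE.search(line)
        if match:
            key = (match.group("pci"), match.group("ifname"))
            totals[key] += int(match.group("count"))
    return totals


# The three log lines quoted in this thread, each joined onto one line.
sample = [
    "Jul  6 04:54:45 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1 Spoofed packets detected",
    "Jul  6 04:56:17 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1 Spoofed packets detected",
    "Jul  6 04:56:54 dell-per730-xx kernel: ixgbe 0000:82:00.1 enp130s0f1: 1 Spoofed packets detected",
]
totals = count_spoof_events(sample)
print(totals)
```

Comparing the timestamps of matching lines against the times the guest sends DHCP discovers would show whether the two are actually related.
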

>>
>>> I'll try this today on my setup, which uses I350 cards (igb driver).
>>>
>>>
>>>>
>>>>
>>>> You have two choices for the backup virtio interface:
>>>>
>>>> 1) it can be a macvtap device connected to the PF of the same SRIOV
>>>> device.
>>>>
>>>> 2) it can be a standard tap device connected to a Linux host bridge
>>>> (created outside libvirt in the host system network config) that is
>>>> attached to the PF (or alternately one of the VFs that isn't being used
>>>> for VMs, or to another physical ethernet adapter on the host that is
>>>> connected to the same network).
>>>>
>>>>
>>>>
>>>>
>>>> -------
>>>> Best Regards,
>>>> Yalan Zhang
>>>> IRC: yalzhang
>>>>
>>>>
>>>> On Sun, Mar 22, 2020 at 6:50 AM Laine Stump <laine at redhat.com> wrote:
>>>>
>>>>      On 3/21/20 1:08 AM, Yalan Zhang wrote:
>>>>
>>>>       > In my understanding, the standby and primary hostdev interface
>>>>       > may be in different subnets.
>>>>
>>>>      There is only one hostdev device in the team pair (that will be the
>>>>      one with <teaming type='transient'.../> since it needs to be
>>>>      unplugged during migration). The other device must be a virtio device
>>>>      (the one with <teaming type='persistent'/>). And no, they cannot be on
>>>>      different subnets. They must both connect into the same ethernet
>>>>      "collision domain", such that the guest could assign the same IP
>>>>      address to either of them and be able to communicate on the network.
>>>>
>>>>      There is some explanation of the use case for this option, and some
>>>>      example config, here:
>>>>
>>>>      https://www.libvirt.org/formatdomain.html#elementsTeaming
>>>>
>>>>       > I'm not sure whether it is correct. Could you please help to
>>>>       > explain? Thank you in advance.
>>>>       >
>>>>       > For example, primary hostdev is connected to vf-pool with
>>>>       > <pf='eth0'/>, while the standby is connected to NAT network with
>>>>       > "forward dev='eth0'".
>>>>       > The standby interface will get ip as 192.168.122.x, but after NAT,
>>>>       > it will be in the same subnet of the vf.
>>>>       >
>>>>       > So after the VF is unplugged, the packet will still broadcast in
>>>>       > the same subnet, and the vm will get the packet as the standby
>>>>       > share the same mac. Right?
>>>>
>>>>      No, not right :-)
>>>>
>>>>      The VF of an SRIOV network adapter is connected directly to the
>>>>      physical network, and will have an IP address that is on that
>>>>      network. Tap devices plugged into the default network (or any other
>>>>      libvirt network based on a bridge device that is created/managed by
>>>>      libvirt) have no direct connection to the physical network, and are on
>>>>      a different subnet. The fact that traffic from the guest *seems* to be
>>>>      coming from an IP on the physical subnet is meaningless. The *guest*
>>>>      needs to be able to use both NICs using the same IP address, and
>>>>      anything plugged into the default network will need to have an IP
>>>>      address on a different subnet from the perspective of the guest.
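
To make the subnet argument concrete, here is a small Python sketch using the standard `ipaddress` module. The addresses are assumptions for illustration only: the thread mentions just "192.168.122.x", so the /24 prefix, the 10.0.0.0/24 physical LAN, and the helper name `usable_on_both` are all hypothetical.

```python
import ipaddress

# Assumptions for illustration only: the physical LAN is 10.0.0.0/24 and the
# libvirt default network is 192.168.122.0/24 (the thread only says
# "192.168.122.x"; the prefix length is a guess).
PHYSICAL_NET = ipaddress.ip_network("10.0.0.0/24")
DEFAULT_NET = ipaddress.ip_network("192.168.122.0/24")


def usable_on_both(guest_ip, net_a, net_b):
    """True if one guest IP address is valid on both attachment points."""
    addr = ipaddress.ip_address(guest_ip)
    return addr in net_a and addr in net_b


# VF plus a NAT-ed tap on the default network: no single address fits both.
vf_and_nat = usable_on_both("10.0.0.50", PHYSICAL_NET, DEFAULT_NET)
# VF plus a macvtap or bridged tap on the same physical LAN: one address fits.
vf_and_bridge = usable_on_both("10.0.0.50", PHYSICAL_NET, PHYSICAL_NET)
print(vf_and_nat, vf_and_bridge)
```

The guest sees only the address it configures itself, so NAT on the host does not change the outcome of the first check.
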
>>>>
>>>>      You have two choices for the backup virtio interface:
>>>>
>>>>      1) it can be a macvtap device connected to the PF of the same SRIOV
>>>>      device.
>>>>
>>>>      2) it can be a standard tap device connected to a Linux host bridge
>>>>      (created outside libvirt in the host system network config) that is
>>>>      attached to the PF (or alternately one of the VFs that isn't being
>>>>      used for VMs, or to another physical ethernet adapter on the host
>>>>      that is connected to the same network).
>>>>
>>>>
>>>>      It is simplest to have the same name refer to the connection on the
>>>>      source and destination hosts of a migration. That can be handled by
>>>>      creating a libvirt network to refer to the bridge device created
>>>>      outside libvirt (or to the PF directly if you're going to use
>>>>      macvtap).
>>>>
>>>>      For example, if you're going to use macvtap, and the PF's name on the
>>>>      host is ens4f0, you'd just create this network:
>>>>
>>>>          <network>
>>>>            <name>persistent-net</name>
>>>>            <forward mode='bridge'>
>>>>              <interface dev='ens4f0'/>
>>>>            </forward>
>>>>          </network>
>>>>
>>>>      any guest interface with this:
>>>>
>>>>             <interface type='network'>
>>>>               <source network='persistent-net'/>
>>>>               <mac address='00:11:22:33:44:55'/>
>>>>               <model type='virtio'/>
>>>>               <teaming type='persistent'/>
>>>>               <alias name='ua-backup0'/>
>>>>             </interface>
>>>>
>>>>      will get a macvtap device that's connected to ens4f0 in bridge mode.
>>>>
>>>>      Or, if your host has a bridge device called br0 that is directly
>>>>      attached to the physical network (in whatever manner, it doesn't
>>>>      matter), you can define the network this way:
>>>>
>>>>          <network>
>>>>            <name>persistent-net</name>
>>>>            <bridge name='br0'/>
>>>>            <forward mode='bridge'/>
>>>>          </network>
>>>>
>>>>      The XML for the guest interface would be the same.
>>>>
>>>>      Then for the vfio (transient) interface, you could also define a
>>>>      network:
>>>>
>>>>           <network>
>>>>             <name>transient-net</name>
>>>>             <forward mode='hostdev'>
>>>>               <pf dev='ens4f0'/>
>>>>             </forward>
>>>>           </network>
>>>>
>>>>      and instead of using <interface type='hostdev'> in the guest config,
>>>>      you would use this:
>>>>
>>>>
>>>>            <interface type='network'>
>>>>              <source network='transient-net'/>
>>>>              <model type='virtio'/>    [1]
>>>>              <mac address='00:11:22:33:44:55'/>
>>>>              <teaming type='transient' persistent='ua-backup0'/>
>>>>            </interface>
>>>>
>>>>      Even if the device names change on the other host (the destination of
>>>>      the migration), as long as the other host has networks named
>>>>      "persistent-net" and "transient-net" that are of similar types
>>>>      (macvtap or bridged for persistent-net, and hostdev for
>>>>      transient-net) then libvirt will be able to migrate the guest from one
>>>>      host to the other with no mangling of the XML required.
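
Since the pairing depends on the two guest interfaces sharing a MAC address and on the transient one naming the persistent one's alias, a quick consistency check can catch typos before starting the guest. This is a minimal Python sketch under stated assumptions: the two `<interface>` elements are adapted from the examples in this thread and wrapped in a `<devices>` element so they parse as one document, and `check_teaming_pairs` is a hypothetical helper, not part of libvirt.

```python
import xml.etree.ElementTree as ET

# Adapted from the two <interface> examples in this thread, wrapped in
# <devices> so the fragment parses as a single XML document.
DEVICES_XML = """
<devices>
  <interface type='network'>
    <source network='persistent-net'/>
    <mac address='00:11:22:33:44:55'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-backup0'/>
  </interface>
  <interface type='network'>
    <source network='transient-net'/>
    <model type='virtio'/>
    <mac address='00:11:22:33:44:55'/>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>
</devices>
"""


def check_teaming_pairs(devices_xml):
    """Return a list of problems found in the teaming config (empty if none)."""
    problems = []
    persistent = {}  # alias name -> lower-cased MAC of the persistent device
    transient = []   # (alias the transient device refers to, its MAC)
    root = ET.fromstring(devices_xml)
    for iface in root.iter("interface"):
        teaming = iface.find("teaming")
        if teaming is None:
            continue  # interface is not part of a teaming pair
        mac_el = iface.find("mac")
        mac = mac_el.get("address", "").lower() if mac_el is not None else None
        if teaming.get("type") == "persistent":
            alias_el = iface.find("alias")
            if alias_el is None:
                problems.append("persistent interface has no <alias>")
            else:
                persistent[alias_el.get("name")] = mac
        elif teaming.get("type") == "transient":
            transient.append((teaming.get("persistent"), mac))
    for alias, mac in transient:
        if alias not in persistent:
            problems.append(f"transient interface refers to unknown alias {alias!r}")
        elif mac is None or mac != persistent[alias]:
            problems.append(f"MAC mismatch for teaming pair {alias!r}")
    return problems


problems = check_teaming_pairs(DEVICES_XML)
print(problems or "teaming pair looks consistent")
```

The check only looks at the guest XML; it says nothing about whether the host networks behind "persistent-net" and "transient-net" reach the same physical network.
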
>>>>
>>>>