[rhos-list] OpenVswitch problem...

Flavio Leitner fbl at redhat.com
Mon Oct 6 22:58:49 UTC 2014


On Fri, Oct 03, 2014 at 10:01:36AM -0400, Perry Myers wrote:
> On 10/02/2014 09:29 AM, Krist van Besien wrote:
> > Hello,
> > 
> > We are using Open vSwitch as part of an OpenStack Havana installation on
> > RHEL. We are having some problems getting some tenant networks talking
> > to an external network.
> 
> Is this RHEL OSP 4 on RHEL 6.5? Can you give us the name-version-release
> of the Neutron and Openvswitch packages?
> 
> > After some very deep digging we managed to nail it down to a
> > particular combination of bridge and external interface. On this bridge
> > I am seeing something I cannot explain.
> 
> I'm adding some folks to the thread who have expertise in this area
> (Maru, Flavio)
> 
> > The situation:
> > OVS 2.1.3
> > 
> > We have a bridge br700
> > 
> > # ovs-ofctl show br700
> > OFPT_FEATURES_REPLY (xid=0x2): dpid:0000c81f66db4493
> > n_tables:254, n_buffers:256
> > capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
> > actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC
> > SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST
> > ENQUEUE
> >  1(bond0.700): addr:c8:1f:66:db:44:93
> >      config:     0
> >      state:      0
> >      current:    10GB-FD
> >      speed: 10000 Mbps now, 0 Mbps max
> >  10(phy-br700): addr:0e:37:ca:a0:ff:1d
> >      config:     0
> >      state:      0
> >      current:    10GB-FD COPPER
> >      speed: 10000 Mbps now, 0 Mbps max
> >  11(qg-8a9cdc1f-36): addr:c8:1f:66:db:44:93
> >      config:     PORT_DOWN
> >      state:      LINK_DOWN
> >      speed: 0 Mbps now, 0 Mbps max
> >  12(qg-48834c87-cc): addr:c8:1f:66:db:44:93
> >      config:     PORT_DOWN
> >      state:      LINK_DOWN
> >      speed: 0 Mbps now, 0 Mbps max
> >  LOCAL(br700): addr:c8:1f:66:db:44:93
> >      config:     0
> >      state:      0
> >      speed: 0 Mbps now, 0 Mbps max
> > OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
> > 
> > (Note: ports 11 and 12 are shown as down, but seem to work just fine.
> > Is this a problem, or a red herring?)
> > 
> > Port 1 is our link to the network. It tags everything with VLAN 700.
> > 
> > Ports 11 and 12 are each a veth, connected into a Linux network
> > namespace. Those network namespaces represent OpenStack router
> > instances.
> > 
> > To test I enter one of those namespaces:
> > 
> > sudo ip netns exec qrouter-e29f5fd6-a223-4e9b-8615-5809161f821e bash
> > # ip a
> > 283: qg-8a9cdc1f-36: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
> >     link/ether fa:16:3e:e3:d8:be brd ff:ff:ff:ff:ff:ff
> >     inet 10.255.10.2/16 brd 10.255.255.255 scope global qg-8a9cdc1f-36
> >     inet6 fe80::f816:3eff:fee3:d8be/64 scope link
> >        valid_lft forever preferred_lft forever
> > 253: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
> >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >     inet 127.0.0.1/8 scope host lo
> >     inet6 ::1/128 scope host
> >        valid_lft forever preferred_lft forever
> > 
> > When I now ping the IP address of the other router instance, it
> > works, but not straight away:
> > 
> > # ping 10.255.10.1
> > PING 10.255.10.1 (10.255.10.1) 56(84) bytes of data.
> > From 10.255.10.2 icmp_seq=2 Destination Host Unreachable
> > ...
> > From 10.255.10.2 icmp_seq=20 Destination Host Unreachable
> > 64 bytes from 10.255.10.1: icmp_seq=21 ttl=64 time=1.51 ms
> > 64 bytes from 10.255.10.1: icmp_seq=22 ttl=64 time=0.459 ms
> > ...
> > 
> > Querying the bridge mac table gives me:
> > 
> > # ovs-appctl fdb/show br700
> >  port  VLAN  MAC                Age
> >    12     0  fa:16:3e:43:f0:54   32
> >    11     0  fa:16:3e:e3:d8:be   32
> > 
> > So the bridge has eventually learned where to send each packet...
> > 
> > Now I test a ping to a host on the external network...
> > 
> > # ping 10.255.1.108
> > PING 10.255.1.108 (10.255.1.108) 56(84) bytes of data.
> > From 10.255.10.2 icmp_seq=2 Destination Host Unreachable
> > From 10.255.10.2 icmp_seq=3 Destination Host Unreachable
> > ...
> > From 10.255.10.2 icmp_seq=31 Destination Host Unreachable
> > From 10.255.10.2 icmp_seq=32 Destination Host Unreachable
> > ...
> > This never succeeds...
> > 
> > When we run tcpdump at several spots in our network we see that an
> > ARP request is sent out, is answered by 10.255.1.108, and the reply is
> > received back. However, it never makes it into our router instance.
> > 
> > When looking at the bridge mac table something odd appears:
> > 
> > # ovs-appctl fdb/show br700
> >  port  VLAN  MAC                Age
> >    12     0  fa:16:3e:43:f0:54  240
> >     1     0  fa:16:3e:e3:d8:be  116
> >     1     0  52:54:00:3d:ee:18  116
> > 
> > And I think that this is the source of our problem.
> > 
> > Somehow the bridge thinks that the MAC address fa:16:3e:e3:d8:be lives
> > behind port 1, whereas it is of course behind port 11. So the ARP reply
> > that it receives on port 1 is never forwarded.
> > So somehow it mislearned where this MAC is. But why? And how do I fix this?
> > 
> > Is there a way to solve this?

Sorry for the delay in replying. I was out on vacation.

See these links:
https://bugzilla.redhat.com/show_bug.cgi?id=1078828#c16
https://bugzilla.redhat.com/show_bug.cgi?id=1078828#c18
https://bugzilla.redhat.com/show_bug.cgi?id=1090562#c6

fbl
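
The symptom in the quoted report (a MAC address that belongs to a local
qg- port showing up behind the uplink port in the bridge MAC table) can be
checked mechanically. What follows is only a minimal sketch in Python,
assuming the `ovs-appctl fdb/show` output format exactly as pasted in the
report. The helper names `parse_fdb` and `find_mislearned` are hypothetical,
and the expected MAC-to-port mapping is supplied by hand from the first
fdb/show listing in the report; nothing here is taken from the linked
Bugzilla entries.

```python
def parse_fdb(text):
    """Parse `ovs-appctl fdb/show` text into a {mac: port} dict.

    Assumes the four-column layout shown in the report:
    port, VLAN, MAC, Age. Rows whose port is not numeric
    (the header, or a non-numeric port name) are skipped.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, _vlan, mac, _age = fields
        table[mac.lower()] = int(port)
    return table


def find_mislearned(table, expected):
    """Return {mac: (expected_port, learned_port)} for every MAC that
    the bridge has learned on a port other than the expected one."""
    return {
        mac: (port, table[mac])
        for mac, port in expected.items()
        if mac in table and table[mac] != port
    }


# Second fdb/show listing from the report (taken after the failed ping).
FDB_OUTPUT = """\
 port  VLAN  MAC                Age
   12     0  fa:16:3e:43:f0:54  240
    1     0  fa:16:3e:e3:d8:be  116
    1     0  52:54:00:3d:ee:18  116
"""

# Hand-written expectation: each router MAC behind its own qg- port,
# as seen in the first fdb/show listing in the report.
EXPECTED = {
    "fa:16:3e:e3:d8:be": 11,
    "fa:16:3e:43:f0:54": 12,
}

if __name__ == "__main__":
    bad = find_mislearned(parse_fdb(FDB_OUTPUT), EXPECTED)
    for mac, (want, got) in bad.items():
        print(f"{mac}: expected port {want}, learned on port {got}")
```

On a live system the text would come from running `ovs-appctl fdb/show
br700` instead of the embedded sample. The check only reports the mismatch;
it does not explain why the bridge learned the address on port 1.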



