yahoo-eng-team team mailing list archive

[Bug 1421105] [NEW] L2 population sometimes failed with multiple neutron-server

 

Public bug reported:

In my environment there are two neutron-server instances; 'mechanism_drivers' is openvswitch and l2 population is enabled.
When I delete the last VM of network-A on compute node-A, a KeyError appears in the openvswitch-agent log on compute node-B. It is raised by 'del_fdb_flow':

    def del_fdb_flow(self, br, port_info, remote_ip, lvm, ofport):
        if port_info == q_const.FLOODING_ENTRY:
            lvm.tun_ofports.remove(ofport)
            if len(lvm.tun_ofports) > 0:
                ofports = _ofport_set_to_str(lvm.tun_ofports)
                br.mod_flow(table=constants.FLOOD_TO_TUN,
                            dl_vlan=lvm.vlan,
                            actions="strip_vlan,set_tunnel:%s,output:%s" %
                            (lvm.segmentation_id, ofports))

The cause is that the openvswitch-agent receives the 'fdb_remove' RPC
request twice. I think this happens as follows.

There are two neutron-servers, neutron-serverA and neutron-serverB, and one compute node-A:
1. Nova deletes the VM on compute node-A. It first deletes the TAP device; the ovs agent then detects that the port is gone and sends the RPC request 'update_device_down' to neutron-serverA. When neutron-serverA receives this request, l2 population sends the first 'fdb_remove'.
2. After nova deletes the TAP device, it sends the REST API request 'delete_port' to neutron-serverB, and l2 population sends a second 'fdb_remove' RPC request.
When the ovs agent receives the second 'fdb_remove', it calls del_fdb_flow, and 'lvm.tun_ofports.remove(ofport)' raises KeyError because the ofport was already removed by the first request.
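
The double delivery described above can be replayed outside the agent. This is a minimal sketch, assuming lvm.tun_ofports is a Python set (which is consistent with the KeyError and with the helper name _ofport_set_to_str). The LocalVLANMapping class below and the functions fdb_remove_strict, fdb_remove_tolerant and replay are hypothetical stand-ins for illustration, not neutron code, and the tolerant variant is only one possible way to make the handler idempotent.

```python
class LocalVLANMapping:
    """Hypothetical stand-in for the agent's per-network state."""

    def __init__(self, vlan, segmentation_id, tun_ofports):
        self.vlan = vlan
        self.segmentation_id = segmentation_id
        # Assumption: the real attribute is a set of tunnel ofports.
        self.tun_ofports = set(tun_ofports)


def fdb_remove_strict(lvm, ofport):
    """Same call as in del_fdb_flow: set.remove raises KeyError if absent."""
    lvm.tun_ofports.remove(ofport)


def fdb_remove_tolerant(lvm, ofport):
    """Return True if the ofport was removed, False if it was already gone."""
    if ofport not in lvm.tun_ofports:
        return False
    lvm.tun_ofports.remove(ofport)
    return True


def replay(handler, lvm, ofport, times=2):
    """Deliver the same fdb_remove `times` times and record each outcome."""
    outcomes = []
    for _ in range(times):
        try:
            outcomes.append(handler(lvm, ofport))
        except KeyError:
            outcomes.append("KeyError")
    return outcomes


# One fdb_remove per neutron-server, both for the same ofport.
strict = replay(fdb_remove_strict, LocalVLANMapping(1, 100, {"2", "3"}), "2")
tolerant = replay(fdb_remove_tolerant, LocalVLANMapping(1, 100, {"2", "3"}), "2")
print(strict)
print(tolerant)
```

With the strict handler the second delivery fails, while the tolerant handler reports that the entry was already gone, so the caller can skip the flow update for a duplicate request.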

** Affects: neutron
     Importance: Undecided
     Assignee: shihanzhang (shihanzhang)
         Status: New

** Changed in: neutron
     Assignee: (unassigned) => shihanzhang (shihanzhang)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1421105

Title:
  L2 population sometimes failed with multiple neutron-server

Status in OpenStack Neutron (virtual network service):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421105/+subscriptions

