
yahoo-eng-team team mailing list archive

[Bug 1526974] [NEW] KeyError prevents openvswitch agent from starting

 

Public bug reported:

On Liberty I ran into a situation where the openvswitch agent won't
start and fails with the following stack trace:


2015-12-16 16:01:42.852 10772 CRITICAL neutron [req-afb4e123-1940-48df-befc-9319516152b5 - - - - -] KeyError: 8
2015-12-16 16:01:42.852 10772 ERROR neutron Traceback (most recent call last):
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/bin/neutron-openvswitch-agent", line 11, in <module>
2015-12-16 16:01:42.852 10772 ERROR neutron     sys.exit(main())
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/cmd/eventlet/plugins/ovs_neutron_agent.py", line 20, in main
2015-12-16 16:01:42.852 10772 ERROR neutron     agent_main.main()
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/main.py", line 49, in main
2015-12-16 16:01:42.852 10772 ERROR neutron     mod.main()
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/ovs_ofctl/main.py", line 36, in main
2015-12-16 16:01:42.852 10772 ERROR neutron     ovs_neutron_agent.main(bridge_classes)
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1913, in main
2015-12-16 16:01:42.852 10772 ERROR neutron     agent = OVSNeutronAgent(bridge_classes, **agent_config)
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 302, in __init__
2015-12-16 16:01:42.852 10772 ERROR neutron     self._restore_local_vlan_map()
2015-12-16 16:01:42.852 10772 ERROR neutron   File "/opt/neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 358, in _restore_local_vlan_map
2015-12-16 16:01:42.852 10772 ERROR neutron     self.available_local_vlans.remove(local_vlan)
2015-12-16 16:01:42.852 10772 ERROR neutron KeyError: 8
2015-12-16 16:01:42.852 10772 ERROR neutron


Somehow the OVS Port table ended up with two ports that belong to
different networks but carry the same local VLAN tag (8).

# ovs-vsctl -- --columns=name,tag,other_config list Port | grep -E 'qvob7ba561c-e5|qvod3e1f984-0e' -A 2

name                : "qvob7ba561c-e5"
tag                 : 8
other_config        : {net_uuid="fb33e234-714d-44f8-8728-1a466ef5aca0", network_type=vxlan, physical_network=None, segmentation_id="5969"}
--
name                : "qvod3e1f984-0e"
tag                 : 8
other_config        : {net_uuid="47e0f11c-7aa4-4eb4-97dc-0ef4e064680c", network_type=vxlan, physical_network=None, segmentation_id="5836"}
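
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the agent's actual code. It assumes that available_local_vlans is a plain Python set (which is consistent with the KeyError raised by .remove() in the traceback) and that the pool starts out as tags 1 through 4094. The restore_vlans helper and the port tuples are hypothetical, built from the two Port records listed above.

```python
# Minimal sketch. Assumptions: available_local_vlans is a plain Python set
# and the pool initially holds tags 1..4094. restore_vlans and the tuples
# below are hypothetical and only mirror the two Port records in the report.

ports = [
    ("qvob7ba561c-e5", 8, "fb33e234-714d-44f8-8728-1a466ef5aca0"),
    ("qvod3e1f984-0e", 8, "47e0f11c-7aa4-4eb4-97dc-0ef4e064680c"),
]


def restore_vlans(ports, available, strict=True):
    """Mark each port's tag as used; return the networks seen per tag."""
    seen = {}
    for name, tag, net_uuid in ports:
        seen.setdefault(tag, []).append(net_uuid)
        if strict:
            available.remove(tag)   # KeyError if the tag was already removed
        else:
            available.discard(tag)  # silently ignores an already-removed tag
    return seen


def reproduce():
    """Return the tag that triggers KeyError, or None if nothing fails."""
    available = set(range(1, 4095))
    try:
        restore_vlans(ports, available, strict=True)
    except KeyError as exc:
        return exc.args[0]
    return None


if __name__ == "__main__":
    print("KeyError on tag:", reproduce())
    seen = restore_vlans(ports, set(range(1, 4095)), strict=False)
    print("networks sharing tag 8:", len(seen[8]))
```

Using discard() here only shows that the crash comes from the second removal of the same tag. It would not by itself resolve the underlying conflict of two networks sharing one local VLAN.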


Additionally, I noticed the ofport for one of them was -1.

# ovs-vsctl -- --columns=name,ofport,external_ids list Interface | grep -E 'qvob7ba561c-e5|qvod3e1f984-0e' -A 2

name                : "qvod3e1f984-0e"
ofport              : 20
external_ids        : {attached-mac="fa:16:3e:d7:eb:05", iface-id="d3e1f984-0e4f-4d39-a074-1c0809ad864c", iface-status=active, vm-uuid="a00032c8-f516-42e3-865e-1988768bab84"}
--
name                : "qvob7ba561c-e5"
ofport              : -1
external_ids        : {attached-mac="fa:16:3e:a9:c3:69", iface-id="b7ba561c-e5a2-4128-b36c-9484a763f4de", iface-status=active, vm-uuid="71873533-a4ab-4af6-8ace-e75c60b828f9"}
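
To look for other interfaces in the same condition, the listing can be scanned for records whose ofport is -1. This is a rough sketch under assumptions: parse_records and unassigned_ports are hypothetical helpers, and the parser only handles the simple "key : value" layout shown above, with records separated by blank lines or by the "--" separator that grep -A prints.

```python
# Hypothetical helpers for scanning `ovs-vsctl ... list Interface` output.
# Assumption: each record is a run of "key : value" lines, and records are
# separated by blank lines or by the "--" separator emitted by `grep -A`.

SAMPLE = """\
name                : "qvod3e1f984-0e"
ofport              : 20
--
name                : "qvob7ba561c-e5"
ofport              : -1
"""


def parse_records(text):
    """Split the listing into a list of {column: value} dicts."""
    records, current = [], {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "--":
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so values containing colons
        # (for example MAC addresses) stay intact.
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip().strip('"')
    if current:
        records.append(current)
    return records


def unassigned_ports(text):
    """Return the names of interfaces whose ofport is -1."""
    return [r["name"] for r in parse_records(text) if r.get("ofport") == "-1"]


if __name__ == "__main__":
    print(unassigned_ports(SAMPLE))
```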


I'm not sure if this is relevant, but the VM whose port has ofport -1 is in the following state:

+--------------------------------------+------------------------------------------------------+----------------------------------+-----------+------------+-------------+-------------------------------------------------------+
| ID                                   | Name                                                 | Tenant ID                        | Status    | Task State | Power State | Networks                                              |
+--------------------------------------+------------------------------------------------------+----------------------------------+-----------+------------+-------------+-------------------------------------------------------+
| 71873533-a4ab-4af6-8ace-e75c60b828f9 | test-instance-1                                      | 99e641ee27434c36b4f83fbee0599e67 | SHUTOFF   | -          | Shutdown    |                                                       |
+--------------------------------------+------------------------------------------------------+----------------------------------+-----------+------------+-------------+-------------------------------------------------------+


------------------------------------------------------------------------------------------------------------------------------------
Neutron Version: 69d531565dcd180f6f1141bad143b3ea4dcd7ade

Operating System: CentOS Linux 7 (Core)
Kernel: Linux 3.10.0-229.11.1.el7.x86_64
Architecture: x86_64

ovs-vsctl (Open vSwitch) 2.3.1
Compiled Dec 26 2014 15:35:14
DB Schema 7.6.2

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1526974

Title:
  KeyError prevents openvswitch agent from starting

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1526974/+subscriptions
