yahoo-eng-team team mailing list archive

[Bug 1865891] [NEW] Race condition during removal of subnet from the router and removal of subnet

 

Public bug reported:

This bug was originally reported in
https://bugzilla.redhat.com/show_bug.cgi?id=1806963, but I was also able
to reproduce it on the master branch.

Original bug description:

I tried to perform the following actions in the background:
 1. Create subnet from pool
 2. Attach subnet to router
 3. Detach subnet from router
 4. Delete subnet
 5. Sleep 2 seconds
 6. GOTO 1

It failed with one of the following errors in l3-agent.log:

[-] Error while deleting router 9935b2d9-65af-4d5e-b0d4-7988cd638e66: KeyError: 'subnets'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 385, in _safe_router_removed
    self._router_removed(router_id)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 404, in _router_removed
    ri.delete()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 459, in delete
    super(HaRouter, self).delete()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 421, in delete
    self.process_delete()
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 165, in call
    self.logger(e)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 162, in call
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 1164, in process_delete
    self._process_internal_ports()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 575, in _process_internal_ports
    for subnet in p['subnets']:
KeyError: 'subnets'
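
The failing line in the traceback above indexes each port dict with p['subnets']. A minimal sketch of that failure mode, assuming the port reaches the agent without a 'subnets' key at all: collect_subnet_ids and both port dicts are hypothetical illustrations, not neutron code, and the .get() guard only shows one tolerant way to read the key, not the fix applied upstream.

```python
def collect_subnet_ids(ports):
    """Return the subnet ids of all ports, tolerating a missing 'subnets' key."""
    subnet_ids = []
    for port in ports:
        # port['subnets'] would raise KeyError for a port that carries no
        # subnet data; .get() treats such a port as having no subnets.
        for subnet in port.get('subnets', []):
            subnet_ids.append(subnet['id'])
    return subnet_ids


# Hypothetical port dicts: one complete, one stripped of its subnet data.
complete_port = {'id': 'port-a', 'subnets': [{'id': 'subnet-1'}]}
stripped_port = {'id': 'port-b', 'fixed_ips': []}

# Direct indexing, as in the traceback, fails on the stripped port.
try:
    for subnet in stripped_port['subnets']:
        pass
    raised = False
    missing_key = None
except KeyError as exc:
    raised = True
    missing_key = exc.args[0]

result = collect_subnet_ids([complete_port, stripped_port])
print(raised, missing_key, result)
```

Whether skipping such a port is the right behaviour for the agent is a separate question; the sketch only shows why direct indexing aborts the whole router processing pass.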


[-] Failed to process compatible router: 9935b2d9-65af-4d5e-b0d4-7988cd638e66: KeyError: 'mtu'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 628, in _process_routers_if_compatible
    self._process_router_if_compatible(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 527, in _process_updated_router
    ri.process()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 474, in process
    super(HaRouter, self).process()
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 165, in call
    self.logger(e)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 162, in call
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 1181, in process
    self._process_internal_ports()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 567, in _process_internal_ports
    internal_ports)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 515, in _get_updated_ports
    mtu_changed = existing_port['mtu'] != current_port['mtu']
KeyError: 'mtu'


In the l3-agent.log file I can see that the port carries no subnet or IP address information:

(output is formatted)
2020-02-25 09:59:33.846 875552 DEBUG neutron.agent.l3.router_info [-] appending port 
{
  u'allowed_address_pairs': [
    
  ],
  u'extra_dhcp_opts': [
    
  ],
  u'updated_at': u'2020-02-25T09:59:33Z',
  u'device_owner': u'network:ha_router_replicated_interface',
  u'revision_number': 11,
  u'port_security_enabled': False,
  u'binding:profile': {
    
  },
  u'fixed_ips': [
    
  ],
  u'id': u'30b654b9-0d09-407d-8553-b84c0d36e5ef',
  u'security_groups': [
    
  ],
  u'binding:vif_details': {
    u'port_filter': True,
    u'datapath_type': u'system',
    u'ovs_hybrid_plug': True
  },
  u'binding:vif_type': u'ovs',
  u'qos_policy_id': None,
  u'mac_address': u'fa:16:3e:6b:13:79',
  u'project_id': u'e364e04c62d845a0ac682782a07712ee',
  u'status': u'DOWN',
  u'binding:host_id': u'controller-0.redhat.local',
  u'description': u'',
  u'tags': [
    
  ],
  u'device_id': u'6b7a42d0-12ba-4e07-aa4b-3e58f11974f6',
  u'name': u'',
  u'admin_state_up': True,
  u'network_id': u'2506f745-6581-4b9a-8dde-8c11ebf1d7cb',
  u'tenant_id': u'e364e04c62d845a0ac682782a07712ee',
  u'created_at': u'2020-02-25T09:59:28Z',
  u'binding:vnic_type': u'normal',
  u'ip_allocation': u'immediate'
} 
to internal_ports cache _process_internal_ports /usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py:583
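
The logged port can be checked against the two keys that the tracebacks show the agent reading ('subnets' and 'mtu'). The following is a rough sketch of such a check: missing_port_fields and REQUIRED_KEYS are hypothetical names, and logged_port reproduces only a few fields of the dump above.

```python
# Keys that the two tracebacks show being read from each internal port.
REQUIRED_KEYS = ('subnets', 'mtu')


def missing_port_fields(port):
    """List the fields that are absent (or empty) in a port dict."""
    problems = [key for key in REQUIRED_KEYS if key not in port]
    # An empty fixed_ips list means the port has no IP address left.
    if not port.get('fixed_ips'):
        problems.append('fixed_ips (empty)')
    return problems


# Abbreviated copy of the port from the l3-agent.log excerpt.
logged_port = {
    'id': '30b654b9-0d09-407d-8553-b84c0d36e5ef',
    'device_owner': 'network:ha_router_replicated_interface',
    'revision_number': 11,
    'fixed_ips': [],
    'status': 'DOWN',
}

problems = missing_port_fields(logged_port)
print(problems)
```

For the logged port this reports both keys as missing and fixed_ips as empty, which matches the two KeyError variants seen in l3-agent.log.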


Based on openvswitch-agent.log, it seems that the subnet is deleted
before the port configuration is complete:

2020-02-25 09:59:29.901 107901 DEBUG neutron.agent.resource_cache [req-c5a7718c-c4d6-4dbf-a43b-286f2fb09956 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2020-02-25T09:59:28Z,data_plane_status=<?>,description='',device_id='6b7a42d0-12ba-4e07-aa4b-3e58f11974f6',device_owner='network:ha_router_replicated_interface',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=30b654b9-0d09-407d-8553-b84c0d36e5ef,mac_address=fa:16:3e:6b:13:79,name='',network_id=2506f745-6581-4b9a-8dde-8c11ebf1d7cb,project_id='e364e04c62d845a0ac682782a07712ee',qos_policy_id=None,revision_number=5,security=PortSecurity(30b654b9-0d09-407d-8553-b84c0d36e5ef),security_group_ids=set([]),status='DOWN',updated_at=2020-02-25T09:59:29Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187

2020-02-25 09:59:30.022 107901 DEBUG neutron.agent.resource_cache [req-ee7510d2-69cc-49a6-bdfa-4455d7df47ee 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Resource Subnet deleted: 5561834a-9bf3-41e7-ac87-d2d0eae65ca7 record_resource_delete /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:197

2020-02-25 09:59:30.436 107901 DEBUG neutron.agent.resource_cache [req-c5a7718c-c4d6-4dbf-a43b-286f2fb09956 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Resource Port 30b654b9-0d09-407d-8553-b84c0d36e5ef updated (revision_number 5->7). Old fields: {'fixed_ips': [IPAllocation(ip_address=10.108.108.1,network_id=2506f745-6581-4b9a-8dde-8c11ebf1d7cb,port_id=30b654b9-0d09-407d-8553-b84c0d36e5ef,subnet_id=5561834a-9bf3-41e7-ac87-d2d0eae65ca7)]} New fields: {'fixed_ips': []} record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:185
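
The ordering of these three log entries can be replayed with a toy model to make the window visible: between the subnet deletion (09:59:30.022) and the port update (09:59:30.436), the cached port still points at a subnet that no longer exists. TinyResourceCache is a hypothetical illustration and does not mirror the real neutron.agent.resource_cache implementation.

```python
class TinyResourceCache:
    """Toy cache that only tracks which subnets each port still references."""

    def __init__(self):
        self.subnets = set()
        self.ports = {}

    def port_updated(self, port_id, revision, subnet_ids):
        # Store the latest revision and the subnets behind the port's fixed IPs.
        self.ports[port_id] = {'revision': revision,
                               'subnet_ids': list(subnet_ids)}

    def subnet_deleted(self, subnet_id):
        self.subnets.discard(subnet_id)

    def dangling_ports(self):
        """Return ids of ports that reference a subnet missing from the cache."""
        return sorted(
            port_id for port_id, port in self.ports.items()
            if any(s not in self.subnets for s in port['subnet_ids'])
        )


PORT = '30b654b9-0d09-407d-8553-b84c0d36e5ef'
SUBNET = '5561834a-9bf3-41e7-ac87-d2d0eae65ca7'

cache = TinyResourceCache()
cache.subnets.add(SUBNET)

cache.port_updated(PORT, 5, [SUBNET])   # 09:59:29.901, port received
before = cache.dangling_ports()

cache.subnet_deleted(SUBNET)            # 09:59:30.022, subnet deleted
during = cache.dangling_ports()

cache.port_updated(PORT, 7, [])         # 09:59:30.436, fixed_ips emptied
after = cache.dangling_ports()

print(before, during, after)
```

Any consumer that reads the port inside that window, or after it when fixed_ips is already empty, sees a router interface without usable subnet data.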


Version-Release number of selected component (if applicable):
OpenStack-13.0-RHEL-7-20200214.1


Steps to Reproduce:

openstack subnet pool create --pool-prefix 10.108.108.0/24 the_new_subnet_pool
openstack network create the_new_network_1
openstack router create the_new_router

for i in {1..10};
do
    openstack subnet create --subnet-pool the_new_subnet_pool \
        --prefix-length 27 --network the_new_network_1 the_new_subnet_1 &
    openstack router add subnet the_new_router the_new_subnet_1 &
    openstack router remove subnet the_new_router the_new_subnet_1 &
    openstack subnet delete the_new_subnet_1 &
    sleep 2
done


The issue has the following consequences:
1. All the interfaces are removed from the router's namespace
2. New subnets/ports can't be assigned to the router
3. The router can't be deleted

** Affects: neutron
     Importance: Medium
     Assignee: Slawek Kaplonski (slaweq)
         Status: Confirmed


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1865891
