yahoo-eng-team team mailing list archive
Message #81816
[Bug 1865891] [NEW] Race condition during removal of subnet from the router and removal of subnet
Public bug reported:
Bug originally reported in
https://bugzilla.redhat.com/show_bug.cgi?id=1806963 but I was also able
to reproduce it on master branch.
Original bug description:
I tried to perform the following actions in background:
1. Create subnet from pool
2. Attach subnet to router
3. Detach subnet from router
4. Delete subnet
5. Sleep 2 seconds
6. GOTO 1
It failed with one of the following errors in l3-agent.log:
[-] Error while deleting router 9935b2d9-65af-4d5e-b0d4-7988cd638e66: KeyError: 'subnets'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 385, in _safe_router_removed
    self._router_removed(router_id)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 404, in _router_removed
    ri.delete()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 459, in delete
    super(HaRouter, self).delete()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 421, in delete
    self.process_delete()
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 165, in call
    self.logger(e)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 162, in call
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 1164, in process_delete
    self._process_internal_ports()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 575, in _process_internal_ports
    for subnet in p['subnets']:
KeyError: 'subnets'
[-] Failed to process compatible router: 9935b2d9-65af-4d5e-b0d4-7988cd638e66: KeyError: 'mtu'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 628, in _process_routers_if_compatible
    self._process_router_if_compatible(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 527, in _process_updated_router
    ri.process()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 474, in process
    super(HaRouter, self).process()
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 165, in call
    self.logger(e)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 162, in call
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 1181, in process
    self._process_internal_ports()
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 567, in _process_internal_ports
    internal_ports)
  File "/usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 515, in _get_updated_ports
    mtu_changed = existing_port['mtu'] != current_port['mtu']
KeyError: 'mtu'
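Both tracebacks end in a plain dict lookup on a port that lacks the expected key. The following is only a minimal sketch of that failure mode, not neutron code and not a proposed fix: the helper names (has_key_error, port_subnets, mtu_changed) are hypothetical, the MTU value 1500 is an assumed placeholder, and the incomplete port is modelled on the port dict from l3-agent.log, which has neither 'subnets' nor 'mtu'.

```python
# Hypothetical helpers for illustration only; none of these exist in neutron.

def has_key_error(port, key):
    """Return True if indexing the port dict with `key` raises KeyError."""
    try:
        port[key]
    except KeyError:
        return True
    return False


def port_subnets(port):
    """Return the port's subnets, or an empty list when the key is absent."""
    return port.get('subnets', [])


def mtu_changed(existing_port, current_port):
    """Compare MTUs without raising when either dict lacks 'mtu'."""
    return existing_port.get('mtu') != current_port.get('mtu')


PORT_ID = '30b654b9-0d09-407d-8553-b84c0d36e5ef'

# Shaped like the port in l3-agent.log: empty fixed_ips, no 'subnets', no 'mtu'.
incomplete_port = {'id': PORT_ID, 'fixed_ips': []}

# A port as the agent expects it; the MTU of 1500 is an assumed value.
complete_port = {
    'id': PORT_ID,
    'fixed_ips': [{'subnet_id': '5561834a-9bf3-41e7-ac87-d2d0eae65ca7',
                   'ip_address': '10.108.108.1'}],
    'subnets': [{'id': '5561834a-9bf3-41e7-ac87-d2d0eae65ca7'}],
    'mtu': 1500,
}
```

Tolerating the missing keys would only hide the symptom; the underlying problem is that the agent receives a port whose subnet has already been deleted.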
In the l3-agent.log file I can see that the port carries no subnet or IP address information:
(output is formatted)
2020-02-25 09:59:33.846 875552 DEBUG neutron.agent.l3.router_info [-] appending port
{
u'allowed_address_pairs': [
],
u'extra_dhcp_opts': [
],
u'updated_at': u'2020-02-25T09:59:33Z',
u'device_owner': u'network:ha_router_replicated_interface',
u'revision_number': 11,
u'port_security_enabled': False,
u'binding:profile': {
},
u'fixed_ips': [
],
u'id': u'30b654b9-0d09-407d-8553-b84c0d36e5ef',
u'security_groups': [
],
u'binding:vif_details': {
u'port_filter': True,
u'datapath_type': u'system',
u'ovs_hybrid_plug': True
},
u'binding:vif_type': u'ovs',
u'qos_policy_id': None,
u'mac_address': u'fa:16:3e:6b:13:79',
u'project_id': u'e364e04c62d845a0ac682782a07712ee',
u'status': u'DOWN',
u'binding:host_id': u'controller-0.redhat.local',
u'description': u'',
u'tags': [
],
u'device_id': u'6b7a42d0-12ba-4e07-aa4b-3e58f11974f6',
u'name': u'',
u'admin_state_up': True,
u'network_id': u'2506f745-6581-4b9a-8dde-8c11ebf1d7cb',
u'tenant_id': u'e364e04c62d845a0ac682782a07712ee',
u'created_at': u'2020-02-25T09:59:28Z',
u'binding:vnic_type': u'normal',
u'ip_allocation': u'immediate'
}
to internal_ports cache _process_internal_ports /usr/lib/python2.7/site-packages/neutron/agent/l3/router_info.py:583
Based on openvswitch-agent.log it seems that the subnet is deleted
before the port configuration is complete:
2020-02-25 09:59:29.901 107901 DEBUG neutron.agent.resource_cache [req-c5a7718c-c4d6-4dbf-a43b-286f2fb09956 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Received new resource Port: Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2020-02-25T09:59:28Z,data_plane_status=<?>,description='',device_id='6b7a42d0-12ba-4e07-aa4b-3e58f11974f6',device_owner='network:ha_router_replicated_interface',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=30b654b9-0d09-407d-8553-b84c0d36e5ef,mac_address=fa:16:3e:6b:13:79,name='',network_id=2506f745-6581-4b9a-8dde-8c11ebf1d7cb,project_id='e364e04c62d845a0ac682782a07712ee',qos_policy_id=None,revision_number=5,security=PortSecurity(30b654b9-0d09-407d-8553-b84c0d36e5ef),security_group_ids=set([]),status='DOWN',updated_at=2020-02-25T09:59:29Z) record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:187
2020-02-25 09:59:30.022 107901 DEBUG neutron.agent.resource_cache [req-ee7510d2-69cc-49a6-bdfa-4455d7df47ee 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Resource Subnet deleted: 5561834a-9bf3-41e7-ac87-d2d0eae65ca7 record_resource_delete /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:197
2020-02-25 09:59:30.436 107901 DEBUG neutron.agent.resource_cache [req-c5a7718c-c4d6-4dbf-a43b-286f2fb09956 9c16552bff264e21a01ba4b3e8ba0d90 e364e04c62d845a0ac682782a07712ee - - -] Resource Port 30b654b9-0d09-407d-8553-b84c0d36e5ef updated (revision_number 5->7). Old fields: {'fixed_ips': [IPAllocation(ip_address=10.108.108.1,network_id=2506f745-6581-4b9a-8dde-8c11ebf1d7cb,port_id=30b654b9-0d09-407d-8553-b84c0d36e5ef,subnet_id=5561834a-9bf3-41e7-ac87-d2d0eae65ca7)]} New fields: {'fixed_ips': []} record_resource_update /usr/lib/python2.7/site-packages/neutron/agent/resource_cache.py:185
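The ordering in these three log entries can be replayed with a toy model to make the window visible. This is a rough sketch only: it does not use or imitate neutron's resource_cache API, the event names and the replay function are hypothetical, and it assumes the subnet was already known to the agent when the port arrived.

```python
SUBNET = '5561834a-9bf3-41e7-ac87-d2d0eae65ca7'

# (timestamp, event kind, payload) taken from the three log entries above.
# For port events the payload is the list of subnet ids in fixed_ips.
EVENTS = [
    ('09:59:29.901', 'port_received', [SUBNET]),
    ('09:59:30.022', 'subnet_deleted', SUBNET),
    ('09:59:30.436', 'port_updated', []),
]


def replay(events, known_subnets):
    """Replay events in order.

    After each event, record which subnets the port still references
    although they no longer exist.
    """
    subnets = set(known_subnets)
    port_subnets = []
    report = []
    for timestamp, kind, payload in events:
        if kind == 'subnet_deleted':
            subnets.discard(payload)
        else:
            # port_received / port_updated replace the port's subnet list
            port_subnets = list(payload)
        dangling = [s for s in port_subnets if s not in subnets]
        report.append((timestamp, dangling))
    return report


for timestamp, dangling in replay(EVENTS, {SUBNET}):
    print(timestamp, dangling)
```

In this model the port points at a deleted subnet between 09:59:30.022 and 09:59:30.436, and afterwards it is left with no fixed IPs at all, which matches the port dict seen in l3-agent.log.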
Version-Release number of selected component (if applicable):
OpenStack-13.0-RHEL-7-20200214.1
Steps to Reproduce:
openstack subnet pool create --pool-prefix 10.108.108.0/24 the_new_subnet_pool
openstack network create the_new_network_1
openstack router create the_new_router
for i in {1..10};
do
    openstack subnet create --subnet-pool the_new_subnet_pool \
        --prefix-length 27 --network the_new_network_1 the_new_subnet_1 &
    openstack router add subnet the_new_router the_new_subnet_1 &
    openstack router remove subnet the_new_router the_new_subnet_1 &
    openstack subnet delete the_new_subnet_1 &
    sleep 2
done
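For anyone who prefers to drive the reproducer from Python, the loop can be sketched roughly as below. This is an untested convenience wrapper around the same CLI commands, assuming the pool, network and router have already been created as in the first three commands; the function names and the launch parameter are hypothetical.

```python
import subprocess
import time

POOL = 'the_new_subnet_pool'
NETWORK = 'the_new_network_1'
ROUTER = 'the_new_router'
SUBNET = 'the_new_subnet_1'


def iteration_commands():
    """Return the four CLI invocations of one loop iteration, in order."""
    return [
        ['openstack', 'subnet', 'create', '--subnet-pool', POOL,
         '--prefix-length', '27', '--network', NETWORK, SUBNET],
        ['openstack', 'router', 'add', 'subnet', ROUTER, SUBNET],
        ['openstack', 'router', 'remove', 'subnet', ROUTER, SUBNET],
        ['openstack', 'subnet', 'delete', SUBNET],
    ]


def run(iterations=10, delay=2, launch=subprocess.Popen):
    """Start every command without waiting for it, like '&' in the shell loop.

    `launch` is injectable so the loop can be exercised without the CLI.
    """
    started = []
    for _ in range(iterations):
        for cmd in iteration_commands():
            started.append(launch(cmd))
        time.sleep(delay)
    return started
```

Calling run() with the defaults starts the four commands concurrently ten times with a two second pause, which is what lets the remove and delete calls overtake the add.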
The issue causes the following errors:
1. All the interfaces are removed from router's namespace
2. Can't assign new subnets/ports to the router
3. Can't delete router
** Affects: neutron
Importance: Medium
Assignee: Slawek Kaplonski (slaweq)
Status: Confirmed
** Tags: l3-dvr-backlog
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1865891
Title:
Race condition during removal of subnet from the router and removal of
subnet
Status in neutron:
Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1865891/+subscriptions