
yahoo-eng-team team mailing list archive

[Bug 1707699] [NEW] resource cache can push in stale data on server query

 

Public bug reported:

There is a race condition in the push notifications resource cache where
a query to the server can wipe out newer data received for a given
resource.

Order of operations:

* resource_cache issues a bulk_pull request to the server
* server starts to build the response in one thread
* server updates one of the resources in another thread and pushes the updated resource onto the queue
* client receives the updated resource and updates the cache
* bulk_pull finally returns, and the client overwrites the newer version of the object in the cache with the stale one from the response
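
The sequence above can be replayed with a toy cache. This is only a minimal sketch of one possible guard (comparing revision_number before overwriting), not the actual neutron code or the fix for this bug; the Resource and GuardedCache names are hypothetical, and the vif_type values are borrowed from the log excerpt in this report.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical stand-in for a versioned object such as a Port.
    id: str
    revision_number: int
    vif_type: str


class GuardedCache:
    """Toy cache that refuses to replace an entry with an older revision."""

    def __init__(self):
        self._by_id = {}

    def record_update(self, resource):
        """Store the resource unless a newer revision is already cached.

        Returns True if the resource was stored, False if it was ignored.
        """
        current = self._by_id.get(resource.id)
        if (current is not None
                and current.revision_number > resource.revision_number):
            return False  # incoming data is stale, keep what we have
        self._by_id[resource.id] = resource
        return True

    def get(self, resource_id):
        return self._by_id.get(resource_id)


cache = GuardedCache()
port_id = "680f28df-4950-4af0-adf6-67f03d096a46"

# The cache already holds revision 4 of the port.
cache.record_update(Resource(port_id, 4, "unbound"))
# A push notification delivers revision 5 while bulk_pull is in flight.
pushed = cache.record_update(Resource(port_id, 5, "ovs"))
# The late bulk_pull response still carries revision 4.
pulled = cache.record_update(Resource(port_id, 4, "unbound"))
```

With the guard in place, the late bulk_pull result is ignored and the cache keeps revision 5. Without the comparison in record_update, the last write would win and the stale revision 4 would replace it, which is the behavior described in this bug.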


Snippet from http://logs.openstack.org/22/410422/27/gate/gate-tempest-dsvm-py35-ubuntu-xenial/436680c/logs/screen-q-agt.txt.gz with the relevant messages showing the stale data:


Jul 31 15:09:43.914823 ubuntu-xenial-internap-mtl01-10175685 neutron-openvswitch-agent[19186]: DEBUG neutron.agent.resource_cache [None req-d91a8fc6-2545-49e7-a7aa-90f764483e5e service neutron] Resource Port 680f28df-4950-4af0-adf6-67f03d096a46 updated (revision_number 4->5). Old fields: {'binding_levels': [], 'binding': PortBinding(host='ubuntu-xenial-internap-mtl01-10175685',port_id=680f28df-4950-4af0-adf6-67f03d096a46,profile={},status='ACTIVE',vif_details=None,vif_type='unbound',vnic_type='normal')} New fields: {'binding_levels': [PortBindingLevel(driver='openvswitch',host='ubuntu-xenial-internap-mtl01-10175685',level=0,port_id=680f28df-4950-4af0-adf6-67f03d096a46,segment=NetworkSegment(7fff9c50-c9c8-47f5-87ad-3a1aed38f02c))], 'binding': PortBinding(host='ubuntu-xenial-internap-mtl01-10175685',port_id=680f28df-4950-4af0-adf6-67f03d096a46,profile={},status='ACTIVE',vif_details={"ovs_hybrid_plug": true, "datapath_type": "system", "port_filter": true},vif_type='ovs',vnic_type='normal')} {{(pid=19186) record_resource_update /opt/stack/new/neutron/neutron/agent/resource_cache.py:180}}
Jul 31 15:09:44.180728 ubuntu-xenial-internap-mtl01-10175685 neutron-openvswitch-agent[19186]: DEBUG neutron.agent.resource_cache [None req-b9f3b846-1d32-43dd-acd7-c68fc5da9a87 None None] 19 resources returned for queries {('Port', ('security_group_ids', ('76bbd669-2595-4e44-bd7d-cb1f5096cb28',)))} {{(pid=19186) _flood_cache_for_query /opt/stack/new/neutron/neutron/agent/resource_cache.py:80}}
Jul 31 15:09:45.308162 ubuntu-xenial-internap-mtl01-10175685 neutron-openvswitch-agent[19186]: WARNING neutron.agent.rpc [None req-b9f3b846-1d32-43dd-acd7-c68fc5da9a87 None None] Device Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2017-07-31T15:09:39Z,data_plane_status=<?>,description='',device_id='c139f8f1-0be7-4a2e-bcbe-50df320effcb',device_owner='compute:nova',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=680f28df-4950-4af0-adf6-67f03d096a46,mac_address=fa:16:3e:12:5d:f7,name='',network_id=96ad3ed5-7530-4e26-a0a9-99abe3157e0c,project_id='330410e9ec0044f494be2def53f589b1',qos_policy_id=None,revision_number=4,security=PortSecurity(680f28df-4950-4af0-adf6-67f03d096a46),security_group_ids=set([76bbd669-2595-4e44-bd7d-cb1f5096cb28]),status='DOWN',updated_at=2017-07-31T15:09:43Z) is not bound.
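
The staleness is visible in the revision numbers: the push at 15:09:43 moved the port from revision 4 to 5, yet the object reported as "not bound" at 15:09:45 is at revision 4 again. A small sketch of how those two numbers could be pulled out of such lines for comparison follows; the helper names are hypothetical, and the two sample strings are abbreviated from the messages above.

```python
import re

# "updated (revision_number 4->5)" as logged by record_resource_update.
PUSH_RE = re.compile(r"revision_number (\d+)->(\d+)")
# "revision_number=4" inside the printed Port object.
OBJ_RE = re.compile(r"revision_number=(\d+)")


def pushed_revision(line):
    """Return the new revision from an 'updated' message, or None."""
    match = PUSH_RE.search(line)
    return int(match.group(2)) if match else None


def object_revision(line):
    """Return the revision of a printed object, or None."""
    match = OBJ_RE.search(line)
    return int(match.group(1)) if match else None


# Abbreviated copies of the log messages quoted in this report.
push_line = ("Resource Port 680f28df-4950-4af0-adf6-67f03d096a46 "
             "updated (revision_number 4->5).")
warn_line = ("Device Port(binding_levels=[],revision_number=4,"
             "status='DOWN') is not bound.")

# The object used later is older than the one already pushed.
is_stale = object_revision(warn_line) < pushed_revision(push_line)
```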

** Affects: neutron
     Importance: High
     Assignee: Kevin Benton (kevinbenton)
         Status: In Progress

** Changed in: neutron
   Importance: Undecided => High

** Changed in: neutron
     Assignee: (unassigned) => Kevin Benton (kevinbenton)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1707699


To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1707699/+subscriptions

