← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1532171] [NEW] lb: hard reboot or destroy of vm can lead to error log and agent resync

 

Public bug reported:

A tap device can suddenly disappear e.g. due to instance destroy. If the
agent is in the midst of processing a a device update for this tap
device (e.g. due to a security group update), the agent logs the
following errors:

2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o', 'link', 'show', 'tapa0084edd-d4'] create_process /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device: tapa0084edd-d4 does not exist on this host, skipped add_tap_interface /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Setting admin_state_up to True for port a0084edd-d437-4ff0-b2e7-7cfd93ea34c4 ensure_port_admin_state /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:686
2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"

2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent

The logs [1][2] show two scenarios where this happens:
- hard reboot of an instance (instance is destroyed and defined again)
- destroy of an instance

After that , the agent triggers a resync and gets everything fine again

The solution could be to check for the device existence in
ensure_port_admin_state method. Maybe we should check if there's a more
elegant way to do it to avoid the resync...

[1] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-n-cpu.txt.gz#_2016-01-07_17_43_50_991
[2] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-q-agt.txt.gz#_2016-01-07_17_43_45_961

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: linuxbridge

** Tags added: linuxbridge

** Description changed:

  A tap device can suddenly disappear e.g. due to instance destroy. If the
  agent is in the midst of processing a a device update for this tap
  device (e.g. due to a security group update), the agent logs the
  following errors:
- 
  
  2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o', 'link', 'show', 'tapa0084edd-d4'] create_process /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device: tapa0084edd-d4 does not exist on this host, skipped add_tap_interface /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Setting admin_state_up to True for port a0084edd-d437-4ff0-b2e7-7cfd93ea34c4 ensure_port_admin_state /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:686
  2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
  2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
  
  2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent 
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent 
- 
+ 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
+ 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
  
  The logs [1][2] show two scenarios where this happens:
  - hard reboot of an instance (instance is destroyed and defined again)
  - destroy of an instance
  
  After that , the agent triggers a resync and gets everything fine again
  
+ The solution could be to check for the device existence in
+ ensure_port_admin_state method.
  
- The solution could be to check for the device existence in ensure_port_admin_state method.
- 
- 
- More details on the hard reboot scenario :
- 
- Taken from n-cpu log [1]
- 
- 2016-01-07 17:43:50.991 INFO nova.compute.manager [req-03e0a01a-
- 79d4-49a4-bac7-029f61ef636f tempest-SecurityGroupsTestJSON-1738577843
- tempest-SecurityGroupsTestJSON-577742420] [instance:
- 117d7e75-06a4-4f22-a650-2d10d10e2eab] Rebooting instance
- 
- 2016-01-07 17:43:52.674 24434 INFO nova.virt.libvirt.driver [-]
- [instance: 117d7e75-06a4-4f22-a650-2d10d10e2eab] Instance destroyed
- successfully.
- 
- 2016-01-07 17:43:52.676 DEBUG nova.virt.libvirt.driver [req-03e0a01a-
- 79d4-49a4-bac7-029f61ef636f tempest-SecurityGroupsTestJSON-1738577843
- tempest-SecurityGroupsTestJSON-577742420] [instance:
- 117d7e75-06a4-4f22-a650-2d10d10e2eab] Start _get_guest_xml
- network_info=[VIF({'profile': {}, 'ovs_interfaceid': None,
- 'preserve_on_delete': False, 'network': Network({'bridge':
- u'brq84865936-21', 'subnets': [Subnet({'ips': [FixedIP({'meta': {},
- 'version': 4, 'type': 'fixed', 'floating_ips': [], 'address':
- u'10.100.0.2'})], 'version': 4, 'meta': {}, 'dns': [], 'routes': [],
- 'cidr': u'10.100.0.0/28', 'gateway': IP({'meta': {}, 'version': 4,
- 'type': 'gateway', 'address': u'10.100.0.1'})})], 'meta': {'injected':
- False, 'tenant_id': u'ca9d80fa18624786a7d7fe48f3bc5276',
- 'should_create_bridge': True}, 'id':
- u'84865936-2126-4b24-9419-f76e88f9b514', 'label': u'tempest-
- SecurityGroupsTestJSON-1084751823-network'}), 'devname':
- u'tapa0084edd-d4', 'vnic_type': u'normal', 'qbh_params': None, 'meta':
- {}, 'details': {u'port_filter': True}, 'address': u'fa:16:3e:68:5b:0f',
- 'active': True, 'type': u'bridge', 'id':
- u'a0084edd-d437-4ff0-b2e7-7cfd93ea34c4', 'qbg_params': None})]
- disk_info={'disk_bus': 'virtio', 'cdrom_bus': 'ide', 'mapping':
- {'disk.config': {'bus': 'ide', 'type': 'cdrom', 'dev': 'hdd'}, 'disk':
- {'bus': 'virtio', 'boot_index': '1', 'type': 'disk', 'dev': u'vda'},
- 'root': {'bus': 'virtio', 'boot_index': '1', 'type': 'disk', 'dev':
- u'vda'}}}
- image_meta=ImageMeta(checksum=<?>,container_format='ami',created_at=<?>,direct_url=<?>,disk_format='ami',id=<?>,min_disk=0,min_ram=0,name=<?>,owner=<?>,properties=ImageMetaProps,protected=<?>,size=<?>,status=<?>,tags=<?>,updated_at=<?>,virtual_size=<?>,visibility=<?>)
- rescue=None block_device_info={'swap': None, 'root_device_name':
- u'/dev/vda', 'ephemerals': [], 'block_device_mapping': []}
- _get_guest_xml /opt/stack/new/nova/nova/virt/libvirt/driver.py:4369
- 
- 2016-01-07 17:43:53.818 INFO nova.compute.manager [req-8d71edf4-a55f-
- 42ed-8adf-710e5854ad58 None None] [instance:
- 117d7e75-06a4-4f22-a650-2d10d10e2eab] VM Started (Lifecycle Event)
- 
- 2016-01-07 17:43:53.822 24434 INFO nova.virt.libvirt.driver [-]
- [instance: 117d7e75-06a4-4f22-a650-2d10d10e2eab] Instance rebooted
- successfully.
- 
- 
- Taken from q-agt log [2]:
- 2016-01-07 17:43:45.961 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Agent loop found changes! {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set(['tapa0084edd-d4']), 'updated': set([])} daemon_loop /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:1189
- 
- 2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-
- 88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o',
- 'link', 'show', 'tapa0084edd-d4'] create_process
- /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
- 
- 2016-01-07 17:43:52.230 DEBUG
- neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
- [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device:
- tapa0084edd-d4 does not exist on this host, skipped add_tap_interface
- /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
- 
- 2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
- 2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
- 2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent 
- 2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent 
- --> Device not existent!
- 
- 2016-01-07 17:43:53.962 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Agent loop found changes! {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'updated': set([u'tapa0084edd-d4'])} daemon_loop /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:1189
- --> Device back again
  
  [1] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-n-cpu.txt.gz#_2016-01-07_17_43_50_991
  [2] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-q-agt.txt.gz#_2016-01-07_17_43_45_961

** Description changed:

  A tap device can suddenly disappear e.g. due to instance destroy. If the
  agent is in the midst of processing a a device update for this tap
  device (e.g. due to a security group update), the agent logs the
  following errors:
  
  2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o', 'link', 'show', 'tapa0084edd-d4'] create_process /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device: tapa0084edd-d4 does not exist on this host, skipped add_tap_interface /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Setting admin_state_up to True for port a0084edd-d437-4ff0-b2e7-7cfd93ea34c4 ensure_port_admin_state /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:686
  2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
  2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
  
  2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
  
  The logs [1][2] show two scenarios where this happens:
  - hard reboot of an instance (instance is destroyed and defined again)
  - destroy of an instance
  
  After that , the agent triggers a resync and gets everything fine again
  
  The solution could be to check for the device existence in
- ensure_port_admin_state method.
- 
+ ensure_port_admin_state method. Maybe we should check if there's a more
+ elegant way to do it to avoid the resync...
  
  [1] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-n-cpu.txt.gz#_2016-01-07_17_43_50_991
  [2] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-q-agt.txt.gz#_2016-01-07_17_43_45_961

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1532171

Title:
  lb: hard reboot or destroy of vm can lead to error log and agent
  resync

Status in neutron:
  New

Bug description:
  A tap device can suddenly disappear e.g. due to instance destroy. If
  the agent is in the midst of processing a a device update for this tap
  device (e.g. due to a security group update), the agent logs the
  following errors:

  2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o', 'link', 'show', 'tapa0084edd-d4'] create_process /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device: tapa0084edd-d4 does not exist on this host, skipped add_tap_interface /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Setting admin_state_up to True for port a0084edd-d437-4ff0-b2e7-7cfd93ea34c4 ensure_port_admin_state /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:686
  2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
  2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"

  2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent

  The logs [1][2] show two scenarios where this happens:
  - hard reboot of an instance (instance is destroyed and defined again)
  - destroy of an instance

  After that , the agent triggers a resync and gets everything fine
  again

  The solution could be to check for the device existence in
  ensure_port_admin_state method. Maybe we should check if there's a
  more elegant way to do it to avoid the resync...

  [1] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-n-cpu.txt.gz#_2016-01-07_17_43_50_991
  [2] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-q-agt.txt.gz#_2016-01-07_17_43_45_961

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1532171/+subscriptions


Follow ups