← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1532171] Re: lb: hard reboot or destroy of vm can lead to error log and agent resync

 

** Changed in: neutron
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1532171

Title:
  lb: hard reboot or destroy of vm can lead to error log and agent
  resync

Status in neutron:
  Fix Released

Bug description:
  A tap device can suddenly disappear e.g. due to instance destroy. If
  the agent is in the midst of processing a a device update for this tap
  device (e.g. due to a security group update), the agent logs the
  following errors:

  2016-01-07 17:43:52.225 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command: ['ip', '-o', 'link', 'show', 'tapa0084edd-d4'] create_process /opt/stack/new/neutron/neutron/agent/linux/utils.py:84
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Tap device: tapa0084edd-d4 does not exist on this host, skipped add_tap_interface /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:409
  2016-01-07 17:43:52.230 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Setting admin_state_up to True for port a0084edd-d437-4ff0-b2e7-7cfd93ea34c4 ensure_port_admin_state /opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:686
  2016-01-07 17:43:52.231 DEBUG neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Running command (rootwrap daemon): ['ip', 'link', 'set', 'tapa0084edd-d4', 'up'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:100
  2016-01-07 17:43:52.263 ERROR neutron.agent.linux.utils [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"

  2016-01-07 17:43:52.263 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-07a4fb1d-88fe-40d7-b0fa-f93d1bac8a34 None None] Error in agent loop. Devices info: {'current': set(['tap3bbccdeb-0d', 'tap2cbadddb-48', 'tap2ff01acc-16', 'tap92ccd364-e1', 'tap1b585b2d-f7', 'tap6838b208-7e', 'tapf03a19db-48', 'tap294b5031-17', 'tapa0084edd-d4', 'tap6457a7f6-65', 'tap91c29239-c1']), 'removed': set([]), 'added': set([]), 'updated': set([u'tapa0084edd-d4'])}
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent Traceback (most recent call last):
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1191, in daemon_loop
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     sync = self.process_network_devices(device_info)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 994, in process_network_devices
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1070, in treat_devices_added_updated
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     device_details['admin_state_up'])
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 689, in ensure_port_admin_state
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     ip_lib.IPDevice(tap_name).link.set_up()
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 461, in set_up
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     return self._as_root([], ('set', self.name, 'up'))
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 321, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     use_root_namespace=use_root_namespace)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 94, in _as_root
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=self.log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 103, in _execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     log_fail_as_error=log_fail_as_error)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 140, in execute
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent     raise RuntimeError(msg)
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent RuntimeError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "tapa0084edd-d4"
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent
  2016-01-07 17:43:52.263 23166 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent

  The logs [1][2] show two scenarios where this happens:
  - hard reboot of an instance (instance is destroyed and defined again)
  - destroy of an instance

  After that , the agent triggers a resync and gets everything fine
  again

  The solution could be to check for the device existence in
  ensure_port_admin_state method. Maybe we should check if there's a
  more elegant way to do it to avoid the resync...

  [1] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-n-cpu.txt.gz#_2016-01-07_17_43_50_991
  [2] http://logs.openstack.org/18/246318/14/check/gate-tempest-dsvm-neutron-linuxbridge/cdded9a/logs/screen-q-agt.txt.gz#_2016-01-07_17_43_45_961

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1532171/+subscriptions


References