
yahoo-eng-team team mailing list archive

[Bug 1694371] Re: Timeout while waiting for network-vif-plugged event during server rebuild

 

Reviewed:  https://review.openstack.org/473685
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0d7952400be3e60973a30fae76a82ec5c8259ffc
Submitter: Jenkins
Branch:    master

commit 0d7952400be3e60973a30fae76a82ec5c8259ffc
Author: Kevin Benton <kevin@xxxxxxxxxx>
Date:   Mon Jun 12 21:56:09 2017 -0700

    Trigger port status DOWN on VIF replug
    
    With the merge of push notifications, processing a port update no
    longer automatically implies a transition from ACTIVE to BUILD and
    back to ACTIVE.
    
    This caused a bug where Nova would unplug and replug an interface
    quickly during a rebuild and would never receive a vif-plugged
    event. Nothing in the data model was updated in a way that set the
    status to DOWN or BUILD, and the port would return before the agent
    processed it as a removed port and marked it DOWN.
    
    This fixes the bug by making the agent force the port to DOWN
    whenever it loses its VLAN. Watching for VLAN loss was already
    introduced to detect these fast unplug/plug events, so this change
    just adds the status update.
    
    Closes-Bug: #1694371
    Change-Id: Ice24eea2534fd6f3b103ec014218a65a45492b1f
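The mechanism the commit describes can be sketched roughly as follows. This is an illustrative simplification, not the actual Neutron agent code; the function and class names (`check_changed_vlans`, `FakePluginRpc`, `handle_vlan_loss`) are hypothetical stand-ins for the agent's polling loop and its RPC client to the Neutron server.

```python
def check_changed_vlans(current_ports, last_vlan_by_port):
    """Return ports whose local VLAN tag changed or vanished.

    `current_ports` maps port_id -> current VLAN tag (None if the
    interface was unplugged); `last_vlan_by_port` holds the tags seen
    on the previous polling iteration. Both structures are hypothetical.
    """
    changed = set()
    for port_id, old_tag in last_vlan_by_port.items():
        new_tag = current_ports.get(port_id)
        if new_tag != old_tag:
            # VLAN lost or replaced: treat as a fast unplug/replug.
            changed.add(port_id)
    return changed


class FakePluginRpc:
    """Stand-in for the agent's RPC client to the Neutron server."""
    def __init__(self):
        self.down_ports = []

    def update_device_down(self, port_id):
        # The server side would set the port status to DOWN here.
        self.down_ports.append(port_id)


def handle_vlan_loss(plugin_rpc, current_ports, last_vlan_by_port):
    # The fix in the commit above: besides re-processing the port,
    # explicitly report it DOWN so the subsequent re-plug produces the
    # DOWN -> BUILD -> ACTIVE transition Nova is waiting on, which in
    # turn triggers the network-vif-plugged notification.
    for port_id in check_changed_vlans(current_ports, last_vlan_by_port):
        plugin_rpc.update_device_down(port_id)
```

The point of forcing DOWN (rather than only re-wiring the port) is that the status change itself is what drives the later ACTIVE transition and hence the event Nova blocks on.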


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1694371

Title:
  Timeout while waiting for network-vif-plugged event during server
  rebuild

Status in heat:
  Invalid
Status in neutron:
  Fix Released
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  It seems the server goes to ERROR state during rebuild.

  traceback:

  2017-05-28 04:34:37 | Captured traceback:
  ~~~~~~~~~~~~~~~~~~~
  Traceback (most recent call last):
    File "/opt/stack/new/heat/heat_integrationtests/functional/test_snapshot_restore.py", line 74, in test_stack_snapshot_restore
      self.stack_restore(stack_identifier, snapshot_id)
    File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 626, in stack_restore
      self._wait_for_stack_status(stack_id, wait_for_status)
    File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 357, in _wait_for_stack_status
      fail_regexp):
    File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 321, in _verify_status
      stack_status_reason=stack.stack_status_reason)
  heat_integrationtests.common.exceptions.StackBuildErrorException: Stack StackSnapshotRestoreTest-1374582671/7fb8f800-1545-4e34-a6fa-3e2adbf4443a is in RESTORE_FAILED status due to 'Error: resources.my_server: Rebuilding server failed, status 'ERROR''

  
  Noticed at:

  http://logs.openstack.org/16/462216/16/check/gate-heat-dsvm-
  functional-convg-mysql-lbaasv2-py35-ubuntu-
  xenial/17c2da9/console.html#_2017-05-28_04_34_37_753094

  
  This looks like a nova issue, per the traceback below.

  http://logs.openstack.org/16/462216/16/check/gate-heat-dsvm-
  functional-convg-mysql-lbaasv2-py35-ubuntu-
  xenial/17c2da9/logs/screen-n-cpu.txt.gz?level=ERROR#_May_28_04_14_49_044455

  
  May 28 04:14:49 nova-compute[26709]: ERROR nova.compute.manager [instance: 45105d34-b970-4ced-968c-a1c4ead5b282]
    File "/opt/stack/new/nova/nova/compute/manager.py", line 6758, in _error_out_instance_on_exception
      yield
    File "/opt/stack/new/nova/nova/compute/manager.py", line 2814, in rebuild_instance
      bdms, recreate, on_shared_storage, preserve_ephemeral)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 2855, in _do_rebuild_instance_with_claim
      self._do_rebuild_instance(*args, **kwargs)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 2977, in _do_rebuild_instance
      self._rebuild_default_impl(**kwargs)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 2714, in _rebuild_default_impl
      block_device_info=new_block_device_info)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2741, in spawn
      destroy_disks_on_failure=True)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5152, in _create_domain_and_network
      raise exception.VirtualInterfaceCreateException()
  nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
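  The failure pattern behind this traceback can be sketched as follows. This is a minimal illustration of the wait-for-event idea, not Nova's actual API; the `EventWaiter` class and its method names are hypothetical. Nova plugs the VIFs and blocks until Neutron sends a network-vif-plugged event for each port; if no event arrives before the deadline (because the port never transitioned through DOWN, per the fix above), the rebuild aborts with VirtualInterfaceCreateException and the server lands in ERROR.

```python
import threading


class VirtualInterfaceCreateException(Exception):
    """Mirrors nova.exception.VirtualInterfaceCreateException."""


class EventWaiter:
    """Toy version of waiting for network-vif-plugged notifications."""

    def __init__(self):
        self._events = {}  # port_id -> threading.Event
        self._lock = threading.Lock()

    def expect(self, port_id):
        # Register interest before plugging, to avoid missing the event.
        with self._lock:
            self._events[port_id] = threading.Event()

    def notify_vif_plugged(self, port_id):
        # Called when Neutron reports the port ACTIVE again.
        with self._lock:
            ev = self._events.get(port_id)
        if ev:
            ev.set()

    def wait_all(self, timeout):
        for port_id, ev in self._events.items():
            if not ev.wait(timeout):
                # The bug: without the DOWN transition, Neutron never
                # re-sends the event, so every rebuild hits this path.
                raise VirtualInterfaceCreateException(
                    "no network-vif-plugged event for %s" % port_id)
```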

To manage notifications about this bug go to:
https://bugs.launchpad.net/heat/+bug/1694371/+subscriptions