yahoo-eng-team team mailing list archive

[Bug 1651650] Re: XenAPI: server rescue test sometime failed with timeout waiting for vif plugging

 

Reviewed:  https://review.openstack.org/413469
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2207dcf560413b213a8fb3737bb4b0923dcd96e0
Submitter: Jenkins
Branch:    master

commit 2207dcf560413b213a8fb3737bb4b0923dcd96e0
Author: Huan Xie <huan.xie@xxxxxxxxxx>
Date:   Tue Dec 20 23:26:49 2016 -0800

    XenAPI: Fix vif plug problem during VM rescue/unrescue
    
    During VM rescue tests, we found that the nova XenServer driver
    failed because it timed out waiting for the vif-plug event from
    neutron. On checking the nova and neutron logs, we found several
    mistakes in the nova driver:
    (1) After several rounds of rescuing/unrescuing, it waits for the
    vif-plug event, although it should not wait for that event.
    (2) The neutron log shows that the port status sometimes changes
    during rescuing/unrescuing, which also should not happen.
    (3) Each time a VM boots, the nova code deletes and re-creates the
    tap device, which is used by neutron security groups; this
    delete/re-create action causes a port status change that should
    not occur.
    (4) Adding/deleting security groups on a VM's port triggers a port
    status change, e.g. from ACTIVE to BUILDING, but in the rescue
    scenario we also depend on the VIF's status to determine whether
    to wait for the vif plug event, which is not appropriate.
    
    This patch fixes the above problems; another patch enables the
    excluded rescue tests in order to test this fix:
    https://review.openstack.org/#/c/416197/
    
    Closes-Bug: #1651650
    
    Change-Id: I32c66733330bc9877caea7e2a2290c02b3906708
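
The decision described in points (1) and (4) of the commit message can be sketched roughly as below. This is only an illustration under stated assumptions: `events_to_wait_for`, its `rescue` flag, and the dict-shaped VIFs with `id` and `active` keys are hypothetical names chosen for the example, not the code changed by the patch.

```python
def events_to_wait_for(network_info, rescue=False):
    """Return the (event name, VIF id) pairs a spawn should block on.

    Hypothetical helper: `network_info` is assumed to be a list of dicts
    with "id" and "active" keys.
    """
    if rescue:
        # Assumption: rescue/unrescue reuses ports that were plugged
        # earlier, so a transient status change must not cause a wait.
        return []
    # For a normal boot, wait only for VIFs that are not active yet.
    return [("network-vif-plugged", vif["id"])
            for vif in network_info
            if vif.get("active") is False]


vifs = [{"id": "52d79a78-7205-4e69-8005-76a3cebbf267", "active": False}]
normal_boot = events_to_wait_for(vifs)
rescue_boot = events_to_wait_for(vifs, rescue=True)
```

With this shape, a port that briefly leaves ACTIVE during rescue no longer makes the driver wait for an event that may never arrive.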


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1651650

Title:
  XenAPI: server rescue test sometime failed with timeout waiting for
  vif plugging

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Observed several failures in the Citrix XenServer CI for this test case:
  tempest.api.compute.servers.test_server_rescue

  The log shows a timeout waiting for the vif:

  $ grep 'Timeout waiting for vif plugging callbac' screen-n-cpu.txt.gz
  2016-12-20 10:58:52.036 4528 WARNING nova.virt.xenapi.vmops [req-ff027cef-59be-4326-95e1-065f68077d63 tempest-ServerRescueTestJSON-1293983176 tempest-ServerRescueTestJSON-1293983176] [instance: 28b094ee-c571-4083-b72b-5ea78f1f4291] Timeout waiting for vif plugging callback

  For rescue, it seems nova shouldn't wait for this event, as the port should already be active when rescuing starts.
  But we observed that the neutron service reported a 2nd network-vif-plugged event:

  
  2016-12-20 10:52:31.689 712 DEBUG neutron.notifiers.nova [-] Sending events: [{'status': 'completed', 'tag': u'52d79a78-7205-4e69-8005-76a3cebbf267', 'name': 'network-vif-plugged', 'server_uuid': u'28b094ee-c571-4083-b72b-5ea78f1f4291'}] send_events /opt/stack/new/neutron/neutron/notifiers/nova.py:248

  2016-12-20 10:53:45.179 712 DEBUG neutron.notifiers.nova [-] Sending events: [{'status': 'completed', 'tag': u'52d79a78-7205-4e69-8005-76a3cebbf267', 'name': 'network-vif-plugged', 'server_uuid': u'28b094ee-c571-4083-b72b-5ea78f1f4291'}] send_events /opt/stack/new/neutron/neutron/notifiers/nova.py:248

  
  And nova only starts waiting for this event after the 2nd event has been sent out, so it never catches the 2nd event:
  2016-12-20 10:53:46.326 4528 DEBUG nova.virt.xenapi.vmops [req-ff027cef-59be-4326-95e1-065f68077d63 tempest-ServerRescueTestJSON-1293983176 tempest-ServerRescueTestJSON-1293983176] wait for instance event:[('network-vif-plugged', u'52d79a78-7205-4e69-8005-76a3cebbf267')] _spawn /opt/stack/new/nova/nova/virt/xenapi/vmops.py:599
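
The ordering problem in the logs above (event sent at 10:53:45, waiter registered at 10:53:46) can be reproduced with a minimal sketch. `EventRegistry`, `prepare`, and `deliver` are hypothetical names used only for illustration; they are not nova's API, and the sketch assumes that an event arriving with no registered waiter is simply dropped.

```python
import threading


class EventRegistry:
    """Hypothetical stand-in for a per-instance external-event registry."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = {}

    def prepare(self, key):
        """Register interest in `key` and return the Event to wait on."""
        with self._lock:
            return self._waiters.setdefault(key, threading.Event())

    def deliver(self, key):
        """Deliver an event; it is dropped if nobody registered for it."""
        with self._lock:
            waiter = self._waiters.pop(key, None)
        if waiter is None:
            return False
        waiter.set()
        return True


KEY = ("network-vif-plugged", "52d79a78-7205-4e69-8005-76a3cebbf267")

# Order seen in the logs: the event is sent first, the waiter registers later.
late = EventRegistry()
late_delivered = late.deliver(KEY)
late_result = late.prepare(KEY).wait(timeout=0.1)   # times out

# Safe order: register the waiter before the event can be sent.
early = EventRegistry()
waiter = early.prepare(KEY)
early_delivered = early.deliver(KEY)
early_result = waiter.wait(timeout=0.1)             # returns at once
```

In the first ordering the wait can only end by timeout, which matches the "Timeout waiting for vif plugging callback" warning.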

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1651650/+subscriptions

