
yahoo-eng-team team mailing list archive

[Bug 1745072] Re: Xenapi: Migration failure of Volume Backed VHDs

 

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Changed in: nova/queens
       Status: New => Confirmed

** Changed in: nova/rocky
       Status: New => Confirmed

** Changed in: nova/rocky
   Importance: Undecided => Low

** Changed in: nova/queens
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1745072

Title:
  Xenapi: Migration failure of Volume Backed VHDs

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  In Progress

Bug description:
  Description
  ===========
  The current XenAPI resize code in _process_ephemeral_chain_recursive attempts to migrate volume-backed VHD chains rather than detaching them from the source hypervisor and re-attaching them to the destination. The code takes this path because ephemeral drives may exist and need their VHD chains migrated after the initial set of base VHD chains. Attempting to snapshot the VDI of a volume-backed drive causes XenAPI to raise SR_OPERATION_NOT_SUPPORTED and the migration to fail: these VDIs do not list snapshot in their allowed_operations, and the VHD associated with such a VDI is only a stub.
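
  As a rough illustration only (not the upstream fix), the snapshot call
  shown in the traceback below could be guarded by checking the VDI's
  allowed_operations first; the helper name _snapshot_if_supported is
  hypothetical, and only the session.call_xenapi wrapper visible in the
  traceback is assumed:

  def _snapshot_if_supported(session, vdi_ref):
      # Volume-backed stub VDIs do not advertise 'snapshot' in their
      # allowed_operations, which is why VDI.snapshot raises
      # SR_OPERATION_NOT_SUPPORTED for them.
      allowed = session.call_xenapi("VDI.get_allowed_operations", vdi_ref)
      if "snapshot" not in allowed:
          # Leave the volume alone; it should be re-attached on the
          # destination host instead of being migrated as a VHD chain.
          return None
      return session.call_xenapi("VDI.snapshot", vdi_ref, {})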

  Steps to reproduce
  ==================
  1. Create a server of any size or flavor
  2. Attach enough volumes that a VBD with userdevice 4 or greater is created
     (see the verification sketch after this list).
  3. Attempt to migrate the server (a cold migration, not a live migration).
  4. The migration will fail during the ephemeral snapshot process.
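
  For step 2, a hypothetical standalone snippet (not part of Nova) using the
  XenAPI Python bindings can confirm that an attached volume occupies
  userdevice 4 or greater; the host URL, credentials, and VM name-label are
  placeholders:

  import XenAPI

  session = XenAPI.Session("http://xenserver-host")       # placeholder pool master
  session.xenapi.login_with_password("root", "password")  # placeholder credentials
  try:
      vm_ref = session.xenapi.VM.get_by_name_label("instance-00000001")[0]
      for vbd_ref in session.xenapi.VM.get_VBDs(vm_ref):
          if session.xenapi.VBD.get_type(vbd_ref) != "Disk":
              continue  # skip CD drives
          userdevice = session.xenapi.VBD.get_userdevice(vbd_ref)
          vdi_ref = session.xenapi.VBD.get_VDI(vbd_ref)
          ops = session.xenapi.VDI.get_allowed_operations(vdi_ref)
          # Volume-backed stubs will report that 'snapshot' is not allowed.
          print("%s %s" % (userdevice, "snapshot" in ops))
  finally:
      session.xenapi.session.logout()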

  Expected result
  ===============
  During a migration, the volumes should be detached from the source host and re-attached to the destination, with no attempt to "migrate" them beyond moving their connection point.
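
  As an illustration of that expectation (a sketch only, not Nova's actual
  detach path), releasing a volume on the source host amounts to removing its
  VBD while leaving the Cinder volume and its SR untouched:

  def _detach_volume_vbd(session, vbd_ref, vm_running):
      if vm_running:
          # A running VM needs the device hot-unplugged first.
          session.call_xenapi("VBD.unplug", vbd_ref)
      # Destroying the VBD removes only the attachment record; the volume
      # itself remains available for re-attachment on the destination.
      session.call_xenapi("VBD.destroy", vbd_ref)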

  Actual result
  =============
  The current resize migration code treats every VBD with userdevice 4 or greater as an ephemeral drive and attempts to snapshot and migrate its VHD, which causes errors when the device is actually a volume-backed drive.
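
  A hedged sketch of how the volume devices could be identified instead of
  relying on the userdevice number alone: Nova passes block_device_info into
  the resize path, and its 'block_device_mapping' entries carry the mount
  device (e.g. /dev/xvde), from which the occupied userdevices can be derived.
  The helper name _volume_userdevices is hypothetical:

  def _volume_userdevices(block_device_info):
      devices = set()
      mapping = (block_device_info or {}).get('block_device_mapping', [])
      for vol in mapping:
          # '/dev/xvde' -> 'e' -> userdevice 4 ('a' maps to 0);
          # trailing partition digits, if any, are stripped first.
          letter = vol['mount_device'].rstrip('0123456789')[-1]
          devices.add(str(ord(letter) - ord('a')))
      return devices

  With such a set available, _process_ephemeral_chain_recursive could skip
  those userdevices rather than treating everything at userdevice 4 and above
  as ephemeral.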

  Environment
  ===========
  1. Exact version of OpenStack you are running:

     Liberty; the issue also exists in current master.

  2. Which hypervisor did you use?

     XenServer hypervisors, all versions 6.0+.

  3. Which storage type did you use?

     Local SSD RAID10 + iSCSI NAS.

  4. Which networking type did you use?

     Neutron with Open vSwitch.

  Logs & Configs
  ==============

  Please note that the nova-compute and xensource.log excerpts below come
  from two different occurrences of the failure, but they present in the
  same way.

  ----------------------------- 
  - Logs from Nova-Compute:
  -----------------------------

  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 212, in inner
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     rv = f(*args, **kwargs)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 1205, in transfer_ephemeral_disks_then_all_leaf_vdis
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     _process_ephemeral_chain_recursive(ephemeral_chains, [])
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 1170, in _process_ephemeral_chain_recursive
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     vm_ref, label, str(userdevice)) as chain_vdi_uuids:
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return self.gen.next()
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vm_utils.py", line 748, in _snapshot_attached_here_impl
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     snapshot_ref = _vdi_snapshot(session, vm_vdi_ref)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vm_utils.py", line 646, in _vdi_snapshot
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return session.call_xenapi("VDI.snapshot", vdi_ref, {})
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/client/session.py", line 212, in call_xenapi
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return session.xenapi_request(method, args)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/XenAPI.py", line 133, in xenapi_request
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     result = _parse_result(getattr(self, methodname)(*full_params))
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/XenAPI.py", line 203, in _parse_result
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     raise Failure(result['ErrorDescription'])
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415] Failure: ['SR_OPERATION_NOT_SUPPORTED', 'OpaqueRef:33dd3d47-abd4-0be7-e83b-a1ffd290292a']
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]
  2015-11-23 20:44:59.502 10060 INFO nova.compute.manager [req-c17b3702-077c-40bb-9487-5598b6ac0a9a 3742 391232 - - -] [instance: df05e626-246b-4676-990f-d67bcc9c0415] Setting instance back to ACTIVE after: Instance rollback performed due to: ['SR_OPERATION_NOT_SUPPORTED', 'OpaqueRef:33dd3d47-abd4-0be7-e83b-a1ffd290292a']

  ------------------------------
  - Logs from Xensource.log
  ------------------------------

  /var/log/xensource.log:Oct 12 02:46:20 localhost xapi: [debug|24-46-53-471027|36242831 INET 0.0.0.0:80|VDI.snapshot R:a42bf8a64bd2|xapi] Caught exception while SR_OPERATION_NOT_SUPPORTED: [ OpaqueRef:8c63e240-d80c-4354-4a58-d0395abf7425 ] in message forwarder: marking SR for VDI.snapshot
  /var/log/xensource.log:Oct 12 02:46:20 localhost xapi: [debug|24-46-53-471027|36242831 INET 0.0.0.0:80|VDI.snapshot R:a42bf8a64bd2|dispatcher] Server_helpers.exec exception_handler: Got exception SR_OPERATION_NOT_SUPPORTED: [ OpaqueRef:8c63e240-d80c-4354-4a58-d0395abf7425 ]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1745072/+subscriptions

