
yahoo-eng-team team mailing list archive

[Bug 1745072] Re: Xenapi: Migration failure of Volume Backed VHDs

 

Reviewed:  https://review.openstack.org/533168
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eefb20e4658e17f91fa76b74fef6ff899babe51b
Submitter: Zuul
Branch:    master

commit eefb20e4658e17f91fa76b74fef6ff899babe51b
Author: Brooks Kaminski <brooks.kaminski@xxxxxxxxxxxxx>
Date:   Fri Jan 12 06:05:36 2018 -0600

    XenAPI/Stops the migration of volume backed VHDS
    
    This commit aims to correct problems with the resize_up codebase that allows
    the snapshot and migration of volume backed VDI/VHDs.  Since these are empty
    stub disks, and the XenAPI does not allow these VDIs to be snapped, this results
    in an SR_OPERATION_NOT_SUPPORTED or similar error on attempt.
    
    This change adds a check into the _process_ephemeral_chain_recursive method to
    run the current userdevice through volume_utils.is_booted_from_volume.  To
    achieve this, the method has been opened in scope to accept custom user_device
    objects.  In a future commit we will need to rename this method for clarity
    and update the callers that depend on it.  I have added a TODO for this to be
    done by myself. The check will ensure that the userdevice is not volume backed
    and then continue to snapshot and migrate the disk as needed, else increment
    and move on.
    
    Closes-Bug: #1745072
    Change-Id: I7cd2977c8268c1f73062b5d0b2b68ea686db99fe
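
The skip logic described in the commit message can be sketched roughly as follows. This is only an illustration, not the nova code: the `Disk` class, the `userdevices_to_migrate` function, and the `is_volume_backed` callback are hypothetical stand-ins (the real check named in the commit is volume_utils.is_booted_from_volume), and the starting userdevice of 4 is taken from the bug description.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption taken from the bug description: disks at userdevice 4 and
# above are the ones the resize code walks as possible ephemeral disks.
FIRST_EPHEMERAL_USERDEVICE = 4


@dataclass(frozen=True)
class Disk:
    """Hypothetical stand-in for a disk attached to the instance."""
    userdevice: int
    volume_backed: bool


def userdevices_to_migrate(
    disks: Iterable[Disk],
    is_volume_backed: Callable[[Disk], bool],
) -> List[int]:
    """Return the userdevices whose VHD chains should be snapshotted and migrated."""
    selected = []
    for disk in sorted(disks, key=lambda d: d.userdevice):
        if disk.userdevice < FIRST_EPHEMERAL_USERDEVICE:
            # Base disks are migrated separately; they are out of scope here.
            continue
        if is_volume_backed(disk):
            # Volume backed: only a stub VHD exists, so do not snapshot it.
            # Increment and move on, as the commit message describes.
            continue
        selected.append(disk.userdevice)
    return selected
```

With disks at userdevices 0, 4, 5 and 6, where only 5 is volume backed, this sketch selects 4 and 6 for snapshot and migration and leaves 5 to the normal volume detach and attach path.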


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1745072

Title:
  Xenapi: Migration failure of Volume Backed VHDs

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========
     The current XenAPI resize code in _process_ephemeral_chain_recursive attempts to migrate volume-backed VHD chains rather than detaching the volumes from the source hypervisor and re-attaching them to the destination.  This code runs because ephemeral drives may exist, and their VHD chains must be migrated after the initial set of base VHD chains.  Attempting to snapshot the VDI of a volume-backed drive causes XenAPI to raise SR_OPERATION_NOT_SUPPORTED, and the migration fails.  These VDIs do not have the allowed_operation needed to snapshot them, and the VHD associated with such a VDI is only a stub.

  Steps to reproduce
  ==================
  1. Create a server of any size or flavor
  2. Attach enough volumes to create VBD Userdevice=4 or greater.
  3. Attempt to migrate the server (Not live-migration)
  4. Migration will fail during ephemeral snapshot process.

  Expected result
  ===============
  During a migration, the volumes should be detached from the source server and then attached to the destination without any real attempt to "migrate" them beyond switching their connection points.

  Actual result
  =============
  The current resize migration code treats every disk with VBD userdevice 4 or higher as an ephemeral drive and attempts to snapshot and migrate its VHD, which fails when the disk is actually volume backed.
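
  To make the misclassification concrete, here is a minimal sketch of a position-only heuristic like the one described. It is not the nova implementation: the function names and the "root" / "volume" / "ephemeral" labels are hypothetical, and the threshold of 4 comes from the reproduction steps.

```python
from typing import Dict, List

# Assumption from the reproduction steps: userdevice 4 or greater is
# treated as ephemeral by the resize code.
EPHEMERAL_START = 4


def classify_by_userdevice(userdevice: int) -> str:
    """Position-only heuristic: the userdevice number alone decides the type."""
    return "ephemeral" if userdevice >= EPHEMERAL_START else "base"


def misclassified_volumes(attached: Dict[int, str]) -> List[int]:
    """Return userdevices of volumes that the heuristic would treat as ephemeral.

    `attached` maps userdevice number to the real disk kind
    ("root", "volume" or "ephemeral"); these labels are illustrative.
    """
    return sorted(
        userdevice
        for userdevice, kind in attached.items()
        if kind == "volume" and classify_by_userdevice(userdevice) == "ephemeral"
    )
```

  For an instance with a root disk at userdevice 0 and volumes at userdevices 1 and 4, only the volume at userdevice 4 is flagged, which matches the reproduction steps: the failure appears once enough volumes are attached to reach userdevice 4.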

  Environment
  ===========
  1. Exact version of OpenStack you are running:  
     
     Liberty; however, the issue still exists in the current code.

  2. Which hypervisor did you use?
     
     XenServer hypervisors, all versions 6.0+

  3. Which storage type did you use?
    
     Local SSD RAID10 + iSCSI NAS

  4. Which networking type did you use?
     
     Neutron with OpenVSwitch

  Logs & Configs
  ==============

  Please note that the nova-compute and xensource logs below come from two
  different incidents, but the failure presents the same way in both.

  ----------------------------- 
  - Logs from Nova-Compute:
  -----------------------------

  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 212, in inner
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     rv = f(*args, **kwargs)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 1205, in transfer_ephemeral_disks_then_all_leaf_vdis
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     _process_ephemeral_chain_recursive(ephemeral_chains, [])
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vmops.py", line 1170, in _process_ephemeral_chain_recursive
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     vm_ref, label, str(userdevice)) as chain_vdi_uuids:
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return self.gen.next()
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vm_utils.py", line 748, in _snapshot_attached_here_impl
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     snapshot_ref = _vdi_snapshot(session, vm_vdi_ref)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/vm_utils.py", line 646, in _vdi_snapshot
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return session.call_xenapi("VDI.snapshot", vdi_ref, {})
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/nova/virt/xenapi/client/session.py", line 212, in call_xenapi
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     return session.xenapi_request(method, args)
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/XenAPI.py", line 133, in xenapi_request
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     result = _parse_result(getattr(self, methodname)(*full_params))
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]   File "/opt/rackstack/rackstack.399.15/nova/lib/python2.7/site-packages/XenAPI.py", line 203, in _parse_result
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]     raise Failure(result['ErrorDescription'])
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415] Failure: ['SR_OPERATION_NOT_SUPPORTED', 'OpaqueRef:33dd3d47-abd4-0be7-e83b-a1ffd290292a']
  2015-11-23 20:44:59.396 10060 ERROR nova.virt.xenapi.vmops [instance: df05e626-246b-4676-990f-d67bcc9c0415]
  2015-11-23 20:44:59.502 10060 INFO nova.compute.manager [req-c17b3702-077c-40bb-9487-5598b6ac0a9a 3742 391232 - - -] [instance: df05e626-246b-4676-990f-d67bcc9c0415] Setting instance back to ACTIVE after: Instance rollback performed due to: ['SR_OPERATION_NOT_SUPPORTED', 'OpaqueRef:33dd3d47-abd4-0be7-e83b-a1ffd290292a']

  ------------------------------
  - Logs from Xensource.log
  ------------------------------

  /var/log/xensource.log:Oct 12 02:46:20 localhost xapi: [debug|24-46-53-471027|36242831 INET 0.0.0.0:80|VDI.snapshot R:a42bf8a64bd2|xapi] Caught exception while SR_OPERATION_NOT_SUPPORTED: [ OpaqueRef:8c63e240-d80c-4354-4a58-d0395abf7425 ] in message forwarder: marking SR for VDI.snapshot
  /var/log/xensource.log:Oct 12 02:46:20 localhost xapi: [debug|24-46-53-471027|36242831 INET 0.0.0.0:80|VDI.snapshot R:a42bf8a64bd2|dispatcher] Server_helpers.exec exception_handler: Got exception SR_OPERATION_NOT_SUPPORTED: [ OpaqueRef:8c63e240-d80c-4354-4a58-d0395abf7425 ]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1745072/+subscriptions

