yahoo-eng-team team mailing list archive

[Bug 1693315] Re: Unhelpful invalid bdm error in compute logs when volume creation fails during boot from volume

 

Reviewed:  https://review.openstack.org/467715
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=20c4715a49a44c642882618f102cd0fc9342978d
Submitter: Jenkins
Branch:    master

commit 20c4715a49a44c642882618f102cd0fc9342978d
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Thu Jun 15 11:46:44 2017 -0400

    Provide original fault message when BFV fails
    
    When booting from volume and Nova is creating the volume,
    it can fail (timeout, invalid AZ in Cinder, etc) and the
    generic Exception handling in _prep_block_device will log
    the original exception trace but then raise a generic
    InvalidBDM exception, which is handled higher up and converted
    to a BuildAbortException, which is recorded as an instance
    fault, but the original error message is lost from the fault.
    
    It would be better to include the original exception message that
    triggered the failure so that it goes into the fault for debugging.
    
    For example, this is the difference between getting an error like this:
    
      BuildAbortException: Build of instance
      9484f5a7-3198-47ff-b728-178515a26277 aborted:
      Block Device Mapping is Invalid.
    
    and getting something more useful like this:
    
      BuildAbortException: Build of instance
      9484f5a7-3198-47ff-b728-178515a26277 aborted:
      Volume da947c97-66c6-4b7e-9ae6-54eb8128bb75 did not finish
      being created even after we waited 3 seconds or 2 attempts.
      And its status is error.
    
    Change-Id: I20a5e8e5e10dd505c1b24c208f919c6550e9d1a4
    Closes-Bug: #1693315
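
The exception chain described in the commit message can be sketched roughly
as follows. This is a minimal, self-contained illustration and not nova's
actual code: the exception classes are simplified stand-ins, and the names
prep_block_device, build_instance, fault_message and the
keep_original_message flag are hypothetical, introduced only to show the
behaviour before and after the change side by side.

```python
# Simplified stand-ins for the exceptions named in the bug report.
# They are assumptions for illustration, not nova's real classes.
class VolumeNotCreated(Exception):
    pass


class InvalidBDM(Exception):
    def __init__(self, message="Block Device Mapping is Invalid."):
        super().__init__(message)


class BuildAbortException(Exception):
    def __init__(self, instance_uuid, reason):
        super().__init__(
            "Build of instance %s aborted: %s" % (instance_uuid, reason))


INSTANCE = "9484f5a7-3198-47ff-b728-178515a26277"
VOLUME_ERROR = (
    "Volume da947c97-66c6-4b7e-9ae6-54eb8128bb75 did not finish being "
    "created even after we waited 3 seconds or 2 attempts. "
    "And its status is error.")


def create_volume():
    # Simulates the volume creation failure from the traceback.
    raise VolumeNotCreated(VOLUME_ERROR)


def prep_block_device(attach, keep_original_message):
    # Hypothetical analogue of the generic handler in _prep_block_device.
    try:
        attach()
    except Exception as exc:
        if keep_original_message:
            # After the fix: carry the original message forward.
            raise InvalidBDM(str(exc)) from exc
        # Before the fix: only the generic default message survives.
        raise InvalidBDM() from exc


def build_instance(attach, keep_original_message):
    # Higher up, InvalidBDM is converted to BuildAbortException, whose
    # message is what ends up in the recorded instance fault.
    try:
        prep_block_device(attach, keep_original_message)
    except InvalidBDM as exc:
        raise BuildAbortException(INSTANCE, str(exc)) from exc


def fault_message(keep_original_message):
    try:
        build_instance(create_volume, keep_original_message)
    except BuildAbortException as exc:
        return str(exc)
```

Calling fault_message(False) yields the generic "Block Device Mapping is
Invalid." text, while fault_message(True) yields the volume-specific
message, which mirrors the two example errors quoted in the commit message.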


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693315

Title:
  Unhelpful invalid bdm error in compute logs when volume creation fails
  during boot from volume

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed

Bug description:
  This came up in IRC while debugging a separate problem with a user.

  They were booting from volume, with nova creating the volume, and
  ended up with this unhelpful error message:

  BuildAbortException: Build of instance
  9484f5a7-3198-47ff-b728-178515a26277 aborted: Block Device Mapping is
  Invalid.

  That comes from this generic exception, which is raised here:

  https://github.com/openstack/nova/blob/81bdbd0b50aeac9a677a0cef9001081008a2c407/nova/compute/manager.py#L1595

  The actual exception in the traceback is much more specific:

  http://paste.as47869.net/p/9qbburh7z3w3toi

  2017-05-24 16:33:26.127 2331 ERROR nova.compute.manager [instance:
  9484f5a7-3198-47ff-b728-178515a26277] VolumeNotCreated: Volume
  da947c97-66c6-4b7e-9ae6-54eb8128bb75 did not finish being created even
  after we waited 3 seconds or 2 attempts. And its status is error.

  That shows that the volume failed to be created almost immediately.

  It would be better to include that error message in the
  BuildAbortException, which is what ultimately goes into the recorded
  instance fault:

  https://github.com/openstack/nova/blob/81bdbd0b50aeac9a677a0cef9001081008a2c407/nova/compute/manager.py#L1878

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693315/+subscriptions

