yahoo-eng-team team mailing list archive

[Bug 1506242] [NEW] If instance spawn fails and shutdown_instance also fails, a new exception is raised, masking original spawn failure

 

Public bug reported:

While building and running an instance, nova-compute calls spawn on the
virt driver, and spawn can fail for several reasons.

For example, with Ironic the spawn call can fail if the deploy callback
times out.

If this call fails, nova-compute catches the exception, saves it for
re-raising, and calls shutdown_instance in a try-except block [1]. The
problem is that if this shutdown_instance call also fails, a new
exception, 'BuildAbortException', is raised instead. This masks the
original spawn failure.
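
The masking can be reproduced outside nova with a minimal sketch. This
is not the code at [1]: the exception classes and the spawn and
shutdown_instance functions below are hypothetical stand-ins that
always fail, and only the control flow follows the description above.

```python
class SpawnError(Exception):
    """Hypothetical stand-in for a virt driver spawn failure."""


class ShutdownError(Exception):
    """Hypothetical stand-in for a failure during cleanup."""


class BuildAbortException(Exception):
    """Hypothetical stand-in, not nova's actual exception class."""


def spawn():
    # Assumed failure mode: the deploy callback never arrives.
    raise SpawnError("deploy callback timeout")


def shutdown_instance():
    # Assumed to fail for the same underlying reason as spawn.
    raise ShutdownError("cleanup timed out")


def build_and_run_instance():
    try:
        spawn()
    except Exception:
        # The spawn error is meant to be re-raised after cleanup...
        try:
            shutdown_instance()
        except Exception as shutdown_exc:
            # ...but a failed cleanup raises a different exception,
            # and its message carries only the cleanup error.
            raise BuildAbortException(
                "Could not clean up failed build: %s" % shutdown_exc)
        raise


try:
    build_and_run_instance()
except Exception as exc:
    # What a caller recording the instance fault would see.
    reported_fault = "%s: %s" % (type(exc).__name__, exc)
```

In this sketch the reported fault names BuildAbortException and the
cleanup error, and the spawn error message does not appear in it.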

This can cause problems for Ironic: if deployment failed due to a
timeout, there is a good chance that shutdown_instance will also fail
for the same reason, since it involves zapping etc. The original
deployment failure is then not propagated back as the instance fault.
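
One possible way to keep the original failure visible, sketched under
assumptions: include the spawn error in the new exception's message and
attach it as the explicit cause. This is only an illustration, not a
proposed nova patch. All names are hypothetical stand-ins, and the
"raise ... from" form requires Python 3.

```python
class BuildAbortException(Exception):
    """Hypothetical stand-in, not nova's actual exception class."""


def spawn():
    # Assumed failure mode: the deploy callback never arrives.
    raise RuntimeError("deploy callback timeout")


def shutdown_instance():
    # Assumed to fail for the same underlying reason as spawn.
    raise RuntimeError("cleanup timed out")


def build_and_run_instance():
    try:
        spawn()
    except Exception as spawn_exc:
        try:
            shutdown_instance()
        except Exception as shutdown_exc:
            # Keep both failures visible: the spawn error stays in the
            # message and is attached as the explicit cause.
            raise BuildAbortException(
                "spawn failed (%s); cleanup also failed (%s)"
                % (spawn_exc, shutdown_exc)) from spawn_exc
        # Cleanup succeeded: re-raise the original spawn error.
        raise


try:
    build_and_run_instance()
except BuildAbortException as exc:
    fault_message = str(exc)
    original_cause = exc.__cause__
```

Here the fault message mentions both failures, and the original spawn
error remains reachable through the exception's __cause__ attribute.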


[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2171-L2191

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1506242
