yahoo-eng-team team mailing list archive

[Bug 1732506] [NEW] Baremetal instance stuck in BUILD state following ironic node tear down or delete

 

Public bug reported:

Description
===========

A baremetal (ironic) instance can become stuck in the BUILD state if the
ironic node to which the instance has been assigned is either deleted or
torn down manually while the instance is being built.

Steps to reproduce
==================

* Create a nova instance that will be scheduled onto baremetal.
* Determine to which node the instance has been scheduled via 'openstack baremetal node show --instance <instance UUID>'
* Wait for the ironic node to enter the 'wait call-back' state.
* Tear down the node manually via 'openstack baremetal node undeploy <node>'
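
The steps above could be scripted roughly as follows. This is only a sketch: it assumes the `openstack` CLI with the baremetal plugin is installed and authenticated, and that the `uuid` and `provision_state` columns are selected with the standard `-f value -c <column>` output options. The helper names (`node_field`, `undeploy_when_waiting`) and the polling parameters are hypothetical, introduced here for illustration.

```python
import subprocess
import time


def node_field(instance_uuid, field, run=subprocess.check_output):
    """Return one field of the ironic node assigned to a nova instance."""
    cmd = ["openstack", "baremetal", "node", "show",
           "--instance", instance_uuid, "-f", "value", "-c", field]
    return run(cmd, text=True).strip()


def undeploy_when_waiting(instance_uuid, run=subprocess.check_output,
                          poll_seconds=5, max_polls=120):
    """Undeploy the node once it reaches the 'wait call-back' state.

    `run` is injectable so the logic can be exercised without a real cloud.
    Returns the UUID of the node that was torn down.
    """
    node_uuid = node_field(instance_uuid, "uuid", run)
    for _ in range(max_polls):
        state = node_field(instance_uuid, "provision_state", run)
        if state == "wait call-back":
            # Tear the node down while nova is still building the instance.
            run(["openstack", "baremetal", "node", "undeploy", node_uuid],
                text=True)
            return node_uuid
        time.sleep(poll_seconds)
    raise TimeoutError("node never reached 'wait call-back'")
```

Calling `undeploy_when_waiting("<instance UUID>")` right after creating the instance should trigger the condition described in this report, provided the node passes through 'wait call-back' within the polling window.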

Expected results
================

The ironic node becomes 'available'. The nova instance detects the
change in ironic, cleans up, and moves to an ERROR state.

Actual results
==============

The ironic node becomes 'available'. The nova instance detects the
change in ironic and cleans up the instance's networks, but remains in
the BUILD state instead of moving to ERROR.

Environment
===========

Pike, deployed using kolla-ansible on a CentOS host with RDO packages in
CentOS containers.

openstack-nova-api-16.0.0-1.el7.noarch

Thoughts
========

I believe this is happening because the nova ironic virt driver raises
InstanceNotFound [1][2] when the ironic node is deleted or torn down.
The nova compute manager [3] interprets this exception as meaning that
the nova instance itself was deleted, and therefore leaves the
instance's state unchanged, since there should be no instance left to
update.

[1] https://github.com/openstack/nova/blob/2aa5fb3385c5c15259e0c749c46371462789dc6d/nova/virt/ironic/driver.py#L188
[2] https://github.com/openstack/nova/blob/2aa5fb3385c5c15259e0c749c46371462789dc6d/nova/virt/ironic/driver.py#L490
[3] https://github.com/openstack/nova/blob/2aa5fb3385c5c15259e0c749c46371462789dc6d/nova/compute/manager.py#L1901
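
The suspected mechanism can be illustrated with a toy model. This is not nova's actual code: `InstanceNotFound` here is a local stand-in for the nova exception of the same name, and `FakeInstance`, `driver_spawn` and `build_instance` are hypothetical names that only mimic the control flow described above, in which a single exception type is used for both "instance deleted" and "node gone".

```python
class InstanceNotFound(Exception):
    """Local stand-in for nova's InstanceNotFound exception."""


class FakeInstance:
    """Hypothetical minimal instance record."""

    def __init__(self):
        self.vm_state = "building"
        self.networks_cleaned = False


def driver_spawn(node_exists):
    """Toy virt driver: raises InstanceNotFound when the *node* is gone."""
    if not node_exists:
        raise InstanceNotFound("no node found for this instance")


def build_instance(instance, node_exists):
    """Toy compute manager build path."""
    try:
        driver_spawn(node_exists)
        instance.vm_state = "active"
    except InstanceNotFound:
        # Interpreted as "the instance was deleted during the build":
        # clean up networks, but do not touch the instance state.
        instance.networks_cleaned = True
    except Exception:
        # Any other failure would move the instance to an error state.
        instance.vm_state = "error"
    return instance.vm_state


stuck = FakeInstance()
print(build_instance(stuck, node_exists=False), stuck.networks_cleaned)
```

In this model the instance ends with its networks cleaned up but its state untouched, which matches the actual results reported above. If this analysis is right, a fix would need the driver or the manager to distinguish a missing node from a deleted instance.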

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1732506
