
yahoo-eng-team team mailing list archive

[Bug 1184470] Re: baremetal driver needs a state between "building" and "deploying"

 

I took a look at how the Ironic PXE driver handles this sort of
situation. I think it's OK and not affected by the precise
circumstances described in this bug, but there may be a similar
difficulty in determining why a deploy failed part-way through.

I've tagged the bug as also-affecting and will look into it.

** Also affects: ironic
   Importance: Undecided
       Status: New

** Changed in: ironic
     Assignee: (unassigned) => Devananda van der Veen (devananda)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1184470

Title:
  baremetal driver needs a state between "building" and "deploying"

Status in Ironic (Bare Metal Provisioning):
  New
Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  It is not possible to tell from the baremetal node status that a
  deployment has failed because a machine's BIOS hung or was improperly
  configured. This would be discernible with an additional state change
  between BUILDING and DEPLOYING.

  Details
  =======

  During a baremetal deployment, the state is tracked in the
  nova_bm.bm_nodes table. The state is set to BUILDING when
  virt/driver/baremetal.py:driver.spawn() acquires the node and begins
  preparing the deployment. After the power_driver's activate_node()
  method is called, the PXE driver enters a wait loop until the
  deployment finishes. The state is changed to DEPLOYING when
  baremetal-deploy-helper receives a connection from the deployment
  ramdisk, and is then set to either DEPLOYDONE or DEPLOYFAIL,
  depending on the outcome.

  There is a middle step which is not currently represented. If the
  baremetal node powers on but never connects to the deploy-helper, it
  is impossible to tell from the database whether the deploy environment
  was never created or whether the machine itself is dead.
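
  The flow described above can be sketched as a small transition table.
  This is only an illustration, not the nova code: the state names come
  from this report, but the string values, the ALLOWED table, and the
  advance() helper are assumptions made for the example.

```python
# Illustrative only: state names come from the bug report; the string
# values and the helper below are hypothetical, not nova's actual code.
BUILDING = "building"
DEPLOYING = "deploying"
DEPLOYDONE = "deploy complete"
DEPLOYFAIL = "deploy failed"

# Transitions as described in the report.
ALLOWED = {
    BUILDING: {DEPLOYING},
    DEPLOYING: {DEPLOYDONE, DEPLOYFAIL},
    DEPLOYDONE: set(),
    DEPLOYFAIL: set(),
}


def advance(current, new):
    """Return the new state, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError("illegal transition: %s -> %s" % (current, new))
    return new


# A node whose deploy environment was never finished and a node that was
# powered on but hung in BIOS both remain in BUILDING, so the stored
# state alone cannot tell them apart.
state_before_power_on = BUILDING
state_after_power_on = BUILDING
```

  In this model nothing is recorded between BUILDING and DEPLOYING, which
  is the gap the report describes.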

  Proposed fix
  ============

  Add a PREPARED state to baremetal_states.py, and set the node to this
  state immediately after calling activate_node().
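
  A minimal sketch of the proposed change, under stated assumptions: the
  node is modelled as a plain dict, FakePowerDriver and diagnose_stalled()
  are hypothetical stand-ins, and the string values are invented for the
  example. Only the names PREPARED, BUILDING, DEPLOYING, spawn() and
  activate_node() come from this report.

```python
# Illustrative only: a hypothetical, in-memory stand-in for the spawn path.
BUILDING = "building"
PREPARED = "prepared"  # the proposed new state
DEPLOYING = "deploying"


class FakePowerDriver(object):
    """Hypothetical power driver that only records the power-on request."""

    def __init__(self):
        self.powered_on = False

    def activate_node(self):
        self.powered_on = True


def spawn(node, power_driver):
    """Prepare the deployment, power the node on, then mark it PREPARED."""
    node["state"] = BUILDING
    # The deploy environment would be set up here in a real driver.
    power_driver.activate_node()
    # Proposed change: record that power-on was requested.
    node["state"] = PREPARED
    return node


def diagnose_stalled(node):
    """Interpret the state of a node that never reached DEPLOYING."""
    if node["state"] == BUILDING:
        return "deploy environment was not fully prepared"
    if node["state"] == PREPARED:
        return "node was powered on but never contacted the deploy helper"
    return "no stall detected from state alone"
```

  With the extra state, a node stuck in PREPARED points at the machine
  (BIOS hang, misconfiguration), while a node stuck in BUILDING points at
  the preparation step.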

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1184470/+subscriptions