
yahoo-eng-team team mailing list archive

[Bug 2077070] [NEW] Delete of nova-compute service fails if PCI devices

 

Public bug reported:

Hi,

If a nova-compute service has PCI devices attached, running "openstack
compute service delete <UUID>" appears to work, but in fact it does not
delete the node from the compute_nodes table. This is because, when the
node has PCI devices, a foreign key constraint ties its pci_devices rows
to the compute_nodes table. Therefore, the node cannot be deleted from
that table unless its rows in the pci_devices table are removed first.
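
To illustrate the mechanism described above, here is a minimal sketch
using Python's built-in sqlite3 module. The two-table schema is a
heavily simplified assumption and not nova's real schema, SQLite stands
in for the production database, and the host name "compute-1" and the
helper try_delete_node are hypothetical. It only shows how a foreign key
from pci_devices to compute_nodes blocks deletion of the parent row
until the child rows are gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, assumed schema: one parent table, one child table.
conn.executescript("""
    CREATE TABLE compute_nodes (id INTEGER PRIMARY KEY, host TEXT);
    CREATE TABLE pci_devices (
        id INTEGER PRIMARY KEY,
        compute_node_id INTEGER NOT NULL REFERENCES compute_nodes(id)
    );
    INSERT INTO compute_nodes (id, host) VALUES (43, 'compute-1');
    INSERT INTO pci_devices (compute_node_id) VALUES (43);
""")


def try_delete_node(conn, host):
    """Return True if the compute_nodes row was deleted, False if a
    foreign key constraint blocked the delete."""
    try:
        conn.execute("DELETE FROM compute_nodes WHERE host = ?", (host,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


# Blocked while a pci_devices row still references the node.
blocked_attempt = try_delete_node(conn, "compute-1")

# Remove the child rows first, then the parent delete goes through.
conn.execute("DELETE FROM pci_devices WHERE compute_node_id = 43")
conn.commit()
second_attempt = try_delete_node(conn, "compute-1")

nodes_left = conn.execute("SELECT COUNT(*) FROM compute_nodes").fetchone()[0]
```

In this sketch the first delete is rejected and the second succeeds,
which matches the ordering problem reported here.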

As a consequence, after re-adding the node to the cluster, any VM that
someone tries to spawn on that compute node goes into error with the
message "the host isn't mapped to any cell", even if discover_hosts
has already been run, which is very misleading.

To fix this, I had to run:
DELETE FROM pci_devices WHERE compute_node_id='43';
DELETE FROM compute_nodes WHERE host='<host-name>';

and then "openstack compute service delete <UUID>".
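
The two DELETE statements above can be sketched as a single
transaction that looks up the node ID from the host name instead of
hard-coding it. This is only an illustration under assumptions: it runs
against an in-memory SQLite database with a simplified two-table schema
(not nova's real schema or database), and purge_compute_node, the host
names and the sample rows are all hypothetical.

```python
import sqlite3


def purge_compute_node(conn, host):
    """Delete a host's pci_devices rows, then its compute_nodes rows.

    Returns the number of compute_nodes rows removed. Both deletes run
    in one transaction, so a failure leaves the tables untouched.
    """
    with conn:  # commits on success, rolls back on exception
        node_ids = [row[0] for row in conn.execute(
            "SELECT id FROM compute_nodes WHERE host = ?", (host,))]
        for node_id in node_ids:
            # Child rows first, so the foreign key no longer blocks us.
            conn.execute(
                "DELETE FROM pci_devices WHERE compute_node_id = ?",
                (node_id,))
        cur = conn.execute(
            "DELETE FROM compute_nodes WHERE host = ?", (host,))
        return cur.rowcount


# Demo on a throwaway database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE compute_nodes (id INTEGER PRIMARY KEY, host TEXT);
    CREATE TABLE pci_devices (
        id INTEGER PRIMARY KEY,
        compute_node_id INTEGER NOT NULL REFERENCES compute_nodes(id)
    );
    INSERT INTO compute_nodes (id, host) VALUES (43, 'compute-1');
    INSERT INTO compute_nodes (id, host) VALUES (44, 'compute-2');
    INSERT INTO pci_devices (compute_node_id) VALUES (43);
    INSERT INTO pci_devices (compute_node_id) VALUES (43);
    INSERT INTO pci_devices (compute_node_id) VALUES (44);
""")

removed = purge_compute_node(conn, "compute-1")
remaining_nodes = conn.execute("SELECT host FROM compute_nodes").fetchall()
remaining_devices = conn.execute(
    "SELECT COUNT(*) FROM pci_devices").fetchone()[0]
```

Only the rows belonging to the named host are removed; the other
node and its device are left alone.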

After that, restarting nova-compute and running discover_hosts behaves
as expected, and everything goes back to normal.

Please note that this happened to us on a newly deployed cluster
running Caracal. We had to reinstall the compute node because of a
hardware issue, and this led to the above.

Cheers,

Thomas Goirand (zigo)

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2077070

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2077070/+subscriptions