yahoo-eng-team team mailing list archive

[Bug 1661258] [NEW] Deleted ironic node has an inventory in nova_api database

 

Public bug reported:

Running the latest devstack, ironic and nova, I get the following error
when I request an instance:

| fault                                | {"message": "Node 6cc8803d-4e77-4948-b653-663d8d5e52b7 could not be found. (HTTP 404)", "code": 500, "details": "  File \"/opt/stack/nova/nova/compute/manager.py\", line 1780, in _do_build_and_run_instance |
|                                      |     filter_properties)                                                                                                                                                                                        |
|                                      |   File \"/opt/stack/nova/nova/compute/manager.py\", line 2016, in _build_and_run_instance                                                                                                                     |
|                                      |     instance_uuid=instance.uuid, reason=six.text_type(e))                                                                                                                                                     |
|                                      | ", "created": "2017-02-02T13:42:01Z"}                                                                                                                                                                         |

On the ironic side, this node was indeed deleted. It is also marked as
deleted in the nova.compute_nodes table:

| created_at          | updated_at          | deleted_at          | id | service_id | vcpus | memory_mb | local_gb | vcpus_used | memory_mb_used | local_gb_used | hypervisor_type | hypervisor_version | cpu_info | disk_available_least | free_ram_mb | free_disk_gb | current_workload | running_vms | hypervisor_hostname                  | deleted | host_ip        | supported_instances              | pci_stats                                                                                                                                                                         | metrics | extra_resources | stats                  | numa_topology | host   | ram_allocation_ratio | cpu_allocation_ratio | uuid                                 | disk_allocation_ratio |
...................................................
| 2017-02-02 12:20:27 | 2017-02-02 13:20:15 | 2017-02-02 13:21:15 |  2 |       NULL |     1 |      1536 |       10 |          0 |              0 |             0 | ironic          |                  1 |          |                   10 |        1536 |           10 |                0 |           0 | 6cc8803d-4e77-4948-b653-663d8d5e52b7 |       2 | 192.168.122.22 | [["x86_64", "baremetal", "hvm"]] | {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"} | []      | NULL            | {"cpu_arch": "x86_64"} | NULL          | ubuntu |                    1 |                    0 | 035be695-0797-44b3-930b-42349e40579e |                     0 |

But its inventory is still present in nova_api.inventories:

| created_at          | updated_at | id | resource_provider_id | resource_class_id | total | reserved | min_unit | max_unit | step_size | allocation_ratio |
..........................
| 2017-02-02 13:20:14 | NULL       | 13 |                    2 |                 0 |     1 |        0 |        1 |        1 |         1 |               16 |
| 2017-02-02 13:20:14 | NULL       | 14 |                    2 |                 1 |  1536 |        0 |        1 |     1536 |         1 |                1 |
| 2017-02-02 13:20:14 | NULL       | 15 |                    2 |                 2 |    10 |        0 |        1 |       10 |         1 |                1 |

The matching row in nova_api.resource_providers:

| created_at          | updated_at          | id | uuid                                 | name                                 | generation | can_host |
.........................
| 2017-02-02 12:20:27 | 2017-02-02 13:20:14 |  2 | 035be695-0797-44b3-930b-42349e40579e | 6cc8803d-4e77-4948-b653-663d8d5e52b7 |          7 |        0 |

Waiting for a resource tracker run did not help: the node has been
deleted for ~30 minutes already and the inventory is still there.
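
The check behind the tables above can be sketched as a join between the
three tables. This is only an illustration under stated assumptions: it
loads the rows shown above into one in-memory SQLite database with
trimmed-down columns, whereas in a real deployment nova and nova_api are
separate databases. The function names are hypothetical and not part of
nova.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database holding the rows quoted in this report.

    Assumption: simplified schemas, and both databases merged into one.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE compute_nodes (
            id INTEGER, uuid TEXT, hypervisor_hostname TEXT, deleted INTEGER);
        CREATE TABLE resource_providers (id INTEGER, uuid TEXT, name TEXT);
        CREATE TABLE inventories (
            id INTEGER, resource_provider_id INTEGER,
            resource_class_id INTEGER, total INTEGER);

        INSERT INTO compute_nodes VALUES (
            2, '035be695-0797-44b3-930b-42349e40579e',
            '6cc8803d-4e77-4948-b653-663d8d5e52b7', 2);
        INSERT INTO resource_providers VALUES (
            2, '035be695-0797-44b3-930b-42349e40579e',
            '6cc8803d-4e77-4948-b653-663d8d5e52b7');
        INSERT INTO inventories VALUES (13, 2, 0, 1);
        INSERT INTO inventories VALUES (14, 2, 1, 1536);
        INSERT INTO inventories VALUES (15, 2, 2, 10);
        """
    )
    return conn


def find_orphaned_inventories(conn):
    """Return (provider_uuid, provider_name, inventory_id) for every
    inventory whose compute node row is soft-deleted (deleted != 0)."""
    query = """
        SELECT rp.uuid, rp.name, inv.id
        FROM inventories AS inv
        JOIN resource_providers AS rp ON rp.id = inv.resource_provider_id
        JOIN compute_nodes AS cn ON cn.uuid = rp.uuid
        WHERE cn.deleted != 0
        ORDER BY inv.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in find_orphaned_inventories(build_example_db()):
        print(row)
```

With the rows from this report, the query returns the three inventory
records (ids 13, 14 and 15) that still point at the deleted node's
resource provider.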

Code versions:
Devstack commit debc695ddfc8b7b2aeb53c01c624e15f69ed9fa2 Updated from generate-devstack-plugins-list.
Nova commit 5dad7eaef7f8562425cce6b233aed610ca2d3148 Merge "doc: update the man page entry for nova-manage db sync"
Ironic commit 5071b99835143ebcae876432e2982fd27faece10 Merge "Remove deprecated heartbeat policy check"

In case it is relevant: I run two nova-compute services on the same
host. I set host=test for the second one; otherwise the configs are
identical. I was trying to reproduce another cell-related issue and was
creating and deleting ironic nodes so that the hash_ring would map them
to the second nova-compute.
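
To illustrate what "map to the second nova-compute by the hash_ring"
means, here is a minimal, generic consistent-hash ring. It is not nova's
or ironic's actual implementation: the class name, the replica count and
the use of SHA-256 are assumptions made for the sketch. The host names
"ubuntu" and "test" are the two compute hosts from this report.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping node UUIDs to compute hosts."""

    def __init__(self, hosts, replicas=32):
        # Each host is placed on the ring several times so that nodes
        # spread more evenly between hosts.
        self._ring = sorted(
            (self._hash("%s-%d" % (host, i)), host)
            for host in hosts
            for i in range(replicas)
        )
        self._keys = [key for key, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def get_host(self, node_uuid):
        """Return the host owning the first ring position after the node."""
        if not self._ring:
            raise ValueError("hash ring has no hosts")
        index = bisect.bisect(self._keys, self._hash(node_uuid))
        return self._ring[index % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["ubuntu", "test"])
    print(ring.get_host("6cc8803d-4e77-4948-b653-663d8d5e52b7"))
```

Which host a given node lands on depends only on the hash of its UUID
and the set of hosts, which is why I kept creating and deleting nodes
until one was assigned to the second service.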

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: ironic placement resource-tracker

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1661258

Title:
  Deleted ironic node has an inventory in nova_api database

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1661258/+subscriptions

