
yahoo-eng-team team mailing list archive

[Bug 1710141] Re: Continual warnings in n-cpu logs about being unable to delete inventory for an ironic node with an instance on it

 

One question: why don't we report inventory for an ACTIVE node? If the
inventory total is 1 and an instance already holds an allocation for
that 1 unit of whatever resource class, isn't that sufficient? In other
words, if an instance is consuming all of the node's inventory, that
alone should take the node out of scheduling decisions for new
instances, which is also how things work for regular compute nodes
hosting VMs.
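
To illustrate the point, here is a minimal sketch of the capacity
check, assuming the usual placement rule that usable capacity is
(total - reserved) * allocation_ratio. The Inventory class and its
method names are made up for this example and are not nova or
placement code.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical model of one resource class on one provider."""
    total: int
    reserved: int = 0
    allocation_ratio: float = 1.0
    used: int = 0

    def capacity(self) -> int:
        # Usable capacity after reservations and overcommit.
        return int((self.total - self.reserved) * self.allocation_ratio)

    def can_fit(self, requested: int) -> bool:
        # A request fits only if it does not push usage past capacity.
        return self.used + requested <= self.capacity()


# An ironic node exposing one unit of its resource class.
free_node = Inventory(total=1)
# The same node once an instance holds an allocation for that unit.
active_node = Inventory(total=1, used=1)

print(free_node.can_fit(1))    # True
print(active_node.can_fit(1))  # False
```

Under that rule the allocation by itself already excludes the node
from new builds, without the inventory having to be removed.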

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1710141

Title:
  Continual warnings in n-cpu logs about being unable to delete
  inventory for an ironic node with an instance on it

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) ocata series:
  New

Bug description:
  Seen here:

  http://logs.openstack.org/54/487954/12/check/gate-tempest-dsvm-ironic-ipa-wholedisk-bios-agent_ipmitool-tinyipa-ubuntu-xenial-nv/041c03a/logs/screen-n-cpu.txt.gz#_Aug_09_19_31_21_450705

  Aug 09 19:31:21.450705 ubuntu-xenial-internap-mtl01-10351013 nova-compute[19132]: WARNING nova.scheduler.client.report [None req-9db22a6d-e88a-42b0-879e-8fe523dcc664 None None] [req-2eead243-5e63-4dd0-a208-4ceed95478ff] We cannot delete inventory 'VCPU, MEMORY_MB, DISK_GB' for resource provider 38b274b2-2e37-4c23-ad6f-d86c1f0a0e3f because the inventory is in use.

  As soon as an ironic node has an instance built on it, the node state
  is ACTIVE, which means that this method returns True:

  https://github.com/openstack/nova/blob/c2d33c3271370358d48553233b41bf9119d834fb/nova/virt/ironic/driver.py#L176

  That marks the node as unavailable, presumably because it is wholly
  consumed.

  That's used here:

  https://github.com/openstack/nova/blob/c2d33c3271370358d48553233b41bf9119d834fb/nova/virt/ironic/driver.py#L324

  And that's checked here when reporting inventory to the resource
  tracker:

  https://github.com/openstack/nova/blob/c2d33c3271370358d48553233b41bf9119d834fb/nova/virt/ironic/driver.py#L741

  The report client then tries to delete the inventory for the node's
  resource provider in placement, which fails because an instance
  running on the node is already consuming that inventory:

  http://logs.openstack.org/54/487954/12/check/gate-tempest-dsvm-ironic-ipa-wholedisk-bios-agent_ipmitool-tinyipa-ubuntu-xenial-nv/041c03a/logs/screen-n-cpu.txt.gz#_Aug_09_19_31_21_450705

  Aug 09 19:31:21.391146 ubuntu-xenial-internap-mtl01-10351013 nova-compute[19132]: INFO nova.scheduler.client.report [None req-9db22a6d-e88a-42b0-879e-8fe523dcc664 None None] Compute node 38b274b2-2e37-4c23-ad6f-d86c1f0a0e3f reported no inventory but previous inventory was detected. Deleting existing inventory records.
  Aug 09 19:31:21.450705 ubuntu-xenial-internap-mtl01-10351013 nova-compute[19132]: WARNING nova.scheduler.client.report [None req-9db22a6d-e88a-42b0-879e-8fe523dcc664 None None] [req-2eead243-5e63-4dd0-a208-4ceed95478ff] We cannot delete inventory 'VCPU, MEMORY_MB, DISK_GB' for resource provider 38b274b2-2e37-4c23-ad6f-d86c1f0a0e3f because the inventory is in use.
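
The sequence described above can be reproduced with a small
self-contained sketch. Everything here is a hypothetical stand-in
(FakePlacement, InventoryInUse, the node dict fields, the returned
messages); it only mimics the behaviour described in this report and
is not the actual nova or placement code.

```python
class InventoryInUse(Exception):
    """Hypothetical error: placement refuses to drop allocated inventory."""


class FakePlacement:
    """Tiny in-memory stand-in for placement; not the real API."""

    def __init__(self):
        self.inventory = {}     # provider uuid -> {resource class: total}
        self.allocated = set()  # provider uuids with allocations

    def set_inventory(self, rp_uuid, inv):
        self.inventory[rp_uuid] = dict(inv)

    def delete_inventory(self, rp_uuid):
        if rp_uuid in self.allocated:
            raise InventoryInUse(rp_uuid)
        self.inventory.pop(rp_uuid, None)


def get_inventory(node):
    # Mimics the reported behaviour: an ACTIVE node is treated as
    # unavailable, so the driver reports no inventory for it.
    if node["state"] == "ACTIVE":
        return {}
    return {"VCPU": node["cpus"],
            "MEMORY_MB": node["memory_mb"],
            "DISK_GB": node["local_gb"]}


def report(placement, rp_uuid, inv):
    # Empty inventory plus existing records triggers a delete attempt.
    if inv:
        placement.set_inventory(rp_uuid, inv)
        return "inventory updated"
    if rp_uuid not in placement.inventory:
        return "nothing to do"
    try:
        placement.delete_inventory(rp_uuid)
    except InventoryInUse:
        return "cannot delete inventory: in use"
    return "inventory deleted"


placement = FakePlacement()
node = {"uuid": "node-1", "state": "AVAILABLE",
        "cpus": 1, "memory_mb": 1024, "local_gb": 10}
first = report(placement, node["uuid"], get_inventory(node))

# An instance is built on the node: allocations exist, state is ACTIVE.
placement.allocated.add(node["uuid"])
node["state"] = "ACTIVE"

# Every later periodic run repeats the same failed delete.
results = [report(placement, node["uuid"], get_inventory(node))
           for _ in range(3)]
print(first)
print(results)
```

In this sketch the warning recurs on every run for as long as the
instance exists, which matches the continual warnings in the log.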

  This is also bad because if the node was updated with a
  resource_class, that resource class won't be automatically created in
  Placement here:

  https://github.com/openstack/nova/blob/c2d33c3271370358d48553233b41bf9119d834fb/nova/scheduler/client/report.py#L789

  It is not created because the driver didn't report it from the
  get_inventory method.
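
A rough sketch of why the resource class never shows up, assuming
custom classes are registered from the keys of whatever inventory the
driver reports. The function ensure_resource_classes and the class
name CUSTOM_GOLD are invented for illustration; only the CUSTOM_
prefix is the real placement naming convention.

```python
def ensure_resource_classes(known_classes, inventory):
    """Register any CUSTOM_* class named in the reported inventory.

    Hypothetical helper: returns the names it had to create.
    """
    created = []
    for name in inventory:
        if name.startswith("CUSTOM_") and name not in known_classes:
            known_classes.add(name)
            created.append(name)
    return created


known = {"VCPU", "MEMORY_MB", "DISK_GB"}

# A node that reports inventory gets its custom class created.
created_when_reported = ensure_resource_classes(known, {"CUSTOM_GOLD": 1})

# An ACTIVE node reports nothing, so there is nothing to create from.
known_for_active = {"VCPU", "MEMORY_MB", "DISK_GB"}
created_when_empty = ensure_resource_classes(known_for_active, {})

print(created_when_reported)
print(created_when_empty)
```

With an empty report there are no keys to iterate over, so a
resource_class set on a node that already hosts an instance is never
seen.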

  That in turn affects this change, which migrates
  instance.flavor.extra_specs to carry custom resource class overrides
  for instances on ironic nodes that now have a resource_class set:

  https://review.openstack.org/#/c/487954/

  So we've got a bit of a chicken and egg problem here.

  Manually testing the ironic flavor migration code hits this problem,
  as seen here:

  http://paste.openstack.org/show/618160/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1710141/+subscriptions

