yahoo-eng-team team mailing list archive

[Bug 1841476] [NEW] Spurious ComputeHostNotFound warnings in nova-compute logs during ironic node re-balance

 

Public bug reported:

Seen here:

https://d01b2e57f0a56cb7edf0-b6bc206936c08bb07a5f77cfa916a2d4.ssl.cf5.rackcdn.com/678298/4/check/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/92c65ac/compute1/logs/screen-n-cpu.txt.gz

A warning reports that a compute node could not be found by host and
node, but the node is then found by nodename alone and moved to the
current host:

Aug 26 18:41:38.800657 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: WARNING nova.compute.resource_tracker [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] No compute node record for ubuntu-bionic-rax-ord-0010443319:61dbc9c7-828b-4c42-b19c-a3716037965f: ComputeHostNotFound_Remote: Compute host ubuntu-bionic-rax-ord-0010443319 could not be found.

Aug 26 18:41:38.818412 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: INFO nova.compute.resource_tracker [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] ComputeNode 61dbc9c7-828b-4c42-b19c-a3716037965f moving from ubuntu-bionic-rax-ord-0010443317 to ubuntu-bionic-rax-ord-0010443319

The warning comes from this call:

https://github.com/openstack/nova/blob/71478c3eedd95e2eeb219f47460603221ee249b9/nova/compute/resource_tracker.py#L554

And the re-balance is found here:

https://github.com/openstack/nova/blob/71478c3eedd95e2eeb219f47460603221ee249b9/nova/compute/resource_tracker.py#L561
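
As a rough illustration of the sequence behind those two log lines, here
is a self-contained Python sketch. It is not nova's code: the in-memory
record list and the helper names (find_by_host_and_node, find_by_node,
init_compute_node) are hypothetical stand-ins. The sketch assumes the
lookup by host and node fails first, and that a lookup by nodename alone
then finds exactly one record, still assigned to the old host.

```python
import logging

LOG = logging.getLogger("resource_tracker_sketch")

# Hypothetical in-memory stand-in for the compute node records.
RECORDS = [
    {"host": "ubuntu-bionic-rax-ord-0010443317",
     "node": "61dbc9c7-828b-4c42-b19c-a3716037965f"},
]


def find_by_host_and_node(records, host, node):
    """Return the record matching both host and node, or None."""
    for rec in records:
        if rec["host"] == host and rec["node"] == node:
            return rec
    return None


def find_by_node(records, node):
    """Return every record for the node, regardless of host."""
    return [rec for rec in records if rec["node"] == node]


def init_compute_node(records, host, node):
    """Try the exact lookup first, then fall back to a re-balance."""
    rec = find_by_host_and_node(records, host, node)
    if rec is not None:
        return rec
    # This is the step that emits the misleading warning.
    LOG.warning("No compute node record for %s:%s", host, node)
    candidates = find_by_node(records, node)
    if len(candidates) == 1:
        rec = candidates[0]
        LOG.info("ComputeNode %s moving from %s to %s",
                 node, rec["host"], host)
        rec["host"] = host  # the node now belongs to the current host
        return rec
    return None
```

In this toy model the warning is always followed by a successful
re-balance whenever exactly one record exists for the node under another
host, which is the pattern seen in the log above.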

The warning is therefore a red herring. We could:

1. Add a note to the warning message saying this could be due to a
re-balance, although that might be confusing for non-ironic computes.

and/or

2. Check self.driver.rebalances_nodes and, if it is True, log the
message at info level instead of warning (and potentially add the
re-balance wording from #1 above).
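
A minimal sketch of what option 2 might look like, in Python. Only the
rebalances_nodes attribute name comes from the report above; FakeDriver,
missing_record_log, log_missing_record and the exact message wording are
hypothetical, and this is not a patch against nova.

```python
import logging

LOG = logging.getLogger("resource_tracker_sketch")


class FakeDriver:
    """Hypothetical stand-in for a virt driver object."""

    def __init__(self, rebalances_nodes):
        self.rebalances_nodes = rebalances_nodes


def missing_record_log(driver, host, node):
    """Return (level, message) for a failed host-and-node lookup."""
    msg = "No compute node record for %s:%s" % (host, node)
    if driver.rebalances_nodes:
        # Drivers that re-balance nodes can expect this case, so
        # report it at info level with the extra wording from option 1.
        return logging.INFO, msg + "; this may be due to a node re-balance"
    return logging.WARNING, msg


def log_missing_record(driver, host, node):
    """Log the message at the chosen level and return that level."""
    level, msg = missing_record_log(driver, host, node)
    LOG.log(level, msg)
    return level
```

Keeping the level decision in a separate function makes it easy to check
that drivers which do not re-balance nodes still get the warning
unchanged.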

** Affects: nova
     Importance: Low
         Status: Triaged


** Tags: ironic resource-tracker serviceability

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1841476
