yahoo-eng-team team mailing list archive

[Bug 1759316] [NEW] pre-cells_v2 nova-osapi_compute service in database breaks instance lookup

 

Public bug reported:

This was encountered on Ocata after an upgrade from Newton, but to the
best of my knowledge it also affects master.

During our upgrade from Newton to Ocata, after we finished the cells_v2
migration and mapped instances accordingly, `nova show $uuid` no longer
worked and returned the error:

{"itemNotFound": {"message": "Instance 0e1e6038-bc69-4a85-b4cc-779e3b1d367a could not be found.", "code": 404}}

After much probing, and with a complete lack of logs or warnings, I
discovered that the 'nova-osapi_compute' service was reporting a
different 'host' and that there were duplicate entries for the same box
(one using the IP address, the other using the hostname of the box). The
older entries still had version < 15. [0]

With a service entry at version less than 15 and cells_v2 in use, the
instance lookup will not work, since it never reaches the code path
needed to talk to cells_v2. [1]
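
To illustrate why a single stale row is enough to break the lookup, here is a minimal sketch of a minimum-version gate. It is not the nova code linked in [1]; `ServiceRecord`, `minimum_version` and `uses_cells_v2_lookup` are hypothetical names, and the version numbers in the sample data are illustrative only. The threshold of 15 is the one mentioned above.

```python
from dataclasses import dataclass

# Threshold taken from the report; everything else here is a hypothetical stand-in.
CELLS_V2_MIN_VERSION = 15


@dataclass
class ServiceRecord:
    """Simplified stand-in for one row of the services table."""
    host: str
    binary: str
    version: int


def minimum_version(records, binary):
    """Return the lowest version recorded for `binary`, or 0 if none exist."""
    versions = [r.version for r in records if r.binary == binary]
    return min(versions) if versions else 0


def uses_cells_v2_lookup(records):
    """The gate: the cells_v2 path is taken only if every API entry is new enough."""
    return minimum_version(records, "nova-osapi_compute") >= CELLS_V2_MIN_VERSION


# Two entries for the same box: one keyed by IP (stale), one by hostname (current).
records = [
    ServiceRecord(host="10.0.0.5", binary="nova-osapi_compute", version=9),
    ServiceRecord(host="api01", binary="nova-osapi_compute", version=16),
]

gate_with_stale = uses_cells_v2_lookup(records)
gate_after_cleanup = uses_cells_v2_lookup([r for r in records if r.version >= 15])
print(gate_with_stale, gate_after_cleanup)
```

Because the gate looks at the minimum across all entries, the stale row keyed by IP address drags the result below the threshold even though the live service is up to date.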

The solution was to delete the old service entries (service delete).

My suggestion moving forward is to do one or more of the following:
 * place a WARN in the linked nova code [1]
 * add a check to `nova-status upgrade check` to look for old service entries
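
As a rough sketch of what either suggestion might look like, the snippet below scans service rows for old 'nova-osapi_compute' entries and logs a warning for each. It does not use nova's real objects or the actual `nova-status` framework; `find_stale_services`, `check_services` and the sample rows are assumptions made for illustration.

```python
import logging

LOG = logging.getLogger("stale_service_check")


def find_stale_services(rows, binary="nova-osapi_compute", min_version=15):
    """Return the rows for `binary` whose recorded version is below `min_version`."""
    return [row for row in rows
            if row["binary"] == binary and row["version"] < min_version]


def check_services(rows):
    """Warn about each stale entry; return True only if none were found."""
    stale = find_stale_services(rows)
    for row in stale:
        LOG.warning(
            "Service %s on host %s has version %d (< 15); "
            "instance lookups may fail until it is deleted.",
            row["binary"], row["host"], row["version"])
    return not stale


# Hypothetical sample data; versions are illustrative only.
rows = [
    {"host": "10.0.0.5", "binary": "nova-osapi_compute", "version": 9},
    {"host": "api01", "binary": "nova-osapi_compute", "version": 16},
    {"host": "compute01", "binary": "nova-compute", "version": 16},
]

stale_hosts = [row["host"] for row in find_stale_services(rows)]
print(stale_hosts)
```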

[0] http://paste.openstack.org/show/715421/
[1] https://github.com/openstack/nova/blob/ed55dcad83d5db2fa7e43fc3d5465df1550b554c/nova/compute/api.py#L2263-L2270

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1759316

Title:
  pre-cells_v2 nova-osapi_compute service in database breaks instance
  lookup

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1759316/+subscriptions

