
yahoo-eng-team team mailing list archive

[Bug 1759316] Re: pre-cells_v2 nova-osapi_compute service in database breaks instance lookup

 

The code in [1] was added in Newton. I think we'd be OK adding a
warning in that code, at least as a breadcrumb, when you're not using
cells v1 and the minimum osapi_compute service version is < 15. We
could backport that to queens, pike and ocata.
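
As a rough illustration of that breadcrumb, the sketch below logs a
warning when cells v1 is disabled and the minimum osapi_compute service
version is below 15. The function name, its parameters and the logger
name are hypothetical; this is not the code in [1], only the shape of
the check under those assumptions.

```python
import logging

# Hypothetical logger name, used only for this sketch.
LOG = logging.getLogger("nova.compute.api.sketch")

# Minimum nova-osapi_compute service version discussed in this bug.
CELLS_V2_MIN_API_VERSION = 15


def warn_if_old_api_service(min_api_version, cells_v1_enabled):
    """Return True (and log a warning) if the minimum version is too old.

    Both arguments are assumed to be supplied by the caller; how they
    are obtained is outside the scope of this sketch.
    """
    if cells_v1_enabled:
        # The proposed warning only applies when cells v1 is not in use.
        return False
    if min_api_version < CELLS_V2_MIN_API_VERSION:
        LOG.warning(
            "Minimum nova-osapi_compute service version is %d (< %d); "
            "instance lookups will not reach the cells v2 code path. "
            "Check for stale service records in the database.",
            min_api_version, CELLS_V2_MIN_API_VERSION)
        return True
    return False
```

The boolean return value is only there to make the sketch easy to
exercise; the breadcrumb itself is the log line.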

For nova-status, we'd likely add a check that queries the minimum
nova-osapi_compute service version across all cells (API services
should really only be in one cell though) and emits a warning if it is
< 15. The catch with that check: suppose you had older
nova-osapi_compute services in your nova (cell) database from before
upgrading to ocata, where cells v2 was required, and you then
reconfigured the API to point [database]/connection at the nova_cell0
database and created a new 'current' service version. The cross-cell
minimum version check would then warn about a cell table entry you
don't actually care about. I think the resolution would just be to
delete that entry, though. Alternatively, nova-status could skip
looking across cells and just rely on [database]/connection being set
(or at least look in cell0).
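
To make the trade-off concrete, here is a minimal sketch of the
cross-cell minimum version check. It uses in-memory SQLite databases as
stand-ins for cell databases, with a simplified services table; the
helper names, the table layout and the OK/WARNING strings are
assumptions for illustration, not nova-status code.

```python
import sqlite3

MIN_API_VERSION = 15  # threshold discussed in this bug


def make_cell_db(rows):
    """Build an in-memory stand-in for one cell database (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE services (host TEXT, "binary" TEXT, '
        'version INTEGER, deleted INTEGER DEFAULT 0)')
    conn.executemany(
        'INSERT INTO services (host, "binary", version) VALUES (?, ?, ?)',
        rows)
    return conn


def min_api_version(conn):
    """Minimum version of non-deleted nova-osapi_compute rows, or None."""
    row = conn.execute(
        'SELECT MIN(version) FROM services '
        'WHERE "binary" = ? AND deleted = 0',
        ("nova-osapi_compute",)).fetchone()
    return row[0]


def check_cells(cells):
    """Return (status, names of cells below the threshold).

    `cells` maps a cell name to a database connection.
    """
    stale = []
    for name, conn in cells.items():
        version = min_api_version(conn)
        if version is not None and version < MIN_API_VERSION:
            stale.append(name)
    return ("WARNING" if stale else "OK", sorted(stale))
```

With one stand-in database holding only a current API service row and
another holding a leftover row below 15, passing both to check_cells()
yields a WARNING naming the second cell, while passing only the first
yields OK. That is the false positive described above, and the reason
restricting the check to the configured database (or cell0) is an
option.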

** Tags added: api cells upgrade

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Changed in: nova/ocata
       Status: New => Confirmed

** Changed in: nova/pike
       Status: New => Confirmed

** Changed in: nova/queens
       Status: New => Confirmed

** Changed in: nova/ocata
   Importance: Undecided => Medium

** Changed in: nova/pike
   Importance: Undecided => Medium

** Changed in: nova/queens
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1759316

Title:
  pre-cells_v2 nova-osapi_compute service in database breaks instance
  lookup

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed

Bug description:
  This was encountered on Ocata after an upgrade from Newton, but
  affects master to the best of my knowledge.

  During our upgrade from Newton -> Ocata, after we finished the
  cells_v2 migration and mapped instances accordingly, `nova show $uuid`
  no longer worked, returning the error:

  {"itemNotFound": {"message": "Instance 0e1e6038-bc69-4a85-b4cc-779e3b1d367a could not be found.", "code": 404}}

  After much probing, and with a complete lack of logs/warnings, I
  discovered that the 'nova-osapi_compute' service was reporting a
  different 'host' and there were duplicate entries for the same box
  (one using the IP address, the other using the hostname of the box).
  The older entries still had version < 15. [0]

  With version less than 15 and cells_v2, the instance lookup will not
  work since it never reaches the code path needed to talk to cells_v2
  things. [1]

  The solution was to delete the old service entries (service delete).
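
  A small sketch of how the stale entries could be picked out before
  deleting them. The record layout (dicts with 'id', 'host', 'binary'
  and 'version' keys), the function name and the example rows are all
  made up for illustration; they are not the contents of [0].

```python
API_BINARY = "nova-osapi_compute"
MIN_API_VERSION = 15  # threshold from this report


def stale_api_services(services):
    """Return nova-osapi_compute records whose version is below 15.

    `services` is assumed to be a list of dicts with 'id', 'host',
    'binary' and 'version' keys (a hypothetical export of service rows).
    """
    return [
        svc for svc in services
        if svc["binary"] == API_BINARY and svc["version"] < MIN_API_VERSION
    ]


# Invented example rows: the same box registered once by IP address
# (old entry) and once by hostname (current entry), plus a compute row.
EXAMPLE_SERVICES = [
    {"id": 1, "host": "192.0.2.10", "binary": API_BINARY, "version": 9},
    {"id": 2, "host": "api-node-1", "binary": API_BINARY, "version": 16},
    {"id": 3, "host": "cmp-node-1", "binary": "nova-compute", "version": 9},
]
```

  Applied to the example rows, only the entry registered by IP address
  is returned; its id is the one that would be passed to the service
  delete.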

  My suggestion moving forward is to do one or more of the following:
   * place a WARN in the linked nova code [1]
   * add a check to `nova-status upgrade check` to look for old service entries

  [0] http://paste.openstack.org/show/715421/
  [1] https://github.com/openstack/nova/blob/ed55dcad83d5db2fa7e43fc3d5465df1550b554c/nova/compute/api.py#L2263-L2270

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1759316/+subscriptions

