
yahoo-eng-team team mailing list archive

[Bug 1778305] [NEW] Nova may erroneously look up service version of a deleted service, when hostname has been reused

 

Public bug reported:

Prerequisites:

- A compute node running an old version of nova has been deleted. (In our case, version 9.)
- The hostname of that compute node has been reused, and the new node has been upgraded as normal. (To version 16.)
- The services table in the nova database contains rows for both the old and the new node. The deleted one is clearly marked as deleted and has the old version in its version column. The new node also exists, upgraded.
- At least one instance is running on the upgraded node.
- Perform an upgrade from Ocata to Pike.
- Any project with instances running on the upgraded node may erroneously get the error message "ERROR (BadRequest): This service is older (v9) than the minimum (v16) version of the rest of the deployment. Unable to continue. (HTTP 400) (Request-ID: req-3e0ababe-e09b-4ef8-ba3a-43060bc1f807)" when running 'nova list'.


Example of how this may look in the database:

MariaDB [nova]> SELECT * FROM services WHERE host = 'node11.acme.org';
+---------------------+---------------------+---------------------+-----+-----------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+--------------------------------------+
| created_at          | updated_at          | deleted_at          | id  | host            | binary       | topic   | report_count | disabled | deleted | disabled_reason | last_seen_up        | forced_down | version | uuid                                 |
+---------------------+---------------------+---------------------+-----+-----------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+--------------------------------------+
| 2017-10-17 13:06:10 | 2018-06-22 21:42:42 | NULL                | 179 | node11.acme.org | nova-compute | compute |      2138069 |        0 |       0 | NULL            | 2018-06-22 21:42:42 |           0 |      22 | 63e1cb55-ee00-4cb8-b304-160dd5c45fdd |
| 2016-08-13 08:20:05 | 2016-11-15 00:01:21 | 2016-11-27 15:11:30 | 104 | node11.acme.org | nova-compute | compute |       796220 |        1 |     104 | NULL            | 2016-11-15 00:01:21 |           0 |       9 | NULL                                 |
+---------------------+---------------------+---------------------+-----+-----------------+--------------+---------+--------------+----------+---------+-----------------+---------------------+-------------+---------+--------------------------------------+
2 rows in set (0.01 sec)
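
To illustrate why the deleted row matters, here is a minimal sketch using
Python's sqlite3 module. It is not nova's actual code. The reduced table
schema and the min_version helper are assumptions made up for this example.
It only shows that a version lookup by hostname which does not filter on the
deleted column picks up the stale version 9 row from the output above.

```python
import sqlite3

# Reduced, hypothetical copy of the services table with the two example rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE services (
        id INTEGER PRIMARY KEY,
        host TEXT,
        "binary" TEXT,
        deleted INTEGER,
        version INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?, ?, ?)",
    [
        (179, "node11.acme.org", "nova-compute", 0, 22),   # live service
        (104, "node11.acme.org", "nova-compute", 104, 9),  # soft-deleted service
    ],
)


def min_version(conn, host, include_deleted):
    """Hypothetical helper: lowest service version recorded for a host."""
    sql = "SELECT MIN(version) FROM services WHERE host = ?"
    if not include_deleted:
        # In the example data, live rows have deleted = 0.
        sql += " AND deleted = 0"
    return conn.execute(sql, (host,)).fetchone()[0]


stale = min_version(conn, "node11.acme.org", include_deleted=True)
live = min_version(conn, "node11.acme.org", include_deleted=False)
print(stale, live)  # prints "9 22"
```

With the deleted row included the lookup yields 9, which matches the "older
(v9)" part of the error message. With the filter applied, only the live row
is considered.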


Removing the old service from the database is an effective workaround
for this problem.
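
Before applying the workaround it may help to list which rows are affected.
The following is a read-only sketch, again using sqlite3 with a reduced,
assumed schema. The find_stale_services helper is hypothetical and not part
of nova. It lists soft-deleted services that share host and binary with a
live service and carry a lower version. Removing the rows is left to the
operator.

```python
import sqlite3


def find_stale_services(conn):
    """Hypothetical helper: soft-deleted rows shadowing a live, newer row.

    Returns tuples of (host, binary, stale_id, stale_version, live_version).
    """
    return conn.execute(
        """
        SELECT old.host, old."binary", old.id, old.version, new.version
        FROM services AS old
        JOIN services AS new
          ON new.host = old.host AND new."binary" = old."binary"
        WHERE old.deleted != 0      -- soft-deleted row
          AND new.deleted = 0       -- live row with the same host and binary
          AND old.version < new.version
        ORDER BY old.id
        """
    ).fetchall()


# Reduced, hypothetical copy of the services table with the two example rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE services (
        id INTEGER PRIMARY KEY,
        host TEXT,
        "binary" TEXT,
        deleted INTEGER,
        version INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?, ?, ?)",
    [
        (179, "node11.acme.org", "nova-compute", 0, 22),
        (104, "node11.acme.org", "nova-compute", 104, 9),
    ],
)

for row in find_stale_services(conn):
    print(row)  # prints ('node11.acme.org', 'nova-compute', 104, 9, 22)
```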

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1778305


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1778305/+subscriptions

