
group.of.nepali.translators team mailing list archive

[Bug 1692397] Re: hypervisor statistics could be incorrect

 

** Also affects: cloud-archive/mitaka
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/mitaka
       Status: New => Triaged

** Changed in: cloud-archive/mitaka
   Importance: Undecided => Low

** Changed in: nova (Ubuntu Xenial)
       Status: New => Triaged

** Changed in: nova (Ubuntu Xenial)
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of the
Nepali Language Coordinators' Group (नेपाली भाषा समायोजकहरुको समूह), which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1692397

Title:
  hypervisor statistics could be incorrect

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) newton series:
  Fix Committed
Status in OpenStack Compute (nova) ocata series:
  Fix Committed
Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Xenial:
  Triaged

Bug description:
  [Impact]

  If you deploy a nova-compute service to a node, delete that service
  (via the API), and then deploy a new nova-compute service to that same
  node (i.e. same hostname), the database will contain two service
  records: one marked as deleted and one not. So far so good, until you
  run 'openstack hypervisor stats show', at which point the API
  aggregates the resource counts from both services. This has been
  fixed and backported as far back as Newton, so the problem still
  exists on Mitaka. I assume the reason the patch was not backported
  to Mitaka is that the code in
  nova.db.sqlalchemy.api.compute_node_statistics() changed quite a bit.
  However, only a one-line change in the old code (doing the same thing
  as the new code) is needed to fix this issue.
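  The one-line fix described above boils down to excluding soft-deleted
  service records before aggregating. A minimal sketch of the idea (not
  the actual nova code; the record layout and field names here are
  illustrative):

  ```python
  # Illustrative sketch of the bug and its fix: hypervisor stats must skip
  # soft-deleted service records, otherwise a redeployed host is counted twice.
  # Record layout is hypothetical, not nova's real schema.

  def compute_node_statistics(records, include_deleted=False):
      """Aggregate resource counts across service records.

      Each record carries a 'deleted' flag; in nova a soft-deleted row
      keeps its resource data but is marked as deleted.
      """
      live = records if include_deleted else [r for r in records if not r["deleted"]]
      return {
          "count": len(live),
          "vcpus": sum(r["vcpus"] for r in live),
          "memory_mb": sum(r["memory_mb"] for r in live),
      }

  # Same host deployed twice: the old record is soft-deleted, the new one is live.
  records = [
      {"host": "node1", "deleted": True,  "vcpus": 8, "memory_mb": 7960},
      {"host": "node1", "deleted": False, "vcpus": 8, "memory_mb": 7960},
  ]

  buggy = compute_node_statistics(records, include_deleted=True)  # pre-fix behaviour
  fixed = compute_node_statistics(records)                        # post-fix behaviour
  ```

  With the filter applied, only the live record contributes, so the
  counts match what a single deployed host should report.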

  [Test Case]

   * Deploy Mitaka with bundle http://pastebin.ubuntu.com/25968008/

   * Do 'openstack hypervisor stats show' and verify that count is 3

   * Do 'juju remove-unit nova-compute/2' to delete a compute service
  but not its physical host

   * Do 'openstack compute service delete <id>' to delete the compute
  service we just removed (choosing the correct id)

   * Do 'openstack hypervisor stats show' and verify that count is 2

   * Do 'juju add-unit nova-compute --to <machine id of deleted unit>'

   * Do 'openstack hypervisor stats show' and verify that count is 3
  (not 4 as it would be before fix)
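  The "verify that count is N" steps can be scripted by parsing the
  count out of the table the client prints. A rough helper, assuming
  the table format shown in the transcript further down (the sample
  output here is made up for illustration):

  ```python
  import re

  def parse_stat(output, prop):
      """Extract an integer Value for a Property row from the table printed
      by 'nova hypervisor-stats' / 'openstack hypervisor stats show'."""
      m = re.search(r"\|\s*%s\s*\|\s*(\d+)\s*\|" % re.escape(prop), output)
      if m is None:
          raise ValueError("property %r not found" % prop)
      return int(m.group(1))

  # Sample table in the same shape as the transcript below.
  sample = """\
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 3     |
  | vcpus                | 24    |
  +----------------------+-------+
  """
  ```

  In a test script this would wrap the output of the CLI call at each
  verification step and compare against the expected count.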

  [Regression Potential]

  None anticipated, other than for clients that were interpreting the
  invalid counts as correct.

  [Other Info]
   
  ===========================================================================

  Hypervisor statistics could be incorrect:

  When we kill a nova-compute service, delete the service from the nova
  DB, and then start the nova-compute service again, the result of the
  Hypervisor/statistics API (nova hypervisor-stats) will be incorrect.

  How to reproduce:

  Step1. Check the correct statistics before we do anything:
  root@SZX1000291919:/opt/stack/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 1     |
  | current_workload     | 0     |
  | disk_available_least | 14    |
  | free_disk_gb         | 34    |
  | free_ram_mb          | 6936  |
  | local_gb             | 35    |
  | local_gb_used        | 1     |
  | memory_mb            | 7960  |
  | memory_mb_used       | 1024  |
  | running_vms          | 1     |
  | vcpus                | 8     |
  | vcpus_used           | 1     |
  +----------------------+-------+

  Step2. Kill the compute service:
  root@SZX1000291919:/var/log/nova# ps -ef | grep nova-com
  root     120419 120411  0 11:06 pts/27   00:00:00 sg libvirtd /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log
  root     120420 120419  0 11:06 pts/27   00:00:07 /usr/bin/python /usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log

  root@SZX1000291919:/var/log/nova# kill -9 120419
  root@SZX1000291919:/var/log/nova# /usr/local/bin/stack: line 19: 120419 Killed                  sg libvirtd '/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf --log-file /var/log/nova/nova-compute.log' > /dev/null 2>&1

  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:36.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:24:37.000000 | -               |
  | 8  | nova-compute     | SZX1000291919 | nova     | enabled | down  | 2017-05-22T03:23:38.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step3. Delete the service from DB:

  root@SZX1000291919:/var/log/nova# nova service-delete 8
  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:16.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:25:17.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step4. Start the compute service again:
  root@SZX1000291919:/var/log/nova# nova service-list
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | Id | Binary           | Host          | Zone     | Status  | State | Updated_at                 | Disabled Reason |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+
  | 4  | nova-conductor   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:55.000000 | -               |
  | 6  | nova-scheduler   | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 7  | nova-consoleauth | SZX1000291919 | internal | enabled | up    | 2017-05-22T03:48:56.000000 | -               |
  | 9  | nova-cert        | SZX1000291919 | internal | enabled | down  | 2017-05-17T02:50:13.000000 | -               |
  | 10 | nova-compute     | SZX1000291919 | nova     | enabled | up    | 2017-05-22T03:48:57.000000 | -               |
  +----+------------------+---------------+----------+---------+-------+----------------------------+-----------------+

  Step5. Check the hypervisor statistics again; the result is incorrect:

  root@SZX1000291919:/var/log/nova# nova  hypervisor-stats
  +----------------------+-------+
  | Property             | Value |
  +----------------------+-------+
  | count                | 2     |
  | current_workload     | 0     |
  | disk_available_least | 28    |
  | free_disk_gb         | 68    |
  | free_ram_mb          | 13872 |
  | local_gb             | 70    |
  | local_gb_used        | 2     |
  | memory_mb            | 15920 |
  | memory_mb_used       | 2048  |
  | running_vms          | 2     |
  | vcpus                | 16    |
  | vcpus_used           | 2     |
  +----------------------+-------+
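  Note that every figure in Step5 is exactly double its Step1 value,
  because the aggregation sums the soft-deleted record together with the
  live one for the same host. A quick check of that arithmetic, using
  the values from the two transcripts above:

  ```python
  # Step1 (correct: one live record) vs Step5 (same record counted twice).
  step1 = {"count": 1, "disk_available_least": 14, "free_disk_gb": 34,
           "free_ram_mb": 6936, "local_gb": 35, "local_gb_used": 1,
           "memory_mb": 7960, "memory_mb_used": 1024, "running_vms": 1,
           "vcpus": 8, "vcpus_used": 1}
  step5 = {"count": 2, "disk_available_least": 28, "free_disk_gb": 68,
           "free_ram_mb": 13872, "local_gb": 70, "local_gb_used": 2,
           "memory_mb": 15920, "memory_mb_used": 2048, "running_vms": 2,
           "vcpus": 16, "vcpus_used": 2}

  # Summing the duplicate record is equivalent to doubling every metric.
  doubled = {k: 2 * v for k, v in step1.items()}
  ```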

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1692397/+subscriptions