← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1178378] Re: confused baremetal instance thinks its off, is clearly operational

 

** Changed in: tripleo
       Status: Triaged => Fix Committed

** Changed in: tripleo
     Assignee: (unassigned) => Devananda van der Veen (devananda)

** Changed in: tripleo
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1178378

Title:
  confused baremetal instance thinks its off, is clearly operational

Status in OpenStack Compute (Nova):
  Fix Committed
Status in tripleo - openstack on openstack:
  Fix Released

Bug description:
  | 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 | foo  | SHUTOFF | None       | Shutdown    |
                    ctlplane=192.0.2.34 |
  but
  64 bytes from 192.0.2.34: icmp_req=1 ttl=64 time=0.491 ms
  64 bytes from 192.0.2.34: icmp_req=2 ttl=64 time=0.390 ms

  and I'm also connected to the console and it's running as it was ~ 8 hours ago when I went to bed.
   uptime
   14:12:21 up  9:50,  1 user,  load average: 0.00, 0.01, 0.05

  I've marked this critical because once the confusion happens, the next time IPMI polling works, this is logged:
  [instance: bfb6957b-d23d-4716-bc9d-96d2878b738d] Instance is not stopped. Calling the stop 
                    API.

  and nova promptly powers the machine off via IPMI, destroying any
  state on the machine.

  It can be powered back on via 'nova start $UUID'.

  mysql> select * from instances where uuid='7b12b6aa-ea62-4165-82b5-68c1ebdfbb45';
  +---------------------+---------------------+------------+----+-------------+----------------------------------+----------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------------+----------+----------+-------------+----------+-----------+-------+----------+--------+-----------+----------------+---------------------+---------------------+---------------+--------------+---------------------+-------------------+--------+---------+-------------+------------------+---------+--------------------------------------+--------------+------------------+--------------+--------------+--------------+------------+--------------------------+---------------------+----------+------------------+--------------------+-------------------+---------+--------------+-----------+--------------------------------------+---------+
  | created_at          | updated_at          | deleted_at | id | internal_id | user_id                          | project_id                       | image_ref                            | kernel_id                            | ramdisk_id                           | launch_index | key_name | key_data | power_state | vm_state | memory_mb | vcpus | hostname | host   | user_data | reservation_id | scheduled_at        | launched_at         | terminated_at | display_name | display_description | availability_zone | locked | os_type | launched_on | instance_type_id | vm_mode | uuid                                 | architecture | root_device_name | access_ip_v4 | access_ip_v6 | config_drive | task_state | default_ephemeral_device | default_swap_device | progress | auto_disk_config | shutdown_terminate | disable_terminate | root_gb | ephemeral_gb | cell_name | node                                 | deleted |
  +---------------------+---------------------+------------+----+-------------+----------------------------------+----------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------------+----------+----------+-------------+----------+-----------+-------+----------+--------+-----------+----------------+---------------------+---------------------+---------------+--------------+---------------------+-------------------+--------+---------+-------------+------------------+---------+--------------------------------------+--------------+------------------+--------------+--------------+--------------+------------+--------------------------+---------------------+----------+------------------+--------------------+-------------------+---------+--------------+-----------+--------------------------------------+---------+
  | 2013-05-09 08:48:15 | 2013-05-09 09:05:26 | NULL       | 57 |        NULL | baa113f6f7994ddd9c7d86945768616e | 0d4df5d4fee24f18b8b5f425eb81e0c4 | 0b6e64a9-4867-4742-a964-f52951c93123 | 615727b1-07d2-4071-b5f4-6ee432972df9 | 9e5d7470-f3ed-4604-8fe7-894822d86016 |            0 | NULL     | NULL     |           4 | stopped  |       512 |     1 | foo      | ubuntu | NULL      | r-kunwlpa2     | 2013-05-09 08:48:15 | 2013-05-09 08:54:59 | NULL          | foo          | foo                 | NULL              |      0 | NULL    | ubuntu      |                6 | NULL    | 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 | NULL         | NULL             | NULL         | NULL         |              | NULL       | NULL                     | NULL                |        0 |             NULL |                  0 |                 0 |      10 |            0 | NULL      | 96deccd5-0ad9-4bb5-979b-009bebac52fc |       0 |
  +---------------------+---------------------+------------+----+-------------+----------------------------------+----------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------------+----------+----------+-------------+----------+-----------+-------+----------+--------+-----------+----------------+---------------------+---------------------+---------------+--------------+---------------------+-------------------+--------+---------+-------------+------------------+---------+--------------------------------------+--------------+------------------+--------------+--------------+--------------+------------+--------------------------+---------------------+----------+------------------+--------------------+-------------------+---------+--------------+-----------+--------------------------------------+---------+

  select * from nova_bm.bm_nodes where instance_uuid='7b12b6aa-ea62-4165-82b5-68c1ebdfbb45';
  +---------------------+---------------------+------------+---------+----+------+-----------+----------+-------------+---------------+-------------+--------------+------------------+--------------------------------------+------------+---------------+------------------------------------------------+-------------------------------------------------------+----------------------------------+---------+---------+--------------------------------------+---------------+
  | created_at          | updated_at          | deleted_at | deleted | id | cpus | memory_mb | local_gb | pm_address  | pm_user       | pm_password | service_host | prov_mac_address | instance_uuid                        | task_state | terminal_port | image_path                                     | pxe_config_path                                       | deploy_key                       | root_mb | swap_mb | uuid                                 | instance_name |
  +---------------------+---------------------+------------+---------+----+------+-----------+----------+-------------+---------------+-------------+--------------+------------------+--------------------------------------+------------+---------------+------------------------------------------------+-------------------------------------------------------+----------------------------------+---------+---------+--------------------------------------+---------------+
  | 2013-05-09 00:41:48 | 2013-05-09 08:54:59 | NULL       |       0 | 15 |   24 |     98304 |     2048 | 10.10.16.45 | Administrator | 42560532    | ubuntu       | NULL             | 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 | active     |          NULL | /var/lib/nova/instances/instance-00000039/disk | /tftpboot/7b12b6aa-ea62-4165-82b5-68c1ebdfbb45/config | RDC5IT9ZQL0DPTYM0KZF5VCI22Q2K4GE |   10240 |       1 | 96deccd5-0ad9-4bb5-979b-009bebac52fc | foo           |
  +---------------------+---------------------+------------+---------+----+------+-----------+----------+-------------+---------------+-------------+--------------+------------------+--------------------------------------+------------+---------------+------------------------------------------------+-------------------------------------------------------+----------------------------------+---------+---------+--------------------------------------+---------------+

  select * from compute_nodes where compute_nodes.hypervisor_hostname='96deccd5-0ad9-4bb5-979b-009bebac52fc';
  +---------------------+---------------------+------------+----+------------+-------+-----------+----------+------------+----------------+---------------+-----------------+--------------------+---------------+----------------------+-------------+--------------+------------------+-------------+--------------------------------------+---------+
  | created_at          | updated_at          | deleted_at | id | service_id | vcpus | memory_mb | local_gb | vcpus_used | memory_mb_used | local_gb_used | hypervisor_type | hypervisor_version | cpu_info      | disk_available_least | free_ram_mb | free_disk_gb | current_workload | running_vms | hypervisor_hostname                  | deleted |
  +---------------------+---------------------+------------+----+------------+-------+-----------+----------+------------+----------------+---------------+-----------------+--------------------+---------------+----------------------+-------------+--------------+------------------+-------------+--------------------------------------+---------+
  | 2013-05-09 00:42:45 | 2013-05-09 18:46:06 | NULL       | 16 |          5 |    24 |     98304 |     2048 |          1 |            512 |            10 | baremetal       |                  1 | baremetal cpu |                 NULL |       97792 |         2038 |                0 |           1 | 96deccd5-0ad9-4bb5-979b-009bebac52fc |       0 |
  +---------------------+---------------------+------------+----+------------+-------+-----------+----------+------------+----------------+---------------+-----------------+--------------------+---------------+----------------------+-------------+--------------+------------------+-------------+--------------------------------------+---------+
  1 row in set (0.00 sec)

   grep 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 /var/log/upstart/nova-compute.log
  2013-05-09 08:48:15,392.392 19736 AUDIT nova.compute.manager [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Starting instance...
  2013-05-09 08:48:15,924.924 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Attempting claim: memory 512 MB, disk 10 GB, VCPUs 1
  2013-05-09 08:48:15,924.924 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Total Memory: 98304 MB, used: 0 MB
  2013-05-09 08:48:15,925.925 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Memory limit: 98304 MB, free: 98304 MB
  2013-05-09 08:48:15,925.925 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Total Disk: 2048 GB, used: 0 GB
  2013-05-09 08:48:15,925.925 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Disk limit not specified, defaulting to unlimited
  2013-05-09 08:48:15,926.926 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Total CPU: 24 VCPUs, used: 0 VCPUs
  2013-05-09 08:48:15,926.926 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] CPU limit not specified, defaulting to unlimited
  2013-05-09 08:48:15,926.926 19736 AUDIT nova.compute.claims [req-9671f69f-b656-4bf7-8acf-5186d721b75e baa113f6f7994ddd9c7d86945768616e 0d4df5d4fee24f18b8b5f425eb81e0c4] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Claim successful
  2013-05-09 08:53:51,373.373 19736 INFO nova.virt.baremetal.pxe [-] PXE deploy started for instance 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45
  2013-05-09 08:54:59,414.414 19736 INFO nova.virt.baremetal.pxe [-] PXE deploy completed for instance 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45
  2013-05-09 09:05:25,370.370 19736 WARNING nova.compute.manager [-] [instance: 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45] Instance shutdown by itself. Calling the stop API.

  # zgrep 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 /var/log/upstart/nova-scheduler.log.2.gz
  # zgrep 7b12b6aa-ea62-4165-82b5-68c1ebdfbb45 /var/log/upstart/nova-scheduler.log.1.gz
  #

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1178378/+subscriptions