← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1937292] Re: All overcloud VM's powered off on hypervisor when nova_libvirt is restarted

 

Marking this as incomplete for Nova and adding TripleO.

The n-cpu logs just show the compute service reacting to the instances
being stopped *already* on the host:

2021-07-22 16:31:30.930 3 WARNING nova.compute.manager [-] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state:
1, current VM power_state: 4

1 == RUNNING
4 == SHUTDOWN

https://github.com/openstack/nova/blob/b241663b8929b638a1795c5cc2f859b103a1d468/nova/objects/fields.py#L1026-L1045



** Changed in: nova
       Status: New => Incomplete

** Also affects: tripleo
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1937292

Title:
  All overcloud VM's powered off on hypervisor when nova_libvirt is
  restarted

Status in OpenStack Compute (nova):
  Incomplete
Status in tripleo:
  New

Bug description:
  Description:

  Using TripleO. Noted that all VM's on a Hypervisor are powered off
  during the overcloud deployment. (I only have one Hypervisor sorry, I
  can't tell you if it would happen to more than one hypervisor).

  Seems to happen when the nova_libvirt container is restarted.

  Environment:
  TripleO - Master
  # podman exec -it nova_libvirt rpm -qa | grep nova
  python3-nova-23.1.0-0.20210625160814.1f6c351.el8.noarch
  openstack-nova-compute-23.1.0-0.20210625160814.1f6c351.el8.noarch
  openstack-nova-common-23.1.0-0.20210625160814.1f6c351.el8.noarch
  openstack-nova-migration-23.1.0-0.20210625160814.1f6c351.el8.noarch
  python3-novaclient-17.5.0-0.20210601131008.f431295.el8.noarch

  Reproducer:
  At least for me:
  1. Start a VM
  2. Restart tripleo_nova_libvirt.service:
  systemctl restart tripleo_nova_libvirt.service
  3. All VM's are stopped

  Relevant logs:
  2021-07-22 16:31:05.532 3 DEBUG nova.compute.manager [req-19a38d0b-e019-472b-95c4-03c796040767 d2ab1d5792604ba094af82d7447e88cf c4740b2aba4147adb7f101a2782003c3 - default default] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] No wait
  ing events found dispatching network-vif-plugged-d9b29fef-cd87-41db-ba79-8b8c65b74efb pop_instance_event /usr/lib/python3.6/site-packages/nova/compute/manager.py:319
  2021-07-22 16:31:05.532 3 WARNING nova.compute.manager [req-19a38d0b-e019-472b-95c4-03c796040767 d2ab1d5792604ba094af82d7447e88cf c4740b2aba4147adb7f101a2782003c3 - default default] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Recei
  ved unexpected event network-vif-plugged-d9b29fef-cd87-41db-ba79-8b8c65b74efb for instance with vm_state active and task_state None.
  2021-07-22 16:31:30.583 3 DEBUG nova.compute.manager [req-7be814ae-0e3d-4631-8a4c-348ead46c213 - - - - -] Triggering sync for uuid b28cc3ae-6442-40cf-9d66-9d4938a567c7 _sync_power_states /usr/lib/python3.6/site-packages/nova/compute/man
  ager.py:9695                 
  2021-07-22 16:31:30.589 3 DEBUG oslo_concurrency.lockutils [-] Lock "b28cc3ae-6442-40cf-9d66-9d4938a567c7" acquired by "nova.compute.manager.ComputeManager._sync_power_states.<locals>._sync.<locals>.query_driver_power_state_and_sync" ::
   waited 0.000s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:359
  2021-07-22 16:31:30.746 3 INFO nova.compute.manager [-] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). Updating power_
  state in the DB to match the hypervisor.                         
  2021-07-22 16:31:30.930 3 WARNING nova.compute.manager [-] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state:
  1, current VM power_state: 4                                       
  2021-07-22 16:31:30.931 3 DEBUG nova.compute.api [-] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Going to try to stop instance force_stop /usr/lib/python3.6/site-packages/nova/compute/api.py:2584
  2021-07-22 16:31:31.135 3 DEBUG oslo_concurrency.lockutils [-] Lock "b28cc3ae-6442-40cf-9d66-9d4938a567c7" released by "nova.compute.manager.ComputeManager._sync_power_states.<locals>._sync.<locals>.query_driver_power_state_and_sync" ::
   held 0.547s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:371
  2021-07-22 16:31:31.161 3 DEBUG oslo_concurrency.lockutils [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] Lock "b28cc3ae-6442-40cf-9d66-9d4938a567c7" acquired by "nova.compute.manager.ComputeManager.stop_instance.<locals>.do_stop_
  instance" :: waited 0.000s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:359
  2021-07-22 16:31:31.162 3 DEBUG nova.compute.manager [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Checking state _get_power_state /usr/lib/python3.6/site-packages/nova/compute/man
  ager.py:1561                                             
  2021-07-22 16:31:31.165 3 DEBUG nova.compute.manager [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Stopping instance; current vm_state: active, current task_state: powering-off, cu
  rrent DB power_state: 4, current VM power_state: 4 do_stop_instance /usr/lib/python3.6/site-packages/nova/compute/manager.py:3095
  2021-07-22 16:31:31.166 3 INFO nova.compute.manager [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Instance is already powered off in the hypervisor when stop is called.
  2021-07-22 16:31:31.168 3 DEBUG nova.objects.instance [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] Lazy-loading 'flavor' on Instance uuid b28cc3ae-6442-40cf-9d66-9d4938a567c7 obj_load_attr /usr/lib/python3.6/site-packages/nova/$
  bjects/instance.py:1103           
  2021-07-22 16:31:31.236 3 DEBUG nova.objects.instance [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] Lazy-loading 'metadata' on Instance uuid b28cc3ae-6442-40cf-9d66-9d4938a567c7 obj_load_attr /usr/lib/python3.6/site-packages/nov$
  /objects/instance.py:1103                                     
  2021-07-22 16:31:31.302 3 INFO nova.virt.libvirt.driver [req-a87509b3-9674-49df-ad1f-9f8967871e10 - - - - -] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Instance already shutdown.
  2021-07-22 16:31:31.308 3 INFO nova.virt.libvirt.driver [-] [instance: b28cc3ae-6442-40cf-9d66-9d4938a567c7] Instance destroyed successfully.

  
  I'll attach the full debug logs as well.

  I couldn't see anything on the Controller node relevant to it. Just
  seems to be the compute node, but I can capture whatever we need.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1937292/+subscriptions



References