yahoo-eng-team team mailing list archive

[Bug 2048848] [NEW] get_power_state blocked

 

Public bug reported:

Description
-----------
When the network connection to an rbd (RADOS Block Device) storage backend fails, `get_power_state` blocks while querying the power state of a virtual machine. The power state is needed to find the VMs that are still online so they can be migrated. However, the periodic monitoring program `domstats` hangs while accessing the disconnected storage and keeps libvirt's rpc-worker occupied for extended periods. On hosts with multiple virtual machines, calls to the power state interface are delayed as well and cannot be executed immediately.
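The starvation described above can be illustrated with a small, self-contained simulation. This is only a sketch and does not talk to libvirt: a fixed-size thread pool stands in for the rpc-worker pool, and `hung_stats_call` and `quick_state_query` are hypothetical stand-ins for a stuck stats request and a cheap power state query.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def hung_stats_call(release: threading.Event) -> str:
    """Hypothetical stand-in for a stats call stuck on unreachable storage."""
    release.wait()  # blocks until the simulated storage comes back
    return "stats"


def quick_state_query() -> str:
    """Hypothetical stand-in for a cheap power state query."""
    return "running"


def measure_state_query_delay(workers: int, hang_seconds: float) -> float:
    """Return how long a quick query waits behind hung calls in a shared pool."""
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Occupy every worker with a hung stats call.
        hung = [pool.submit(hung_stats_call, release) for _ in range(workers)]
        started = time.monotonic()
        # The quick query is queued behind the hung calls.
        state = pool.submit(quick_state_query)
        # Simulate the storage recovering after hang_seconds.
        threading.Timer(hang_seconds, release.set).start()
        state.result()
        delay = time.monotonic() - started
        for future in hung:
            future.result()
    return delay
```

Calling `measure_state_query_delay(workers=4, hang_seconds=0.5)` returns a delay close to the hang time, even though the query itself does no work. Serving such queries from a separate pool is the idea behind the priority mechanism requested under "Expected result".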

Steps to reproduce
------------------
1. Disconnect the network for the rbd storage.
2. Schedule `domstats` to run every 10 seconds.
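
Step 2 could be scripted roughly as follows. This is a sketch under assumptions: it takes `domstats` to mean the `virsh domstats` command, and `run_domstats` and `poll` are hypothetical helper names, not part of nova or libvirt.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Assumption: the periodic job shells out to `virsh domstats`.
DOMSTATS_COMMAND = ("virsh", "domstats")


def run_domstats(command: Sequence[str] = DOMSTATS_COMMAND) -> float:
    """Run the stats command once and return how long it took in seconds."""
    started = time.monotonic()
    # No timeout is set on purpose: with the storage disconnected the call
    # is expected to hang, which is the behaviour being reproduced.
    subprocess.run(list(command), check=False, capture_output=True)
    return time.monotonic() - started


def poll(action: Callable[[], float], interval: float, rounds: int,
         sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Call action `rounds` times, sleeping `interval` seconds between calls."""
    durations = []
    for index in range(rounds):
        durations.append(action())
        if index < rounds - 1:
            sleep(interval)
    return durations
```

For the interval given in the steps, `poll(run_domstats, interval=10, rounds=6)` would issue six requests about 10 seconds apart and return how long each one took.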

Expected result
---------------
The power state query should go through a higher-priority interface within libvirt, such as `domain.state()`, possibly in conjunction with a priority RPC mechanism like `prio-rpc`. Critical operations, including querying power states and carrying out the necessary migrations, would then be prioritized and could still be executed promptly even while the regular workers are tied up.
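
A minimal sketch of reading the power state through `domain.state()` is shown below. It assumes a libvirt-python style domain object whose `state()` method returns a `[state, reason]` pair. The numeric values mirror libvirt's `virDomainState` enum, while `power_state_name` and `online_domains` are hypothetical helpers and not nova code. Whether this call is actually served ahead of the hung requests depends on libvirt, which the sketch does not verify.

```python
from typing import Dict, Iterable, List, Protocol

# Values mirror libvirt's virDomainState enum.
VIR_DOMAIN_STATE_NAMES: Dict[int, str] = {
    0: "nostate",
    1: "running",
    2: "blocked",
    3: "paused",
    4: "shutdown",
    5: "shutoff",
    6: "crashed",
    7: "pmsuspended",
}


class DomainLike(Protocol):
    """Anything with a libvirt-style state() returning [state, reason]."""

    def state(self) -> List[int]: ...


def power_state_name(domain: DomainLike) -> str:
    """Translate the first element of domain.state() into a readable name."""
    state, _reason = domain.state()
    return VIR_DOMAIN_STATE_NAMES.get(state, "unknown")


def online_domains(domains: Iterable[DomainLike]) -> List[DomainLike]:
    """Return the domains reported as running, i.e. migration candidates."""
    return [d for d in domains if power_state_name(d) == "running"]
```

With a real connection, the objects returned by libvirt-python's domain lookups could be passed in directly; in a test, any object with a matching `state()` method works.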

** Affects: nova
     Importance: Undecided
     Assignee: Yalei Li (chetaiyong)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Yalei Li (chetaiyong)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2048848

Title:
  get_power_state blocked

Status in OpenStack Compute (nova):
  New
