yahoo-eng-team team mailing list archive
Message #93273
[Bug 2048848] [NEW] get_power_state blocked
Public bug reported:
Description
-----------
When the network connection to rbd (RADOS Block Device) storage fails, `get_power_state` blocks while querying the power state of a virtual machine. The power state is queried in order to find VMs that are still online and migrate them. However, the periodic monitoring job that runs `domstats` hangs while accessing the disconnected storage and keeps libvirt's rpc-worker threads occupied for extended periods. On hosts with multiple virtual machines, the power state query is delayed as well and cannot be executed immediately.
Steps to reproduce
------------------
1. Disconnect the network for the rbd storage.
2. Schedule `domstats` to run every 10 seconds.
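Step 2 could be scripted roughly as follows. This is only a sketch: it assumes the statistics are collected with the `virsh domstats` command, and the helper name `poll_domstats` and its parameters are made up for illustration. The actual monitoring program may call libvirt differently.

```python
import subprocess
import time


def poll_domstats(iterations, interval=10, run=subprocess.run, sleep=time.sleep):
    """Run `virsh domstats` every `interval` seconds and return call durations.

    `run` and `sleep` are injectable (hypothetical parameters) so the loop
    can be exercised without a libvirt host.
    """
    durations = []
    for _ in range(iterations):
        start = time.monotonic()
        # No timeout on purpose: the call is expected to hang once the
        # rbd storage is unreachable, which is the behaviour to observe.
        run(["virsh", "domstats"], stdout=subprocess.DEVNULL, check=False)
        durations.append(time.monotonic() - start)
        sleep(interval)
    return durations
```

Once the storage network is down, the recorded durations should grow, and power state queries issued in parallel can be timed against them.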
Expected result
---------------
`get_power_state` should switch to a higher-priority interface within libvirt, such as `domain.state()`, possibly in conjunction with a priority RPC mechanism such as `prio-rpc`. Critical operations, including querying power states and performing the necessary migrations, would then be prioritized and could still be executed promptly even when the regular workers are occupied.
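A minimal sketch of the proposed lookup, using the libvirt-python `virDomain.state()` call, which returns a `[state, reason]` pair. It assumes that the current code path obtains the state through `domain.info()` and that `state()` is served by libvirt's priority workers; both points should be verified against the nova and libvirt versions in use. The helper name and the state-name mapping are illustrative and not taken from nova.

```python
# Integer values of libvirt's virDomainState enum.
STATE_NAMES = {
    0: "nostate",
    1: "running",
    2: "blocked",
    3: "paused",
    4: "shutdown",
    5: "shutoff",
    6: "crashed",
    7: "pmsuspended",
}


def power_state_via_state(domain):
    """Return the power state name using domain.state() rather than domain.info().

    `domain` is expected to behave like a libvirt.virDomain object.
    """
    state, _reason = domain.state()
    return STATE_NAMES.get(state, "unknown")


# Usage against a real host (requires libvirt-python; the URI and the
# instance name below are examples only):
#   import libvirt
#   conn = libvirt.openReadOnly("qemu:///system")
#   print(power_state_via_state(conn.lookupByName("instance-00000001")))
```

Only the state is read here, so the call does not need the memory and CPU fields that `info()` also returns.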
** Affects: nova
Importance: Undecided
Assignee: Yalei Li (chetaiyong)
Status: New
** Changed in: nova
Assignee: (unassigned) => Yalei Li (chetaiyong)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2048848
Title:
get_power_state blocked
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2048848/+subscriptions