yahoo-eng-team team mailing list archive

[Bug 1403856] [NEW] VMware VCDriver: A node crash, vSphere HA and badly timed _sync_power_states() will shut instances down

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Toni Ylenius <toni.ylenius@xxxxxxxxxxxx>
Date: Thu, 18 Dec 2014 12:15:59 -0000
Reply-to: Bug 1403856 <1403856@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

Release: Icehouse; however, the code in Juno seems to be the same.

When a VMware node crashes, the instances will be restarted on a new
node because of vSphere HA.

If _sync_power_states() is triggered just after the node crash, the VMs are down, so Nova will update the database and print
"Instance shutdown by itself. Calling the stop API."

On the next _sync_power_states() run, Nova will notice that the power state has changed, shut the instances down, and print
"Instance is not stopped. Calling the stop API." This happens because vSphere HA has started the instances in the meantime.

To my understanding, to fix this we need to either
1. change the logic (unfortunately I don't have ideas for how), or
2. add a config option that controls whether we force a stop when an instance is running but is stopped from the database's point of view.
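
The sequence above, together with option 2, can be sketched in a few lines of
Python. This is a simplified model and not Nova's actual code: the Instance
class, the sync_power_state() and simulate() functions, and the force_stop
flag are all hypothetical names invented for illustration. What happens when
the flag is off (the database is updated to match the hypervisor) is also an
assumption; the report does not specify it.

```python
from dataclasses import dataclass

ACTIVE, STOPPED = "active", "stopped"      # state recorded in the database
RUNNING, SHUTDOWN = "running", "shutdown"  # state reported by the hypervisor


@dataclass
class Instance:
    vm_state: str      # what the database believes
    power_state: str   # what the hypervisor reports


def sync_power_state(inst, log, force_stop=True):
    """One simplified sync pass; a stand-in, not Nova's real implementation."""
    if inst.vm_state == ACTIVE and inst.power_state == SHUTDOWN:
        log.append("Instance shutdown by itself. Calling the stop API.")
        inst.vm_state = STOPPED
    elif inst.vm_state == STOPPED and inst.power_state == RUNNING:
        if force_stop:
            log.append("Instance is not stopped. Calling the stop API.")
            inst.power_state = SHUTDOWN  # the stop call powers the VM off
        else:
            # Assumed alternative: trust the hypervisor and fix the database.
            inst.vm_state = ACTIVE


def simulate(force_stop):
    """Replay the crash, the HA restart, and two sync passes."""
    log = []
    inst = Instance(ACTIVE, RUNNING)
    inst.power_state = SHUTDOWN              # the host crashes
    sync_power_state(inst, log, force_stop)  # first pass: database -> stopped
    inst.power_state = RUNNING               # vSphere HA restarts the VM
    sync_power_state(inst, log, force_stop)  # second pass
    return inst, log
```

With force_stop=True the model ends with the instance powered off, which is
the behaviour reported here. With force_stop=False the restarted instance
stays up and the database is brought back in line with it.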

** Affects: nova
Importance: Undecided
Status: New


--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1403856

Title:
VMware VCDriver: A node crash, vSphere HA and badly timed
_sync_power_states() will shut instances down

Status in OpenStack Compute (Nova):
New

Bug description:
Release: Icehouse; however, the code in Juno seems to be the same.

When a VMware node crashes, the instances will be restarted on a new
node because of vSphere HA.

If _sync_power_states() is triggered just after the node crash, the VMs are down, so Nova will update the database and print
"Instance shutdown by itself. Calling the stop API."

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1403856/+subscriptions

Follow ups

[Bug 1403856] Re: VMware VCDriver: A node crash, vSphere HA and badly timed _sync_power_states() will shut instances down
From: Markus Zoeller (markus_z), 2016-07-05
[Bug 1403856] [NEW] VMware VCDriver: A node crash, vSphere HA and badly timed _sync_power_states() will shut instances down
From: Toni Ylenius, 2014-12-18
