yahoo-eng-team team mailing list archive

[Bug 1403856] [NEW] VMware VCDriver: A node crash, vSphere HA and badly timed _sync_power_states() will shut instances down

 

Public bug reported:

Release: Icehouse; the code in Juno appears to be the same.

When a VMware node crashes, the instances will be restarted on a new
node because of vSphere HA.

If _sync_power_states() is triggered just after the node crash, the VMs are down, so Nova updates the database and prints
"Instance shutdown by itself. Calling the stop API."

On the next _sync_power_states() run, Nova notices that the power state has changed, shuts the instances down, and prints
"Instance is not stopped. Calling the stop API." This happens because vSphere HA has restarted the instances in the meantime.

As I understand it, fixing this requires either
1. changing the logic (unfortunately I don't have ideas), or
2. adding a config option that controls whether Nova force-stops an instance that is stopped from the database's point of view.
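
The sequence described above can be sketched as a small simulation. This is an illustration only, not Nova's actual code: the names Instance, Power, sync_power_state, simulate and the force_stop_on_mismatch flag are hypothetical, and what Nova should do when the flag is off (here: trust the hypervisor and mark the instance active again) is an assumption, since the report leaves that open. The first two log strings are the ones quoted in the report; the third is made up for the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Power(Enum):
    RUNNING = "running"
    SHUTDOWN = "shutdown"


@dataclass
class Instance:
    name: str
    db_vm_state: str  # "active" or "stopped", as recorded in the database
    hv_power: Power   # what the hypervisor currently reports


def sync_power_state(inst, force_stop_on_mismatch=True):
    """One simplified sync pass for a single instance (hypothetical logic)."""
    if inst.db_vm_state == "active" and inst.hv_power is Power.SHUTDOWN:
        # Pass 1 in the report: the VM is down right after the node crash.
        inst.db_vm_state = "stopped"
        return "Instance shutdown by itself. Calling the stop API."
    if inst.db_vm_state == "stopped" and inst.hv_power is Power.RUNNING:
        if force_stop_on_mismatch:
            # Pass 2 in the report: the HA-restarted VM is powered off again.
            inst.hv_power = Power.SHUTDOWN
            return "Instance is not stopped. Calling the stop API."
        # Assumed alternative behaviour: accept the hypervisor's state.
        inst.db_vm_state = "active"
        return "Instance is running. Updating the database."
    return "In sync."


def simulate(force_stop_on_mismatch):
    """Replay the crash, first sync, HA restart, second sync."""
    inst = Instance("vm-1", "active", Power.RUNNING)
    log = []
    inst.hv_power = Power.SHUTDOWN  # node crashes, VM goes down
    log.append(sync_power_state(inst, force_stop_on_mismatch))
    inst.hv_power = Power.RUNNING   # vSphere HA restarts the VM elsewhere
    log.append(sync_power_state(inst, force_stop_on_mismatch))
    return inst, log


if __name__ == "__main__":
    for flag in (True, False):
        inst, log = simulate(flag)
        print(flag, inst.hv_power.value, inst.db_vm_state, log)
```

With the flag on, the simulated instance ends up shut down even though vSphere HA had recovered it; with the flag off, it stays running and the database is brought back in line.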

** Affects: nova
     Importance: Undecided
         Status: New

** Description changed:

- The release: Icehouse, however the code juno seems to same
+ The release: Icehouse, however the code in juno seems to same
  
  When node crashes and the instances in VMware side will be restarted on
  a new node because of vSphere HA.
  
  If _sync_power_states() is triggered just after node crash, the vms are down and Nova will update the database and print
  "Instance shutdown by itself. Calling the stop API."
  
  On next _sync_power_states() run Nova will notice that power state is changed and will shut the instances down and print
  "Instance is not stopped. Calling the stop API.". This happens because vSphere HA has started instances meanwhile.
  
  To my understanding to fix this we need either
  1. change the logic (I don't have ideas unfortunately) or
  2. add a config option that states if we force stop or not when instance is stopped from the database point of view.

** Description changed:

  The release: Icehouse, however the code in juno seems to same
  
- When node crashes and the instances in VMware side will be restarted on
- a new node because of vSphere HA.
+ When a node crashes, the instances in VMware side will be restarted on a
+ new node because of vSphere HA.
  
  If _sync_power_states() is triggered just after node crash, the vms are down and Nova will update the database and print
  "Instance shutdown by itself. Calling the stop API."
  
  On next _sync_power_states() run Nova will notice that power state is changed and will shut the instances down and print
  "Instance is not stopped. Calling the stop API.". This happens because vSphere HA has started instances meanwhile.
  
  To my understanding to fix this we need either
  1. change the logic (I don't have ideas unfortunately) or
  2. add a config option that states if we force stop or not when instance is stopped from the database point of view.

** Description changed:

  The release: Icehouse, however the code in juno seems to same
  
- When a node crashes, the instances in VMware side will be restarted on a
- new node because of vSphere HA.
+ When a VMware node crashes, the instances will be restarted on a new
+ node because of vSphere HA.
  
  If _sync_power_states() is triggered just after node crash, the vms are down and Nova will update the database and print
  "Instance shutdown by itself. Calling the stop API."
  
  On next _sync_power_states() run Nova will notice that power state is changed and will shut the instances down and print
  "Instance is not stopped. Calling the stop API.". This happens because vSphere HA has started instances meanwhile.
  
  To my understanding to fix this we need either
  1. change the logic (I don't have ideas unfortunately) or
  2. add a config option that states if we force stop or not when instance is stopped from the database point of view.

** Description changed:

  The release: Icehouse, however the code in juno seems to same
  
  When a VMware node crashes, the instances will be restarted on a new
  node because of vSphere HA.
  
  If _sync_power_states() is triggered just after node crash, the vms are down and Nova will update the database and print
  "Instance shutdown by itself. Calling the stop API."
  
- On next _sync_power_states() run Nova will notice that power state is changed and will shut the instances down and print
+ On next _sync_power_states() run, Nova will notice that power state is changed and will shut the instances down and print
  "Instance is not stopped. Calling the stop API.". This happens because vSphere HA has started instances meanwhile.
  
  To my understanding to fix this we need either
  1. change the logic (I don't have ideas unfortunately) or
  2. add a config option that states if we force stop or not when instance is stopped from the database point of view.

** Description changed:

  The release: Icehouse, however the code in juno seems to same
  
  When a VMware node crashes, the instances will be restarted on a new
  node because of vSphere HA.
  
  If _sync_power_states() is triggered just after node crash, the vms are down and Nova will update the database and print
  "Instance shutdown by itself. Calling the stop API."
  
  On next _sync_power_states() run, Nova will notice that power state is changed and will shut the instances down and print
  "Instance is not stopped. Calling the stop API.". This happens because vSphere HA has started instances meanwhile.
  
  To my understanding to fix this we need either
  1. change the logic (I don't have ideas unfortunately) or
- 2. add a config option that states if we force stop or not when instance is stopped from the database point of view.
+ 2. add a config option that states if we force stop or not when an instance is stopped from the database point of view.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1403856

Title:
  VMware VCDriver: A node crash, vSphere HA and badly timed
  _sync_power_states() will shut instances down

Status in OpenStack Compute (Nova):
  New

Bug description:
  Release: Icehouse; the code in Juno appears to be the same.

  When a VMware node crashes, the instances will be restarted on a new
  node because of vSphere HA.

  If _sync_power_states() is triggered just after the node crash, the VMs are down, so Nova updates the database and prints
  "Instance shutdown by itself. Calling the stop API."

  On the next _sync_power_states() run, Nova notices that the power state has changed, shuts the instances down, and prints
  "Instance is not stopped. Calling the stop API." This happens because vSphere HA has restarted the instances in the meantime.

  As I understand it, fixing this requires either
  1. changing the logic (unfortunately I don't have ideas), or
  2. adding a config option that controls whether Nova force-stops an instance that is stopped from the database's point of view.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1403856/+subscriptions

