
registry team mailing list archive

[Bug 661214] [NEW] When a node dies, its instances should be marked !running

 

Public bug reported:

It's the owning node's responsibility to change the state of instances,
but if the node dies, this obviously doesn't happen.

There are multiple scenarios here:

1) Nova on the host has crashed, but the VMs are still alive.
2) The machine has died and taken nova and the VMs with it to the grave.
3) Nothing is wrong with either nova or the VMs, but the network connection has been severed, so we can't tell.

For 1) we need a notification mechanism of some sort. A really simple agent (simple, to minimise the potential for crashes in the agent itself) should monitor the components and, in case of failure, either raise an alert or just try restarting nova.
For 2) we need to, at the very least, mark the instances as no longer running. To make this happen, something must look through the list of registered compute nodes, check whether any of them have failed to provide a heartbeat recently, and mark their VMs accordingly.
3) is more involved. We'll need a big discussion about network partitioning (in CAP parlance) and such at some point, and the outcome of that will likely make this pretty straightforward. Here's hoping.
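
To make the idea for 2) concrete, here is a rough sketch of what such a heartbeat sweep might look like. Nothing here is existing nova code: the ComputeNode and Instance classes, the "not_running" state name, the mark_instances_of_dead_nodes function and the 60 second timeout are all made up for illustration. Note that a sweep like this cannot tell scenario 2) apart from scenario 3); a severed network looks exactly like a dead machine from here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumed threshold; a real value would depend on the heartbeat interval.
HEARTBEAT_TIMEOUT = timedelta(seconds=60)


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record."""
    name: str
    state: str = "running"


@dataclass
class ComputeNode:
    """Hypothetical stand-in for a registered compute node."""
    host: str
    last_heartbeat: datetime
    instances: list = field(default_factory=list)


def mark_instances_of_dead_nodes(nodes, now, timeout=HEARTBEAT_TIMEOUT):
    """Mark instances on nodes with a stale heartbeat as not running.

    Returns the hosts that were considered dead, so the caller can
    log them or raise an alert.
    """
    dead_hosts = []
    for node in nodes:
        if now - node.last_heartbeat > timeout:
            dead_hosts.append(node.host)
            for instance in node.instances:
                # Only touch instances that still claim to be running.
                if instance.state == "running":
                    instance.state = "not_running"
    return dead_hosts
```

Something like this would presumably run periodically on a component other than the compute nodes themselves, since the whole point is that the owning node can no longer update the state.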

** Affects: nova
     Importance: Undecided
         Status: New

-- 
When a node dies, its instances should be marked !running
https://bugs.launchpad.net/bugs/661214
You received this bug notification because you are a member of Registry
Administrators, which is subscribed to OpenStack.


