openstack team mailing list archive
Message #15730
instance evacuation from a failed node (rebuild for HA)
Dear all,
We have submitted a patch https://review.openstack.org/#/c/11086/ to
address https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that
simplifies recovery from a node failure by introducing an API that
recreates an instance on *another* host (similar to the existing instance
'rebuild' operation). The exact semantics of this operation vary
depending on the configuration of the instance and the underlying storage
topology. For example, for a regular 'ephemeral' instance, invoking the API
respawns it from the same image on another node while retaining the same
identity and configuration (e.g. same ID, flavor, IP, attached volumes).
For instances running off shared storage (i.e. the same instance file is
accessible on the target host), the VM is re-created pointing to the
same instance file while retaining the identity and configuration. More
details are available at http://wiki.openstack.org/Evacuate.
Note that the API must be manually invoked today.
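
To make the two cases concrete, here is a minimal sketch of the decision
described above. It is not the code from the patch: the Instance and
EvacuationPlan types, the plan_evacuation function, and the
"image" / "existing-file" labels are all hypothetical names chosen for
illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instance:
    # Hypothetical record; the fields are illustrative, not nova's schema.
    uuid: str
    host: str
    flavor: str
    ip: str
    image_ref: str


@dataclass(frozen=True)
class EvacuationPlan:
    instance: Instance  # same identity and configuration, new host
    disk_source: str    # "image" (respawn) or "existing-file" (shared storage)


def plan_evacuation(instance, target_host, on_shared_storage):
    """Describe how `instance` would be recreated on `target_host`."""
    if target_host == instance.host:
        # Evacuation targets *another* host, unlike a plain rebuild.
        raise ValueError("target host must differ from the failed host")
    # Only the host changes; ID, flavor, IP and image reference are retained.
    moved = replace(instance, host=target_host)
    # Shared storage: reuse the instance file. Ephemeral: respawn from image.
    disk_source = "existing-file" if on_shared_storage else "image"
    return EvacuationPlan(instance=moved, disk_source=disk_source)
```

In this sketch the only field that changes is the host, which mirrors the
point that the instance keeps its identity in both cases; the storage
topology only decides where the disk contents come from.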
In addition, this patch modifies nova-compute such that on startup (e.g.,
after it failed and recovered) it verifies with the DB that it is still
the owner of an instance before starting the VM.
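
The startup check can be pictured roughly as follows. This is a sketch
under assumptions, not the patch itself: split_by_ownership is a
hypothetical helper, and the DB is stood in for by a plain dict mapping
instance UUID to the host recorded as its owner.

```python
def split_by_ownership(local_host, local_instance_uuids, db_host_by_uuid):
    """Partition locally known instances into (owned, not_owned).

    An instance is started only if the DB still records `local_host` as
    its owner; one that was evacuated while this node was down has a
    different host in the DB and must not be started here.
    """
    owned, not_owned = [], []
    for uuid in local_instance_uuids:
        if db_host_by_uuid.get(uuid) == local_host:
            owned.append(uuid)
        else:
            # Evacuated elsewhere, or no longer present in the DB.
            not_owned.append(uuid)
    return owned, not_owned


# Example: node1 comes back after "vm-b" was evacuated to node2.
db = {"vm-a": "node1", "vm-b": "node2"}
to_start, to_skip = split_by_ownership("node1", ["vm-a", "vm-b"], db)
```

Skipping the instances that are no longer owned avoids running the same
VM on two hosts once the failed node recovers.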
It would be great to hear whether people think such a capability is
important enough to push into Folsom, despite the short runway until F3.
Any other thoughts or recommendations regarding this capability would also
be highly appreciated.
Thanks,
Alex
====================================================================================================
Alex Glikson
Manager, Cloud Operating System Technologies, IBM Haifa Research Lab
http://w3.haifa.ibm.com/dept/stt/cloud_sys.html |
https://www.research.ibm.com/haifa/dept/stt/cloud_sys.shtml
Email: glikson@xxxxxxxxxx | Phone: +972-4-8281085 | Mobile:
+972-54-6466667 | Fax: +972-4-8296112