
openstack team mailing list archive

Re: Rebuild instance from failed host

 

Hi Paul,



I agree that changing the image/flavor is not truly required, but as I said
below, it is not the primary intended use case. The use case is that we want
to move an instance, with the same volume/IP/identity, to another host when
the original host has failed.



--Shyam



*From:* Paul Voccio [mailto:paul.voccio@xxxxxxxxxxxxx]
*Sent:* Tuesday, November 29, 2011 11:58 AM
*To:* Shyam Kaushik; openstack@xxxxxxxxxxxxxxxxxxx
*Subject:* Re: [Openstack] Rebuild instance from failed host



Hi Shyam,



The initial use case sounds reasonable, but I'm not sure I see the need to
change the flavor and image ref. Isn't that essentially a new instance with
the same IP and volume? Is there a real need for that? I would think that
adding logic to rebuild so that, when the host has failed, it knows to
rebuild on another host would keep in line with the 'rebuild' logic. If it
changes all of the data about the instance, it doesn’t feel much like a
'rebuild' anymore.



Just my thoughts.

pvo



*From: *Shyam Kaushik <shyam@xxxxxxxxxxxxxxxxx>
*Date: *Tue, 29 Nov 2011 11:34:08 +0530
*To: *<openstack@xxxxxxxxxxxxxxxxxxx>
*Subject: *[Openstack] Rebuild instance from failed host



Hi Folks,



Today in OpenStack, the “rebuild” operation tears down a running instance and
sets up a fresh instance in its place on the same host. The “resize”
operation migrates the underlying instance disk to another physical host and
spawns the instance there. However, both operations require the origin host
that was running the instance to be up. If the host has failed (possibly
irrecoverably, e.g. if the root FS is corrupted), we cannot recover the
instance, and all operations on that instance fail.



We want to introduce a new “rebuild instance from failed host” operation
that rebuilds the instance on another host with the same properties
(instance-id, name, network info, metadata, volume attachments) and marks
the old instance on the failed host for cleanup. When the failed host comes
back up, it clears its cache for the old instance. This operation is
essentially a modified form of today’s “rebuild”, one that allows the
instance to be rebuilt even if the underlying host has failed.



Essentially, “rebuild instance from failed host” will perform the following
steps:

1. Try to terminate the running instance on the existing host. If that is
not possible, create a migration record.

2. Change “host” for the instance to a new host (picked by the scheduler)
and spawn the instance on that host (with volumes attached and networks
connected as they were for the original instance).

3. Optionally, during this procedure, allow the instance flavor to be
changed and a different “image reference” to be given for it to boot from
(this could be used to upgrade the OS image of the instance).

4. Whenever the failed host comes back up, it reads through the migration
records (as part of init_host), clears its cache and marks the migration
complete.
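
The steps above could be sketched roughly as follows. This is only a toy, in-memory illustration of the proposed flow and is not Nova code: the names Cloud, Instance, MigrationRecord, pick_host, rebuild_from_failed_host and the "pending-cleanup" status are all hypothetical, and the init_host method here merely mimics the role the proposal gives to init_host.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Instance:
    instance_id: str
    host: str
    flavor: str
    image_ref: str
    volumes: List[str] = field(default_factory=list)
    networks: List[str] = field(default_factory=list)


@dataclass
class MigrationRecord:
    instance_id: str
    source_host: str
    dest_host: str
    status: str = "pending-cleanup"  # hypothetical status name


class Cloud:
    """Toy stand-in for the compute service (hypothetical, not Nova)."""

    def __init__(self, hosts_up: Dict[str, bool]):
        self.hosts_up = dict(hosts_up)  # host name -> reachable?
        self.migrations: List[MigrationRecord] = []

    def pick_host(self, exclude: str) -> str:
        # Stand-in for the scheduler: first reachable host other than `exclude`.
        for host, up in self.hosts_up.items():
            if up and host != exclude:
                return host
        raise RuntimeError("no healthy host available")

    def rebuild_from_failed_host(self, inst: Instance,
                                 flavor: Optional[str] = None,
                                 image_ref: Optional[str] = None) -> Instance:
        old_host = inst.host
        new_host = self.pick_host(exclude=old_host)
        # Step 1: if the old host is unreachable we cannot terminate the
        # instance there, so leave a migration record for later cleanup.
        if not self.hosts_up.get(old_host, False):
            self.migrations.append(
                MigrationRecord(inst.instance_id, old_host, new_host))
        # Step 2: re-home the instance; id, volumes and networks are kept.
        inst.host = new_host
        # Step 3: optional flavor / image change.
        if flavor is not None:
            inst.flavor = flavor
        if image_ref is not None:
            inst.image_ref = image_ref
        return inst

    def init_host(self, host: str) -> List[str]:
        # Step 4: when a failed host returns, clean up stale instances and
        # mark the corresponding migration records complete.
        self.hosts_up[host] = True
        cleaned = []
        for rec in self.migrations:
            if rec.source_host == host and rec.status == "pending-cleanup":
                rec.status = "complete"
                cleaned.append(rec.instance_id)
        return cleaned
```

For example, with Cloud({"host-a": False, "host-b": True}) and an instance on host-a, calling rebuild_from_failed_host moves the instance to host-b with its volumes and networks unchanged, and a later init_host("host-a") completes the pending migration record.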



Note that this procedure could also be used for upgrading image versions and
changing instance flavors even when the origin host is alive, but that is
not the primary intended use case.



The question is: is this a reasonable proposal to take forward? If not, are
there any alternative procedures available to meet the requirement?



If this is a reasonable proposal, I will submit a blueprint and follow up
with an implementation.



Thanks.



--Shyam


