
openstack team mailing list archive

Re: Can't delete instances with "error" status.

 

Here’s the way we’ve approached this:


- A user can always send a delete request for a VM in any state (this is the only action that is always allowed).

- Once a VM has a task_state of “Deleting” (set in the API server), the only action the user can perform on it is delete.

  o Hence at this point we can stop billing for it, and the user shouldn’t have it counted in their quota.

- A common reason for VMs getting stuck in Deleting is that the compute manager is restarted (or fails), so we have added code to the compute manager start-up to check for instances with a task_state of deleting and delete them (this needs to be able to cope with various exceptions if the delete was part way through). Since the manager is restarting, we can be sure that the eventlet that was handling the delete isn’t doing it anymore ;-)

So from the user's perspective we honour their request to delete VMs, make sure they can’t change their mind, and try to clean up eventually as part of the compute manager restart.


- By the same logic we also reset the “Image_Snapshot” and “Image_backup” task_state, as we know they aren’t true anymore.

- It would be possible to also handle other task_states such as “rebuilding” and “rebooting”, but we haven’t tried that.
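
For illustration, the approach above could be sketched roughly as follows. This is not the actual nova code: the Instance class, the billable flag, the destroy callback and the function names are all hypothetical stand-ins, and the state names are simplified.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Hypothetical task_state values, named for illustration only.
DELETING = "deleting"
RESET_ON_RESTART = {"image_snapshot", "image_backup"}


class ActionNotAllowed(Exception):
    """Raised when an action is rejected because of the task_state."""


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record; not the nova model."""
    uuid: str
    vm_state: str = "active"
    task_state: Optional[str] = None
    billable: bool = True


def request_action(instance: Instance, action: str) -> None:
    """API-side check: delete is always allowed, nothing else once deleting."""
    if action == "delete":
        instance.task_state = DELETING
        instance.billable = False  # stop billing and quota accounting here
        return
    if instance.task_state == DELETING:
        raise ActionNotAllowed(f"{instance.uuid} is being deleted")
    instance.task_state = action


def recover_on_startup(instances: Iterable[Instance],
                       destroy: Callable[[Instance], None]) -> List[str]:
    """Start-up pass: finish interrupted deletes, reset stale task states.

    Returns the uuids of the instances whose delete was completed.
    """
    deleted = []
    for inst in instances:
        if inst.task_state == DELETING:
            try:
                destroy(inst)
            except Exception as exc:
                # The earlier delete may have been part way through, so a
                # failure is reported and left for the next restart.
                print(f"delete of {inst.uuid} failed: {exc}")
                continue
            inst.vm_state = "deleted"
            inst.task_state = None
            deleted.append(inst.uuid)
        elif inst.task_state in RESET_ON_RESTART:
            # The snapshot/backup cannot still be running after a restart.
            inst.task_state = None
    return deleted
```

The point of the sketch is that the user-visible outcome (delete accepted, no further actions, no billing) is decided at request time, while the actual cleanup can lag behind until the manager next starts.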

Phil

From: openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx [mailto:openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Guilherme Birk
Sent: 22 March 2012 13:24
To: Openstack Mail List
Subject: Re: [Openstack] Can't delete instances with "error" status.

Gabe, responding to your question "Do you know how to reliably reproduce an instance in ERROR state that cannot be deleted?":
In my case, I'm updating the status of the VM to "error" directly on the database. This is just for testing. But even when my VM is running and working fine, when I update it to "error" in the database I can't delete it.
> From: gabe.westmaas@xxxxxxxxxxxxx
> To: cp16net@xxxxxxxxx
> Date: Thu, 22 Mar 2012 06:49:15 +0000
> CC: openstack@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Openstack] Can't delete instances with "error" status.
>
> There are definitely lots of cases where deleting an instance in error state works fine, and I’d like to know about the cases where it doesn’t. They do count against quota as well, so that’s a problem!
>
> I can see value in keeping an instance around, if the operations team has configured it to do so. However, it seems like the end user has asked for it to be deleted and to them it should appear to be deleted. Johannes had an idea a while ago to allow the operations team to specify a project ID that deleted servers that match certain criteria should be moved to. If the delete finishes up fine, then it's no problem: the delete is done, the customer is happy and the ops account is empty. If it fails for some reason, there is manual cleanup to be done, but that should be on the operator of the deployment, not the user. I think it's critical for anything like this to be configurable, as public clouds and private clouds have different privacy and retention concerns, I would guess.
>
> Do we have cases where we can reliably reproduce this issue? If it's happening every time on some deployments there is a very serious problem!
>
> Gabe
>
>
> From: Craig Vyvial [mailto:cp16net@xxxxxxxxx]
> Sent: Thursday, March 22, 2012 2:20 AM
> To: Gabe Westmaas
> Cc: Yong Sheng Gong; Openstack Mail List
> Subject: Re: [Openstack] Can't delete instances with "error" status.
>
> My understanding is that you would not always want a user to be able to delete an instance in an error state, so that an operations person can figure out what went wrong. I think instances in an error state do not count against the quota, but I agree that they clutter up the API calls to list servers.
>
> I have noticed this with my team and written code around this case to force the instance into an 'active' state before sending nova the delete call if the instance was in an 'error' or 'suspended' state.
>
> -Craig
>
> On Thu, Mar 22, 2012 at 1:02 AM, Gabe Westmaas <gabe.westmaas@xxxxxxxxxxxxx> wrote:
> Instances in deleted status can normally be deleted, but there is definitely a bug to file here somewhere, possibly more than one.  A common reason I have seen is that the node the instance lives on is no longer operating correctly, so the compute manager never gets the delete request, so it doesn’t finish.  If we can narrow the cases where this happens, we can file bugs and decide how to resolve them, although there may be some additional work beyond just a developer picking up the bug and working on it to decide what should happen!
>
> Do you know how to reliably reproduce an instance in ERROR state that cannot be deleted?
>
> Gabe
>
> From: openstack-bounces+gabe.westmaas=rackspace.com@xxxxxxxxxxxxxxxxxxx [mailto:openstack-bounces+gabe.westmaas=rackspace.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Yong Sheng Gong
> Sent: Thursday, March 22, 2012 12:58 AM
> To: Openstack Mail List
> Subject: Re: [Openstack] Can't delete instances with "error" status.
>
> Why not allow "nova delete" and "euca-terminate" to delete the instance with "error" status?
>
> Yong Sheng Gong
>
> -----openstack-bounces+gongysh=cn.ibm.com@xxxxxxxxxxxxxxxxxxx wrote: -----
> To: Guilherme Birk <guibirk@xxxxxxxxxxx>, Openstack Mail List <openstack@xxxxxxxxxxxxxxxxxxx>
> From: Mandar Vaze <mandar.vaze@xxxxxxxxxxxx>
> Sent by: openstack-bounces+gongysh=cn.ibm.com@xxxxxxxxxxxxxxxxxxx
> Date: 03/22/2012 12:26PM
> Subject: Re: [Openstack] Can't delete instances with "error" status.
> I agree the user shouldn’t have to update the DB.
> There needs to be some periodic cleanup task that “fixes” the vm_state/task_state for “stuck” instances.
>
> -Mandar
>
> From: openstack-bounces+mandar.vaze=vertex.co.in@xxxxxxxxxxxxxxxxxxx [mailto:openstack-bounces+mandar.vaze=vertex.co.in@xxxxxxxxxxxxxxxxxxx] On Behalf Of Guilherme Birk
> Sent: Wednesday, March 21, 2012 6:05 PM
> To: Openstack Mail List
> Subject: Re: [Openstack] Can't delete instances with "error" status.
>
> Is this the only option? I've already done that, but it's kind of "strange" that I have to update the database row every time the script identifies a VM with error.
> ________________________________________
> Date: Wed, 21 Mar 2012 09:39:03 +0800
> Subject: Re: [Openstack] Can't delete instances with "error" status.
> From: mwjpiero@xxxxxxxxx
> To: guibirk@xxxxxxxxxxx
> Update the "instances" table in the "nova" db: set the vm_state of the instance you want to delete to "active" and set task_state=NULL.
> After that, try to use euca-terminate
>
> On Wed, Mar 21, 2012 at 1:24 AM, Guilherme Birk <guibirk@xxxxxxxxxxx> wrote:
> I'm writing a Python script that controls all my virtual machines. Sometimes, when the script identifies an instance with a status of "error", it creates a new instance and tries to delete the old one with curl commands, but I'm not getting any response and the VM isn't deleted. When I execute euca-terminate instance <i-name> I get nothing either. How should I delete instances with error status? I didn't find any way using nova-manage either.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>
>
> --
> Without detachment there is no clarity of purpose; without tranquility there is no reaching far.
