yahoo-eng-team team mailing list archive

[Bug 1542039] [NEW] nova should not reschedule an instance that has already been deleted

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Chris Friesen <chris.friesen@xxxxxxxxxxxxx>
Date: Thu, 04 Feb 2016 21:45:48 -0000
Reply-to: Bug 1542039 <1542039@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

Public bug reported:

I'm investigating an issue where an instance with a large disk and an
attached cinder volume was booted in a stable/kilo OpenStack setup with
the DiskFilter disabled.

The timeline looks like this:
1. The scheduler picks an initial compute node.
2. nova attempts to boot the instance on that compute node; it runs out of disk space and the instance gets rescheduled.
3. The scheduler picks another compute node.
4. The user requests instance deletion.
5. The user requests cinder volume deletion.
6. nova attempts to boot the instance on the second compute node; it runs out of disk space and the instance gets rescheduled.
7. The scheduler picks a third compute node.
8. nova attempts to boot the instance on the third compute node and runs into problems due to the missing cinder volume.

The issue I want to address in this bug is whether it makes sense to reschedule the instance when the instance has already been deleted.
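
As a rough illustration of the kind of check being asked about, here is a minimal sketch: before rescheduling, look at the instance's current state and give up if a delete has been requested. The Instance dataclass and the should_reschedule() helper are hypothetical stand-ins made up for this example; they are not nova's actual objects or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record (not nova's Instance object)."""
    uuid: str
    task_state: Optional[str] = None
    deleted: bool = False


def should_reschedule(instance: Instance) -> bool:
    """Return False when a delete has been requested or has already completed."""
    if instance.deleted:
        return False
    if instance.task_state == "deleting":
        return False
    return True
```

A check like this only narrows the window: a delete request can still arrive between the check and the later save, so it would not be sufficient on its own.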

Also, instance deletion sets the task_state to 'deleting' early on.  In
compute.manager.ComputeManager._do_build_and_run_instance(), if we
decide to reschedule then nova-compute will set the task_state to
'scheduling' and then save the instance, which I think could overwrite
the 'deleting' state in the DB.
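
The suspected overwrite can be reproduced with a toy model. This is only a sketch of the lost-update pattern under the assumption that nova-compute works from an in-memory copy loaded before the delete request arrived; the dict-based "database" and the load()/save() helpers are hypothetical and do not correspond to nova's DB API.

```python
import copy

# Toy "database": uuid -> row. Hypothetical, stands in for the instances table.
db = {"inst-1": {"task_state": "spawning"}}


def load(uuid):
    """Return a private copy of the row, like a service's in-memory object."""
    return copy.deepcopy(db[uuid])


def save(uuid, row):
    """Unconditional save: writes the caller's copy over whatever is stored."""
    db[uuid] = copy.deepcopy(row)


# nova-compute loads the instance while the build is in progress.
compute_copy = load("inst-1")

# Meanwhile the delete request is handled and recorded.
api_copy = load("inst-1")
api_copy["task_state"] = "deleting"
save("inst-1", api_copy)

# The build fails; compute decides to reschedule using its stale copy.
compute_copy["task_state"] = "scheduling"
save("inst-1", compute_copy)

# The 'deleting' state written by the delete path has been lost.
final_state = db["inst-1"]["task_state"]
print(final_state)
```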

So...would it make sense to have nova-compute put an
"expected_task_state" on the instance.save() call that sets the
'scheduling' task_state?
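
To make the proposal concrete, here is a minimal sketch of a save that is conditional on the stored task_state. ToyInstanceStore, UnexpectedTaskState and try_reschedule() are hypothetical names invented for this example, and the choice of "spawning" as the expected state is only illustrative. In a real database the comparison and the write would need to happen atomically (for example in a single conditional UPDATE); this single-threaded toy does not show that part.

```python
class UnexpectedTaskState(Exception):
    """Hypothetical error: the stored task_state was not one of those expected."""


class ToyInstanceStore:
    """Hypothetical in-memory store; stands in for the DB layer, not nova's API."""

    def __init__(self):
        self._rows = {}

    def create(self, uuid, task_state):
        self._rows[uuid] = {"task_state": task_state}

    def get_task_state(self, uuid):
        return self._rows[uuid]["task_state"]

    def save_task_state(self, uuid, new_state, expected_task_state=None):
        """Write new_state only if the stored state is one of the expected ones.

        With expected_task_state=None the write is unconditional.
        """
        current = self._rows[uuid]["task_state"]
        if expected_task_state is not None and current not in expected_task_state:
            raise UnexpectedTaskState(
                f"expected one of {expected_task_state}, found {current!r}")
        self._rows[uuid]["task_state"] = new_state


def try_reschedule(store, uuid):
    """Return True if the instance was marked for rescheduling, False if skipped."""
    try:
        # Illustrative assumption: only an instance still being built qualifies.
        store.save_task_state(uuid, "scheduling",
                              expected_task_state=("spawning",))
    except UnexpectedTaskState:
        # Something else (such as a delete) changed the state; do not reschedule.
        return False
    return True
```

With this shape, an instance whose stored task_state has already moved to 'deleting' is left alone, and the caller can abort the build instead of handing it to another compute node.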

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: compute

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1542039
