yahoo-eng-team team mailing list archive

[Bug 1838392] [NEW] BDMNotFound raised and stale block devices left over when simultaneously rebooting and deleting an instance

 

Public bug reported:

Description
===========
Simultaneous requests to reboot and delete an instance _will_ race, as
only the call to delete takes a lock against the instance.uuid.

One outcome seen in the wild with the Libvirt driver is that the soft
reboot request eventually turns into a hard reboot, which reconnects
volumes that the delete request has already disconnected. The delete
request then unmaps these volumes on the Cinder side, leaving stale
devices on the host. In addition, the reboot operation raises
BDMNotFound because the delete operation has already deleted the BDMs.
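
The interleaving described above can be sketched with a toy model. This
is not nova code: the Instance class, delete(), reboot() and simulate()
are hypothetical stand-ins, and the take_lock flag only illustrates what
changes if the reboot path were to take the same lock as delete. A delete
request is forced to arrive in the middle of the reboot so that the
ordering is deterministic.

```python
import contextlib
import threading


class BDMNotFound(Exception):
    """Stand-in for the exception named in the report."""


class Instance:
    """Hypothetical model of an instance with attached volumes."""

    def __init__(self, uuid, volumes):
        self.uuid = uuid
        self.lock = threading.Lock()       # models the lock on instance.uuid
        self.bdms = set(volumes)           # block device mappings (database)
        self.host_devices = set(volumes)   # devices connected on the host


def delete(instance):
    # Delete always takes the lock, as the report describes.
    with instance.lock:
        instance.host_devices.clear()      # disconnect the volumes
        instance.bdms.clear()              # delete the BDMs


def reboot(instance, take_lock):
    # Assumption: take_lock=False models the reported behaviour,
    # take_lock=True models a reboot that shares the delete lock.
    guard = instance.lock if take_lock else contextlib.nullcontext()
    with guard:
        volumes = set(instance.bdms)       # snapshot taken before the delete
        # Force the race: a delete request arrives mid-reboot.
        deleter = threading.Thread(target=delete, args=(instance,))
        deleter.start()
        # Returns once delete has finished, or times out if delete is
        # still blocked waiting for the lock held by this reboot.
        deleter.join(timeout=0.2)
        instance.host_devices |= volumes   # hard reboot reconnects volumes
        missing = volumes - instance.bdms  # BDMs the reboot can no longer find
    deleter.join()
    if missing:
        raise BDMNotFound(sorted(missing))


def simulate(take_lock):
    """Run one reboot/delete race; return (error, stale host devices)."""
    instance = Instance("hypothetical-uuid", {"vol-1"})
    try:
        reboot(instance, take_lock)
        error = None
    except BDMNotFound as exc:
        error = exc
    stale = instance.host_devices - instance.bdms
    return error, stale


if __name__ == "__main__":
    print("reboot without lock:", simulate(take_lock=False))
    print("reboot with lock:   ", simulate(take_lock=True))
```

In this model the unlocked reboot ends with BDMNotFound and one device
left connected on the host with no matching BDM, which mirrors the actual
result reported below. With the shared lock the delete waits for the
reboot to finish and nothing is left behind.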

Steps to reproduce
==================
$ nova reboot $instance && nova delete $instance

Expected result
===============
The instance reboots and is then deleted without any errors raised.

Actual result
=============
BDMNotFound raised and stale block devices left over.

Environment
===========
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

1599e3cf68779eafaaa2b13a273d3bebd1379c19 / 19.0.0.0rc1-992-g1599e3cf68

2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
   What's the version of that?

   Libvirt + QEMU/kvm

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

   N/A

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

   N/A

Logs & Configs
==============

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: libvirt reboot volumes

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838392

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838392/+subscriptions

