
yahoo-eng-team team mailing list archive

[Bug 1484125] [NEW] storage device remains attached to instance even if attach failed and volume deleted

 

Public bug reported:

If I do the following:

    * create a new nova instance (basic, i.e. just a root disk, no ephemeral)

    * create a new cinder volume (LVM-backed in this case)

    * place a breakpoint in nova/virt/block_device.py [1] between
nova attaching the disk to the instance and marking the volume as in-use
in cinder

    * while nova-compute is paused, stop the cinder API, then continue
nova-compute

The result is that nova-compute gets "ConnectionRefused: Unable to
establish connection to [...]" and errors out of the attach. I then
restart the cinder-api and check the volume status, which is 'attaching'.
If I do 'cinder reset-state <vol>' to return the volume to available (so
I can delete it), or I do a force-delete, the volume goes away along
with the underlying volume, but the disk is still attached to the
instance, i.e. in virsh I still have:

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/disk/by-path/ip-192.168.122.152:3260-iscsi-iqn.2010-10.org.openstack:volume-f9a90f02-9f8b-41af-9010-121a098d5a55-lun-1'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <serial>f9a90f02-9f8b-41af-9010-121a098d5a55</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

Basically, this happens because we don't clean up if cinder API calls
fail. The result is that there is no way to clean up without manual
intervention. This could easily be avoided by simply doing a
volume_api.terminate_connection() if the subsequent volume API calls
fail.

[1]
https://github.com/openstack/nova/blob/master/nova/virt/block_device.py#L278
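
The cleanup suggested above could be sketched roughly like this. This is a hypothetical illustration, not nova's actual code: `attach_with_cleanup`, `FakeVolumeApi` and `FakeVirtDriver` are stand-ins for the real nova virt driver and cinder volume API, reduced to just enough to show the failure path.

```python
class CinderDown(Exception):
    """Stands in for the ConnectionRefused error from the cinder API."""


class FakeVolumeApi:
    """Minimal stand-in for nova's cinder volume API wrapper."""

    def __init__(self, fail_attach=False):
        self.fail_attach = fail_attach
        self.terminated = []

    def attach(self, context, volume_id, instance, mountpoint):
        # Simulates the cinder call that marks the volume 'in-use';
        # this is the step that fails when cinder-api is down.
        if self.fail_attach:
            raise CinderDown("Unable to establish connection")

    def terminate_connection(self, context, volume_id, connector):
        # The cleanup call the bug report says nova should be making.
        self.terminated.append(volume_id)


class FakeVirtDriver:
    """Minimal stand-in for the hypervisor (libvirt) driver."""

    def __init__(self):
        self.attached = []

    def attach_volume(self, connection_info, instance, mountpoint):
        self.attached.append(connection_info['volume_id'])

    def detach_volume(self, connection_info, instance, mountpoint):
        self.attached.remove(connection_info['volume_id'])


def attach_with_cleanup(volume_api, virt_driver, context, instance,
                        volume_id, connector, mountpoint='/dev/vdb'):
    """Attach the disk, then mark the volume in-use in cinder; if the
    cinder call fails, undo the hypervisor attach and terminate the
    connection instead of leaving the disk dangling in the domain XML."""
    connection_info = {'volume_id': volume_id}
    virt_driver.attach_volume(connection_info, instance, mountpoint)
    try:
        volume_api.attach(context, volume_id, instance, mountpoint)
    except Exception:
        # The missing cleanup: don't leave the disk attached in libvirt
        # when the cinder side of the attach never completed.
        virt_driver.detach_volume(connection_info, instance, mountpoint)
        volume_api.terminate_connection(context, volume_id, connector)
        raise
```

With this shape, the failed attach leaves no disk behind in the domain and the initiator connection is torn down, so no manual virsh/iscsiadm cleanup is needed.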

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: sts


-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1484125
