yahoo-eng-team team mailing list archive

[Bug 1132146] [NEW] Summary: Undeletable volumes after live migration (iSCSI)

 

You have been subscribed to a public bug:


Summary: 
  The iSCSI client session is not removed after live migration, leading to undeletable volumes (error_deleting)

Description: 
  When using Cinder with iSCSI as the EBS backend and KVM as the hypervisor, live migrating an instance and then terminating it leads to undeletable volumes
  (volumes stuck in the "error_deleting" state).
  
Problem: 
  After the live migration, the iSCSI session from the source host to the Cinder storage is not removed. When the instance is terminated and its volume deleted, the volume is cleared (dd) and the iSCSI session on the host the instance is currently running on is removed. However, the iSCSI session on the source host of the migration is never logged out, so the target on Cinder (tgtd) cannot be removed because it is still "in use".
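
The condition described above can be checked on a compute host by comparing the open iSCSI sessions with the volumes that local instances still use. The following is only a minimal sketch, not nova's actual code: `parse_sessions`, `stale_sessions` and the `iqns_in_use` argument are hypothetical names, and the regular expression assumes the `iscsiadm -m session` line format shown in the debug output of this report.

```python
import re

# Assumed line format, taken from the `iscsiadm -m session` output in this report:
#   tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-<uuid>
SESSION_RE = re.compile(r"^(\w+): \[(\d+)\] (\S+?),(\d+) (\S+)")

# IQN prefix of the Cinder targets seen in this report.
CINDER_PREFIX = "iqn.2010-10.org.openstack:"


def parse_sessions(output):
    """Turn `iscsiadm -m session` output into a list of session dicts."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_RE.match(line.strip())
        if match:
            sessions.append({
                "sid": int(match.group(2)),
                "portal": match.group(3),
                "iqn": match.group(5),
            })
    return sessions


def stale_sessions(sessions, iqns_in_use, prefix=CINDER_PREFIX):
    """Return Cinder sessions whose target no local instance uses anymore."""
    return [s for s in sessions
            if s["iqn"].startswith(prefix) and s["iqn"] not in iqns_in_use]


if __name__ == "__main__":
    # Output captured on the migration source host after the instance moved away.
    sample = (
        "tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot\n"
        "tcp: [8] 172.16.0.130:3260,1 "
        "iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7\n"
    )
    for session in stale_sessions(parse_sessions(sample), iqns_in_use=set()):
        print(session["sid"], session["iqn"])
```

The prefix filter keeps unrelated sessions, such as the `nexentaboot` boot target, out of the result, so only Cinder volume sessions are ever candidates for a logout.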

Solution: 
  a) Force a logout on the Cinder side with something like the following before deleting the volume: 

     tgtadm --lld iscsi --mode target --op unbind --tid=X -I ALL 
     tgtadm --lld iscsi --mode conn --op delete --tid=X --sid Y --cid 0 
     tgtadm --op delete --mode logicalunit --tid=X --lun 1 
     tgtadm --lld iscsi --mode target --op delete --tid=X
	 
  b) After the migration has succeeded, log out of the iSCSI target on the source host 
  
b) is probably easier and cleaner!

Doing both a) and b) should be very solid and error-proof.
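
To make the two options concrete, here is a small sketch that only builds the command lines and does not run them. The function names are hypothetical; the `tgtadm` arguments mirror option a) as written in this report, and the `iscsiadm` call is the session logout used in the debug section of this report. Where and when nova or cinder should issue these calls is left open.

```python
def target_teardown_commands(tid, sid, cid=0, lun=1):
    """Option a): the tgtadm calls from this report, for the Cinder host.

    tid is the target id (X), sid the leftover session id (Y).
    """
    tid, sid, cid, lun = str(tid), str(sid), str(cid), str(lun)
    return [
        # Stop accepting initiators on the target.
        ["tgtadm", "--lld", "iscsi", "--mode", "target", "--op", "unbind",
         "--tid", tid, "-I", "ALL"],
        # Drop the leftover connection of the stale session.
        ["tgtadm", "--lld", "iscsi", "--mode", "conn", "--op", "delete",
         "--tid", tid, "--sid", sid, "--cid", cid],
        # Remove the LUN that exports the logical volume.
        ["tgtadm", "--op", "delete", "--mode", "logicalunit",
         "--tid", tid, "--lun", lun],
        # Finally remove the target itself.
        ["tgtadm", "--lld", "iscsi", "--mode", "target", "--op", "delete",
         "--tid", tid],
    ]


def initiator_logout_command(sid):
    """Option b): log out of one session on the migration source host."""
    return ["iscsiadm", "-m", "session", "-r", str(sid), "-u"]


if __name__ == "__main__":
    # Values from the debug section: target 2, tgtd session 6, initiator session 8.
    for cmd in target_teardown_commands(tid=2, sid=6):
        print(" ".join(cmd))
    print(" ".join(initiator_logout_command(8)))
```

Returning argument lists rather than shell strings means the commands could later be passed to `subprocess.check_call` without any quoting concerns.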
  
How to Reproduce: 

1) Configure a 2-node cluster with EBS disk (Cinder)
2) Create a VM that boots from a SAN volume
3) Live migrate the instance from HOST1 to HOST2
4) Terminate the instance 
5) Delete the volume of the instance 
6) The volume is now stuck in an undeletable state ("error_deleting")


Debug Info: 
------------

Instance in Question: instance-0000005b 
Hypervisor Hosts: ESX1 / ESX2 (running KVM, not ESX - naming fun :))

root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7

root@esx1:~# virsh list
 Id    Name                           State
----------------------------------------------------
 2     quantum                        running
 8     instance-0000005b              running

 
root@esx1:~# virsh dumpxml instance-0000005b
<domain type='kvm' id='8'>
  <name>instance-0000005b</name>
  <uuid>4bd1426d-4972-4176-8a72-bfccd3c9035b</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-1.2'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/disk/by-path/ip-172.16.0.130:3260-iscsi-iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7-lun-1'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:32:59:24'/>
      <source bridge='qbr6b2a850f-46'/>
      <target dev='vnet3'/>
      <model type='virtio'/>
      <filterref filter='nova-instance-instance-0000005b-fa163e325924'>
        <parameter name='DHCPSERVER' value='192.168.102.3'/>
        <parameter name='IP' value='192.168.102.4'/>
        <parameter name='PROJMASK' value='255.255.255.0'/>
        <parameter name='PROJNET' value='192.168.102.0'/>
      </filterref>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='file'>
      <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5901' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</label>
    <imagelabel>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</imagelabel>
  </seclabel>
</domain>


root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot

root@esx2:~# virsh list
 Id    Name                           State
----------------------------------------------------


root@openstack:/etc/init.d# nova list --all-tenants
Please enter password for encrypted keyring:
+--------------------------------------+----------+--------+--------------------+
| ID                                   | Name     | Status | Networks           |
+--------------------------------------+----------+--------+--------------------+
| 4bd1426d-4972-4176-8a72-bfccd3c9035b | asdasdad | ACTIVE | lan1=192.168.102.4 |
+--------------------------------------+----------+--------+--------------------+


root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
Please enter password for encrypted keyring:
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                    |
| OS-EXT-SRV-ATTR:host                | esx1                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | esx1.lab.elconas.de                                       |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-STS:task_state               | None                                                      |
| OS-EXT-STS:vm_state                 | active                                                    |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| config_drive                        |                                                           |
| created                             | 2013-02-23T15:15:57Z                                      |
| flavor                              | m1.tiny (6)                                               |
| hostId                              | 8a48ee7c074e396d726b6da80eeacb9a7bae4bf41450e5b772cf9ff0  |
| id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
| image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
| key_name                            | rheinzmann                                                |
| lan1 network                        | 192.168.102.4                                             |
| metadata                            | {}                                                        |
| name                                | asdasdad                                                  |
| progress                            | 0                                                         |
| security_groups                     | [{u'name': u'default'}]                                   |
| status                              | ACTIVE                                                    |
| tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
| updated                             | 2013-02-23T15:28:47Z                                      |
| user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
+-------------------------------------+-----------------------------------------------------------+


root@openstack:/etc/init.d# nova live-migration 4bd1426d-4972-4176-8a72-bfccd3c9035b esx2
Please enter password for encrypted keyring:
root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
Please enter password for encrypted keyring:
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                    |
| OS-EXT-SRV-ATTR:host                | esx2                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | esx2.lab.elconas.de                                       |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-STS:task_state               | None                                                      |
| OS-EXT-STS:vm_state                 | active                                                    |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| config_drive                        |                                                           |
| created                             | 2013-02-23T15:15:57Z                                      |
| flavor                              | m1.tiny (6)                                               |
| hostId                              | cd6040c927390b663ce86a21e25f48228cd8dd084d05a2e826f5ac0e  |
| id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
| image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
| key_name                            | rheinzmann                                                |
| lan1 network                        | 192.168.102.4                                             |
| metadata                            | {}                                                        |
| name                                | asdasdad                                                  |
| progress                            | 0                                                         |
| security_groups                     | [{u'name': u'default'}]                                   |
| status                              | ACTIVE                                                    |
| tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
| updated                             | 2013-02-23T15:45:21Z                                      |
| user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
+-------------------------------------+-----------------------------------------------------------+

root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx1:~# virsh list
 Id    Name                           State
----------------------------------------------------
 2     quantum                        running


 
root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [4] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx2:~# virsh list
 Id    Name                           State
----------------------------------------------------
 3     instance-0000005b              running

root@esx2:~#

=> Terminate the instance in the GUI ...

=> Instance terminated and deleted

=> Volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 becomes available ...

Try to delete volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 in the GUI ...

Now the "dd" takes place to wipe the volume.

After the "dd", the deletion ends in "error_deleting".

root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7

root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot

root@esx2:~# cinder list --all-tenants
+--------------------------------------+----------------+--------------+------+-------------+-------------+
|                  ID                  |     Status     | Display Name | Size | Volume Type | Attached to |
+--------------------------------------+----------------+--------------+------+-------------+-------------+
| 332e4755-070c-45b4-b4b4-408c3fe6609b |   available    |    mytest    |  5   |     None    |             |
| df9416af-0b35-4acf-af1b-3c2593757b67 | error_deleting |              |  5   |     None    |             |
| ff447b2f-6910-4e35-a3c0-78bffe6f35d7 | error_deleting |              |  5   |     None    |             |
+--------------------------------------+----------------+--------------+------+-------------+-------------+


On Cinder: 

root@esx2:~# tgtadm --mode target --op show
...
Target 2: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 6
            Initiator: iqn.2010-09.org.etherboot:openstack164
            Connection: 0
                IP Address: 172.16.0.120
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00020000
            SCSI SN: beaf20
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET     00020001
            SCSI SN: beaf21
            Size: 5369 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
            Backing store flags:
    Account information:
    ACL information:
        ALL
	
Leftover session:
	
root@esx2:~# tgtadm --lld iscsi --mode conn --op show --tid 2
Session: 6
    Connection: 0
        Initiator: iqn.2010-09.org.etherboot:openstack164
        IP Address: 172.16.0.120
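
The session id needed for the `--sid` argument of a forced cleanup can be read from this output. Below is a minimal sketch of such a parser; `parse_tgt_connections` is a hypothetical helper, and it assumes the indented `Session` / `Connection` / `Initiator` / `IP Address` layout printed above rather than any documented, stable format.

```python
def parse_tgt_connections(output):
    """Parse `tgtadm --lld iscsi --mode conn --op show --tid N` output.

    Returns a list of sessions, each with its id and its connections.
    """
    sessions = []
    for raw in output.splitlines():
        # Split at the first colon only: initiator IQNs contain colons too.
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "Session":
            sessions.append({"sid": int(value), "connections": []})
        elif key == "Connection" and sessions:
            sessions[-1]["connections"].append({"cid": int(value)})
        elif key in ("Initiator", "IP Address"):
            # Attach the field to the most recent connection, if there is one.
            if sessions and sessions[-1]["connections"]:
                sessions[-1]["connections"][-1][key] = value
    return sessions


if __name__ == "__main__":
    sample = (
        "Session: 6\n"
        "    Connection: 0\n"
        "        Initiator: iqn.2010-09.org.etherboot:openstack164\n"
        "        IP Address: 172.16.0.120\n"
    )
    for session in parse_tgt_connections(sample):
        for conn in session["connections"]:
            print(session["sid"], conn["cid"], conn["Initiator"], conn["IP Address"])
```

An empty result would mean that no initiator is connected to the target any longer, in which case the target can be deleted without forcing anything.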

Delete the session on the source host, then remove the target and the logical volume manually:

root@esx1:~# iscsiadm -m session -r 8 -u
Logging out of session [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260]
Logout of [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260] successful.

root@esx2:~# tgtadm --mode target --op delete --tid 2

root@esx2:~# lvremove /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
File descriptor 3 (/usr/share/bash-completion/completions) leaked on lvremove invocation. Parent PID 11376: -bash
Do you really want to remove active logical volume volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7? [y/n]: y
  Logical volume "volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7" successfully removed

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: cinder iscsi tgtd
-- 
Summary: Undeletable volumes after live migration (iSCSI)
https://bugs.launchpad.net/bugs/1132146
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).