yahoo-eng-team team mailing list archive

[Bug 1132146] Re: Summary: Undeletable volumes after live migration (iSCSI)

 

** Changed in: nova
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1132146

Title:
  Summary: Undeletable volumes after live migration (iSCSI)

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  
  Summary: 
    The iSCSI client session is not removed after live migration, leaving volumes undeletable (error_deleting).

  Description: 
    When using Cinder with iSCSI as the EBS backend and KVM as the hypervisor, live-migrating a machine and then terminating it leads to undeletable volumes
    (volumes stuck in the "error_deleting" state).
    
  Problem: 
    After the live migration, the iSCSI session from the source host to the Cinder storage is not removed. When the instance is deleted, the volume is cleared (dd) and the iSCSI session on the host where the instance is currently running is removed. However, the iSCSI session on the source host of the migration is not cleared, so the target on Cinder (TGTD) cannot be removed because it is still "in use".

  Solution: 
    a) Force a logout on the Cinder side with something like the following before deleting the volume:

       tgtadm --lld iscsi --mode target --op unbind --tid=X -I ALL
       tgtadm --lld iscsi --mode conn --op delete --tid=X --sid Y --cid 0
       tgtadm --op delete --mode logicalunit --tid=X --lun 1
       tgtadm --lld iscsi --mode target --op delete --tid=X
  	 
    b) After the migration has succeeded, log out of the iSCSI target on the source host.

  b) is probably easier and cleaner.

  Doing both a) and b) should be very robust and error-proof.
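
  Until a fix is in place, option b) could be approximated by hand on the migration source host. The following is only a sketch: it assumes the `iqn.2010-10.org.openstack:volume-` IQN prefix seen in this report and a libvirt whose `virsh list` supports `--name`. The function names are made up for illustration, and the last function prints the logout commands rather than running them.

```shell
#!/bin/sh
# Sketch: find iSCSI sessions for OpenStack volumes that no running
# libvirt domain on this host references any more.

# Reads `iscsiadm -m session` output on stdin and prints "<sid> <iqn>"
# for every target whose IQN starts with the given prefix.
# The default prefix is an assumption taken from the sessions in this report.
list_volume_sessions() {
    prefix="${1:-iqn.2010-10.org.openstack:volume-}"
    sed -n 's/^ *tcp: \[\([0-9]*\)\] [^ ]* \([^ ]*\).*$/\1 \2/p' |
        while read -r sid iqn; do
            case "$iqn" in
                "$prefix"*) printf '%s %s\n' "$sid" "$iqn" ;;
            esac
        done
}

# Returns 0 if the XML of any running domain mentions the given IQN.
iqn_in_use() {
    for dom in $(virsh list --name); do
        if virsh dumpxml "$dom" | grep -qF -- "$1"; then
            return 0
        fi
    done
    return 1
}

# Prints (does not execute) a logout command for each unused volume session.
print_stale_logouts() {
    iscsiadm -m session | list_volume_sessions | while read -r sid iqn; do
        iqn_in_use "$iqn" || echo "iscsiadm -m session -r $sid -u"
    done
}
```

  Calling `print_stale_logouts` as root on the source host lists the candidate logout commands, so they can be reviewed before anything is actually logged out.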
    
  How to Reproduce: 

  1) Configure a 2-node cluster with EBS disks (Cinder)
  2) Create a VM that boots from a SAN volume
  3) Live-migrate the machine from HOST1 to HOST2
  4) Terminate the instance
  5) Delete the volume of the instance
  6) The volume is now stuck in an undeletable state ("error_deleting")
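
  Steps 3) to 6) can also be driven from the command line instead of the GUI. The helper below is a hypothetical sketch that only prints the commands for a given server, destination host and volume; in a real run each step has to finish before the next one starts.

```shell
#!/bin/sh
# Sketch: print the CLI equivalent of reproduction steps 3) to 6).
# Nothing is executed; the function name and argument order are illustrative.
print_repro_steps() {
    server="$1"; dest="$2"; volume="$3"
    echo "nova live-migration $server $dest"   # step 3: migrate to the other host
    echo "nova delete $server"                 # step 4: terminate the instance
    echo "cinder delete $volume"               # step 5: delete its volume
    echo "cinder list --all-tenants"           # step 6: inspect the volume status
}
```

  For example, `print_repro_steps 4bd1426d-4972-4176-8a72-bfccd3c9035b esx2 ff447b2f-6910-4e35-a3c0-78bffe6f35d7` prints the sequence for the instance and volume used in the debug info.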

  
  Debug Info: 
  ------------

  Instance in Question: instance-0000005b 
  Hypervisor Hosts: ESX1 / ESX2 (running KVM, not ESX - naming fun :))

  root@esx1:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
  tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7

  root@esx1:~# virsh list
   Id    Name                           State
  ----------------------------------------------------
   2     quantum                        running
   8     instance-0000005b              running

   
  root@esx1:~# virsh dumpxml instance-0000005b
  <domain type='kvm' id='8'>
    <name>instance-0000005b</name>
    <uuid>4bd1426d-4972-4176-8a72-bfccd3c9035b</uuid>
    <memory unit='KiB'>524288</memory>
    <currentMemory unit='KiB'>524288</currentMemory>
    <vcpu placement='static'>1</vcpu>
    <os>
      <type arch='x86_64' machine='pc-1.2'>hvm</type>
      <boot dev='hd'/>
    </os>
    <features>
      <acpi/>
    </features>
    <cpu mode='host-model'>
      <model fallback='allow'/>
    </cpu>
    <clock offset='utc'>
      <timer name='pit' tickpolicy='delay'/>
      <timer name='rtc' tickpolicy='catchup'/>
    </clock>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>destroy</on_crash>
    <devices>
      <emulator>/usr/bin/kvm</emulator>
      <disk type='block' device='disk'>
        <driver name='qemu' type='raw' cache='none'/>
        <source dev='/dev/disk/by-path/ip-172.16.0.130:3260-iscsi-iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7-lun-1'/>
        <target dev='vda' bus='virtio'/>
        <alias name='virtio-disk0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
      </disk>
      <controller type='usb' index='0'>
        <alias name='usb0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
      </controller>
      <interface type='bridge'>
        <mac address='fa:16:3e:32:59:24'/>
        <source bridge='qbr6b2a850f-46'/>
        <target dev='vnet3'/>
        <model type='virtio'/>
        <filterref filter='nova-instance-instance-0000005b-fa163e325924'>
          <parameter name='DHCPSERVER' value='192.168.102.3'/>
          <parameter name='IP' value='192.168.102.4'/>
          <parameter name='PROJMASK' value='255.255.255.0'/>
          <parameter name='PROJNET' value='192.168.102.0'/>
        </filterref>
        <alias name='net0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
      </interface>
      <serial type='file'>
        <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
        <target port='0'/>
        <alias name='serial0'/>
      </serial>
      <serial type='pty'>
        <source path='/dev/pts/2'/>
        <target port='1'/>
        <alias name='serial1'/>
      </serial>
      <console type='file'>
        <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
        <target type='serial' port='0'/>
        <alias name='serial0'/>
      </console>
      <input type='tablet' bus='usb'>
        <alias name='input0'/>
      </input>
      <input type='mouse' bus='ps2'/>
      <graphics type='vnc' port='5901' autoport='yes' listen='0.0.0.0' keymap='en-us'>
        <listen type='address' address='0.0.0.0'/>
      </graphics>
      <video>
        <model type='cirrus' vram='9216' heads='1'/>
        <alias name='video0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
      </video>
      <memballoon model='virtio'>
        <alias name='balloon0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
      </memballoon>
    </devices>
    <seclabel type='dynamic' model='apparmor' relabel='yes'>
      <label>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</label>
      <imagelabel>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</imagelabel>
    </seclabel>
  </domain>


  root@esx2:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot

  root@esx2:~# virsh list
   Id    Name                           State
  ----------------------------------------------------

  
  root@openstack:/etc/init.d# nova list --all-tenants
  Please enter password for encrypted keyring:
  +--------------------------------------+----------+--------+--------------------+
  | ID                                   | Name     | Status | Networks           |
  +--------------------------------------+----------+--------+--------------------+
  | 4bd1426d-4972-4176-8a72-bfccd3c9035b | asdasdad | ACTIVE | lan1=192.168.102.4 |
  +--------------------------------------+----------+--------+--------------------+

  
  root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
  Please enter password for encrypted keyring:
  +-------------------------------------+-----------------------------------------------------------+
  | Property                            | Value                                                     |
  +-------------------------------------+-----------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                    |
  | OS-EXT-SRV-ATTR:host                | esx1                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | esx1.lab.elconas.de                                       |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
  | OS-EXT-STS:power_state              | 1                                                         |
  | OS-EXT-STS:task_state               | None                                                      |
  | OS-EXT-STS:vm_state                 | active                                                    |
  | accessIPv4                          |                                                           |
  | accessIPv6                          |                                                           |
  | config_drive                        |                                                           |
  | created                             | 2013-02-23T15:15:57Z                                      |
  | flavor                              | m1.tiny (6)                                               |
  | hostId                              | 8a48ee7c074e396d726b6da80eeacb9a7bae4bf41450e5b772cf9ff0  |
  | id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
  | image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
  | key_name                            | rheinzmann                                                |
  | lan1 network                        | 192.168.102.4                                             |
  | metadata                            | {}                                                        |
  | name                                | asdasdad                                                  |
  | progress                            | 0                                                         |
  | security_groups                     | [{u'name': u'default'}]                                   |
  | status                              | ACTIVE                                                    |
  | tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
  | updated                             | 2013-02-23T15:28:47Z                                      |
  | user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
  +-------------------------------------+-----------------------------------------------------------+

  
  root@openstack:/etc/init.d# nova live-migration 4bd1426d-4972-4176-8a72-bfccd3c9035b esx2
  Please enter password for encrypted keyring:
  root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
  Please enter password for encrypted keyring:
  +-------------------------------------+-----------------------------------------------------------+
  | Property                            | Value                                                     |
  +-------------------------------------+-----------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                    |
  | OS-EXT-SRV-ATTR:host                | esx2                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | esx2.lab.elconas.de                                       |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
  | OS-EXT-STS:power_state              | 1                                                         |
  | OS-EXT-STS:task_state               | None                                                      |
  | OS-EXT-STS:vm_state                 | active                                                    |
  | accessIPv4                          |                                                           |
  | accessIPv6                          |                                                           |
  | config_drive                        |                                                           |
  | created                             | 2013-02-23T15:15:57Z                                      |
  | flavor                              | m1.tiny (6)                                               |
  | hostId                              | cd6040c927390b663ce86a21e25f48228cd8dd084d05a2e826f5ac0e  |
  | id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
  | image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
  | key_name                            | rheinzmann                                                |
  | lan1 network                        | 192.168.102.4                                             |
  | metadata                            | {}                                                        |
  | name                                | asdasdad                                                  |
  | progress                            | 0                                                         |
  | security_groups                     | [{u'name': u'default'}]                                   |
  | status                              | ACTIVE                                                    |
  | tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
  | updated                             | 2013-02-23T15:45:21Z                                      |
  | user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
  +-------------------------------------+-----------------------------------------------------------+

  root@esx1:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
  tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
  root@esx1:~# virsh list
   Id    Name                           State
  ----------------------------------------------------
   2     quantum                        running

  
   
  root@esx2:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
  tcp: [4] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
  root@esx2:~# virsh list
   Id    Name                           State
  ----------------------------------------------------
   3     instance-0000005b              running

  root@esx2:~#

  => Terminate the instance in the GUI ...

  => Instance terminated and deleted

  => Volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 becomes "available" ...

  Try to delete volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 in the GUI ...

  Now the "dd" runs to wipe the volume.

  After the "dd", the deletion ends in "error_deleting".

  root@esx1:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
  tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7

  root@esx2:~# iscsiadm -m session
  tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot

  root@esx2:~# cinder list --all-tenants
  +--------------------------------------+----------------+--------------+------+-------------+-------------+
  |                  ID                  |     Status     | Display Name | Size | Volume Type | Attached to |
  +--------------------------------------+----------------+--------------+------+-------------+-------------+
  | 332e4755-070c-45b4-b4b4-408c3fe6609b |   available    |    mytest    |  5   |     None    |             |
  | df9416af-0b35-4acf-af1b-3c2593757b67 | error_deleting |              |  5   |     None    |             |
  | ff447b2f-6910-4e35-a3c0-78bffe6f35d7 | error_deleting |              |  5   |     None    |             |
  +--------------------------------------+----------------+--------------+------+-------------+-------------+
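
  To pick the affected volumes out of a listing like the one above, the table can be filtered on its Status column. This is a minimal sketch that assumes the column layout shown here (ID first, Status second); the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: read `cinder list` table output on stdin and print the IDs of
# volumes whose Status column is exactly "error_deleting".
list_error_deleting() {
    awk -F'|' 'NF > 3 {
        id = $2; status = $3
        gsub(/[ \t]/, "", id)        # strip table padding
        gsub(/[ \t]/, "", status)
        if (status == "error_deleting") print id
    }'
}
```

  It would be used as `cinder list --all-tenants | list_error_deleting`.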

  
  On Cinder: 

  root@esx2:~# tgtadm --mode target --op show
  ...
  Target 2: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
      System information:
          Driver: iscsi
          State: ready
      I_T nexus information:
          I_T nexus: 6
              Initiator: iqn.2010-09.org.etherboot:openstack164
              Connection: 0
                  IP Address: 172.16.0.120
      LUN information:
          LUN: 0
              Type: controller
              SCSI ID: IET     00020000
              SCSI SN: beaf20
              Size: 0 MB, Block size: 1
              Online: Yes
              Removable media: No
              Readonly: No
              Backing store type: null
              Backing store path: None
              Backing store flags:
          LUN: 1
              Type: disk
              SCSI ID: IET     00020001
              SCSI SN: beaf21
              Size: 5369 MB, Block size: 512
              Online: Yes
              Removable media: No
              Readonly: No
              Backing store type: rdwr
              Backing store path: /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
              Backing store flags:
      Account information:
      ACL information:
          ALL
  	
  Leftover session:
  	
  root@esx2:~# tgtadm --lld iscsi --mode conn --op show --tid 2
  Session: 6
      Connection: 0
          Initiator: iqn.2010-09.org.etherboot:openstack164
          IP Address: 172.16.0.120
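
  The session and connection IDs shown above are the values option a) needs for `--sid` and `--cid`. As a rough sketch, they can be extracted from the `conn --op show` output and turned into the tgtadm sequence from the Solution section. The function names are made up, LUN 1 is assumed as in this report, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: build the option a) command sequence for one target.

# Reads `tgtadm --lld iscsi --mode conn --op show --tid <tid>` output on
# stdin and prints one "<sid> <cid>" pair per connection.
list_target_connections() {
    awk '$1 == "Session:"    { sid = $2 }
         $1 == "Connection:" { print sid, $2 }'
}

# Reads "<sid> <cid>" pairs on stdin and prints the tgtadm commands that
# would unbind, disconnect and delete target <tid>. LUN 1 is an assumption.
print_force_release() {
    tid="$1"
    echo "tgtadm --lld iscsi --mode target --op unbind --tid=$tid -I ALL"
    while read -r sid cid; do
        echo "tgtadm --lld iscsi --mode conn --op delete --tid=$tid --sid $sid --cid $cid"
    done
    echo "tgtadm --lld iscsi --mode logicalunit --op delete --tid=$tid --lun 1"
    echo "tgtadm --lld iscsi --mode target --op delete --tid=$tid"
}
```

  For target 2 this would be run as `tgtadm --lld iscsi --mode conn --op show --tid 2 | list_target_connections | print_force_release 2`.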

  Delete the session:

  root@esx1:~# iscsiadm -m session -r 8 -u
  Logging out of session [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260]
  Logout of [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260] successful.

  root@esx2:~# tgtadm --mode target --op delete --tid 2

  root@esx2:~# lvremove /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
  File descriptor 3 (/usr/share/bash-completion/completions) leaked on lvremove invocation. Parent PID 11376: -bash
  Do you really want to remove active logical volume volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7? [y/n]: y
    Logical volume "volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7" successfully removed

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1132146/+subscriptions