yahoo-eng-team team mailing list archive
Message #04742
[Bug 1132146] Re: Summary: Undeletable volumes after live migration (iSCSI)
** Changed in: nova
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1132146
Title:
Summary: Undeletable volumes after live migration (iSCSI)
Status in OpenStack Compute (Nova):
Fix Released
Bug description:
Summary:
The iSCSI client session is not removed after live migration, leading to undeletable volumes (error_deleting)
Description:
When using Cinder with iSCSI as the EBS backend and KVM as the hypervisor, live-migrating a machine and then terminating it leads to undeletable volumes
(volumes stuck in the "error_deleting" state)
Problem:
After the live migration, the iSCSI session from the source host to the Cinder storage is not removed. When the instance is deleted, the volume is cleared (dd) and then the iSCSI session on the host the instance is currently running on is removed. However, the iSCSI session on the migration's source host is not cleared, so the target on Cinder (tgtd) cannot be removed because it is still "in use".
Solution:
a) Force a logout on the Cinder side with something like the following before deleting the volume:
tgtadm --lld iscsi --mode target --op unbind --tid=X -I ALL
tgtadm --lld iscsi --mode conn --op delete --tid=X --sid Y --cid 0
tgtadm --op delete --mode logicalunit --tid=X --lun 1
tgtadm --lld iscsi --mode target --op delete --tid=X
b) After the migration has succeeded, log out of the iSCSI target on the source host
b) is probably easier and cleaner.
Doing a) and b) together should be very solid and error-proof.
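Option b) can be sketched in Python, the language Nova is written in. This is only an illustration and not the fix that was committed to Nova: it parses `iscsiadm -m session` output in the format shown under Debug Info and builds the `iscsiadm -m session -r <sid> -u` logout call for every Cinder volume session that no local guest still needs. The helper names (`parse_sessions`, `stale_logout_commands`) and the `in_use_iqns` argument are hypothetical, and the volume IQN prefix is simply the one seen in this report.

```python
import re

# Matches lines such as:
#   tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff44...
SESSION_RE = re.compile(r"^\s*(\w+): \[(\d+)\] (\S+?),(\d+) (\S+)")

# Assumption: Cinder volume targets use this IQN prefix, as in this report.
VOLUME_PREFIX = "iqn.2010-10.org.openstack:volume-"


def parse_sessions(output):
    """Return (sid, portal, iqn) tuples parsed from `iscsiadm -m session`."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_RE.match(line)
        if match:
            _transport, sid, portal, _tpgt, iqn = match.groups()
            sessions.append((int(sid), portal, iqn))
    return sessions


def stale_logout_commands(output, in_use_iqns, prefix=VOLUME_PREFIX):
    """Build logout commands for volume sessions no local guest uses.

    Sessions outside the Cinder volume prefix (for example the host's own
    boot LUN) are never touched.
    """
    commands = []
    for sid, _portal, iqn in parse_sessions(output):
        if iqn.startswith(prefix) and iqn not in in_use_iqns:
            commands.append(["iscsiadm", "-m", "session", "-r", str(sid), "-u"])
    return commands


if __name__ == "__main__":
    sample = (
        "tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot\n"
        "tcp: [8] 172.16.0.130:3260,1 "
        "iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7\n"
    )
    # After the migration no guest on the source host uses the volume,
    # so the set of in-use IQNs is empty.
    for command in stale_logout_commands(sample, in_use_iqns=set()):
        print(" ".join(command))
```

The sketch only prints the commands; actually running them (and deciding when a session is safe to drop) is left to the operator.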
How to Reproduce:
1) Configure a 2-node cluster with EBS disk (Cinder)
2) Create a VM based on a SAN boot volume
3) Live-migrate the machine from HOST1 to HOST2
4) Terminate the instance
5) Delete the volume of the instance
6) The volume is now stuck in an undeletable state ("error_deleting")
Debug Info:
------------
Instance in Question: instance-0000005b
Hypervisor Hosts: ESX1 / ESX2 (running KVM, not ESX - naming fun :))
root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx1:~# virsh list
Id Name State
----------------------------------------------------
2 quantum running
8 instance-0000005b running
root@esx1:~# virsh dumpxml instance-0000005b
<domain type='kvm' id='8'>
  <name>instance-0000005b</name>
  <uuid>4bd1426d-4972-4176-8a72-bfccd3c9035b</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-1.2'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/disk/by-path/ip-172.16.0.130:3260-iscsi-iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7-lun-1'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:32:59:24'/>
      <source bridge='qbr6b2a850f-46'/>
      <target dev='vnet3'/>
      <model type='virtio'/>
      <filterref filter='nova-instance-instance-0000005b-fa163e325924'>
        <parameter name='DHCPSERVER' value='192.168.102.3'/>
        <parameter name='IP' value='192.168.102.4'/>
        <parameter name='PROJMASK' value='255.255.255.0'/>
        <parameter name='PROJNET' value='192.168.102.0'/>
      </filterref>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='file'>
      <source path='/var/lib/nova/instances/instance-0000005b/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5901' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</label>
    <imagelabel>libvirt-4bd1426d-4972-4176-8a72-bfccd3c9035b</imagelabel>
  </seclabel>
</domain>
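The disk source path in the domain XML above encodes both the portal and the target IQN of the attached volume, so it can be used to tell which iSCSI sessions a host still needs. A minimal Python sketch, assuming the `/dev/disk/by-path/ip-<portal>-iscsi-<iqn>-lun-<n>` layout seen in this dump; `iscsi_disks` is a hypothetical helper name and not part of libvirt or Nova:

```python
import re
import xml.etree.ElementTree as ET

# Assumption: udev by-path names follow the layout seen in this report:
#   /dev/disk/by-path/ip-<portal>-iscsi-<iqn>-lun-<n>
BY_PATH_RE = re.compile(
    r"^/dev/disk/by-path/ip-(?P<portal>[^/]+?)-iscsi-(?P<iqn>.+)-lun-(?P<lun>\d+)$"
)


def iscsi_disks(domain_xml):
    """Return (portal, iqn, lun) for each iSCSI-backed disk in a domain XML."""
    root = ET.fromstring(domain_xml)
    disks = []
    for source in root.findall("./devices/disk/source"):
        # File-backed disks have no 'dev' attribute and are skipped.
        match = BY_PATH_RE.match(source.get("dev", ""))
        if match:
            disks.append(
                (match.group("portal"), match.group("iqn"), int(match.group("lun")))
            )
    return disks


if __name__ == "__main__":
    sample = (
        "<domain type='kvm'><devices><disk type='block' device='disk'>"
        "<source dev='/dev/disk/by-path/ip-172.16.0.130:3260-iscsi-"
        "iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7-lun-1'/>"
        "</disk></devices></domain>"
    )
    for portal, iqn, lun in iscsi_disks(sample):
        print(portal, iqn, lun)
```

Feeding it the output of `virsh dumpxml` for every running domain would give the set of IQNs that are still in use on a host.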
root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
root@esx2:~# virsh list
Id Name State
----------------------------------------------------
root@openstack:/etc/init.d# nova list --all-tenants
Please enter password for encrypted keyring:
+--------------------------------------+----------+--------+--------------------+
| ID                                   | Name     | Status | Networks           |
+--------------------------------------+----------+--------+--------------------+
| 4bd1426d-4972-4176-8a72-bfccd3c9035b | asdasdad | ACTIVE | lan1=192.168.102.4 |
+--------------------------------------+----------+--------+--------------------+
root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
Please enter password for encrypted keyring:
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                    |
| OS-EXT-SRV-ATTR:host                | esx1                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | esx1.lab.elconas.de                                       |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-STS:task_state               | None                                                      |
| OS-EXT-STS:vm_state                 | active                                                    |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| config_drive                        |                                                           |
| created                             | 2013-02-23T15:15:57Z                                      |
| flavor                              | m1.tiny (6)                                               |
| hostId                              | 8a48ee7c074e396d726b6da80eeacb9a7bae4bf41450e5b772cf9ff0  |
| id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
| image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
| key_name                            | rheinzmann                                                |
| lan1 network                        | 192.168.102.4                                             |
| metadata                            | {}                                                        |
| name                                | asdasdad                                                  |
| progress                            | 0                                                         |
| security_groups                     | [{u'name': u'default'}]                                   |
| status                              | ACTIVE                                                    |
| tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
| updated                             | 2013-02-23T15:28:47Z                                      |
| user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
+-------------------------------------+-----------------------------------------------------------+
root@openstack:/etc/init.d# nova live-migration 4bd1426d-4972-4176-8a72-bfccd3c9035b esx2
Please enter password for encrypted keyring:
root@openstack:/etc/init.d# nova show 4bd1426d-4972-4176-8a72-bfccd3c9035b
Please enter password for encrypted keyring:
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                    |
| OS-EXT-SRV-ATTR:host                | esx2                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | esx2.lab.elconas.de                                       |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000005b                                         |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-STS:task_state               | None                                                      |
| OS-EXT-STS:vm_state                 | active                                                    |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| config_drive                        |                                                           |
| created                             | 2013-02-23T15:15:57Z                                      |
| flavor                              | m1.tiny (6)                                               |
| hostId                              | cd6040c927390b663ce86a21e25f48228cd8dd084d05a2e826f5ac0e  |
| id                                  | 4bd1426d-4972-4176-8a72-bfccd3c9035b                      |
| image                               | Ubuntu-Image-12.04 (cf59575f-189f-4275-9e66-1cc39efb47e4) |
| key_name                            | rheinzmann                                                |
| lan1 network                        | 192.168.102.4                                             |
| metadata                            | {}                                                        |
| name                                | asdasdad                                                  |
| progress                            | 0                                                         |
| security_groups                     | [{u'name': u'default'}]                                   |
| status                              | ACTIVE                                                    |
| tenant_id                           | 62515dbc834241d8ab5d58ed7ea50f6b                          |
| updated                             | 2013-02-23T15:45:21Z                                      |
| user_id                             | b193e3443cd94f41bac8938f4da5a9d0                          |
+-------------------------------------+-----------------------------------------------------------+
root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx1:~# virsh list
Id Name State
----------------------------------------------------
2 quantum running
root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [4] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx2:~# virsh list
Id Name State
----------------------------------------------------
3 instance-0000005b running
root@esx2:~#
=> Terminate the instance in the GUI ...
=> Instance terminated and deleted
=> Volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 becomes "available" ...
Try to delete volume ff447b2f-6910-4e35-a3c0-78bffe6f35d7 in the GUI ...
Now the "dd" runs to wipe the volume
After the "dd", the deletion ends in "error_deleting"
root@esx1:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
tcp: [8] 172.16.0.130:3260,1 iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
root@esx2:~# iscsiadm -m session
tcp: [1] 172.16.0.110:3260,2 iqn.1986-03.com.sun:nexentaboot
root@esx2:~# cinder list --all-tenants
+--------------------------------------+----------------+--------------+------+-------------+-------------+
| ID                                   | Status         | Display Name | Size | Volume Type | Attached to |
+--------------------------------------+----------------+--------------+------+-------------+-------------+
| 332e4755-070c-45b4-b4b4-408c3fe6609b | available      | mytest       | 5    | None        |             |
| df9416af-0b35-4acf-af1b-3c2593757b67 | error_deleting |              | 5    | None        |             |
| ff447b2f-6910-4e35-a3c0-78bffe6f35d7 | error_deleting |              | 5    | None        |             |
+--------------------------------------+----------------+--------------+------+-------------+-------------+
On Cinder:
root@esx2:~# tgtadm --mode target --op show
...
Target 2: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 6
            Initiator: iqn.2010-09.org.etherboot:openstack164
            Connection: 0
                IP Address: 172.16.0.120
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET 00020000
            SCSI SN: beaf20
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET 00020001
            SCSI SN: beaf21
            Size: 5369 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
            Backing store flags:
    Account information:
    ACL information:
        ALL
Leftover session:
root@esx2:~# tgtadm --lld iscsi --mode conn --op show --tid 2
Session: 6
    Connection: 0
        Initiator: iqn.2010-09.org.etherboot:openstack164
        IP Address: 172.16.0.120
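The leftover connection listed above is what option a) would have to tear down on the target side. As a rough Python sketch (not Cinder's actual code), the `conn --op show` output can be parsed and turned into the four `tgtadm` calls listed under Solution a), here with `--lld iscsi` spelled out on every call. The function names `parse_connections` and `cleanup_commands` and the default `lun=1` are assumptions made for this illustration.

```python
def parse_connections(output):
    """Parse `tgtadm --lld iscsi --mode conn --op show --tid N` output."""
    connections = []
    sid = None
    for raw in output.splitlines():
        # Split on the first colon only: IQN values contain colons themselves.
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        if key == "Session":
            sid = int(value)
        elif key == "Connection" and sid is not None:
            connections.append({"sid": sid, "cid": int(value)})
        elif key == "Initiator" and connections:
            connections[-1]["initiator"] = value
        elif key == "IP Address" and connections:
            connections[-1]["ip"] = value
    return connections


def cleanup_commands(tid, connections, lun=1):
    """Build the forced-cleanup tgtadm calls from Solution a) as argv lists."""
    base = ["tgtadm", "--lld", "iscsi"]
    tid = str(tid)
    # 1. Stop new logins by unbinding all initiators from the target.
    commands = [base + ["--mode", "target", "--op", "unbind", "--tid", tid, "-I", "ALL"]]
    # 2. Drop every connection that is still open.
    for conn in connections:
        commands.append(
            base + ["--mode", "conn", "--op", "delete", "--tid", tid,
                    "--sid", str(conn["sid"]), "--cid", str(conn["cid"])]
        )
    # 3. Remove the LUN, then the target itself.
    commands.append(base + ["--mode", "logicalunit", "--op", "delete", "--tid", tid, "--lun", str(lun)])
    commands.append(base + ["--mode", "target", "--op", "delete", "--tid", tid])
    return commands


if __name__ == "__main__":
    sample = (
        "Session: 6\n"
        "    Connection: 0\n"
        "        Initiator: iqn.2010-09.org.etherboot:openstack164\n"
        "        IP Address: 172.16.0.120\n"
    )
    for command in cleanup_commands(2, parse_connections(sample)):
        print(" ".join(command))
```

Forcing the connection closed on the target does not remove the stale session on the initiator, which is why the report suggests combining this with the logout on the source host.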
Delete the session:
root@esx1:~# iscsiadm -m session -r 8 -u
Logging out of session [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260]
Logout of [sid: 8, target: iqn.2010-10.org.openstack:volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7, portal: 172.16.0.130,3260] successful.
root@esx2:~# tgtadm --mode target --op delete --tid 2
root@esx2:~# lvremove /dev/cinder-volumes/volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7
File descriptor 3 (/usr/share/bash-completion/completions) leaked on lvremove invocation. Parent PID 11376: -bash
Do you really want to remove active logical volume volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7? [y/n]: y
Logical volume "volume-ff447b2f-6910-4e35-a3c0-78bffe6f35d7" successfully removed
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1132146/+subscriptions