yahoo-eng-team team mailing list archive
Message #89301
[Bug 1981814] [NEW] swap_volume: Maybe happen IO error or lose user data if the task failed
Public bug reported:
Description
===========
The swap_volume task is a common and important operation for instances with
in-use volumes. In nova, the whole process consists of 3 steps:
* first: connect the new volume to the libvirt guest (the instance is still using the old volume);
* second: copy or rebase the old volume's data to the new volume (the instance is still using the old volume);
* third: update the volume states in cinder and the block_device_mapping in nova
(the instance is now using the new volume);
But the exception handler is too simple: a roll-back is executed if
an exception happens in any step, and the volume the instance is actually using is ignored.
The roll-back operation disconnects the new volume and deletes the new attachment.
Clearly, if an exception is raised in the third step, we can't roll back and should
instead continue to complete the task, provided the exception is not fatal. Otherwise, Input/Output
errors will occur when the user reads or writes the disk, and user data may be lost if
it was written to the new volume before the roll-back.
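The handling proposed above can be sketched roughly as follows. This is not nova's actual code: FakeGuest, swap_volume's signature, the copy_data and finalize callbacks, and the returned status strings are all hypothetical names used only for illustration. For simplicity the sketch also treats every exception raised after the switch to the new volume as non-fatal, which a real fix would need to refine.

```python
class FakeGuest:
    """Hypothetical stand-in for the libvirt guest (not a nova class)."""

    def __init__(self, old_volume):
        self.active_volume = old_volume   # volume the instance is really using
        self.connected = {old_volume}     # volumes currently connected


def swap_volume(guest, old_volume, new_volume, copy_data, finalize):
    """Illustrative three-step swap with a state-aware exception handler."""
    try:
        # Step 1: connect the new volume (instance still uses the old one).
        guest.connected.add(new_volume)
        # Step 2: copy or rebase the data, then switch the guest over.
        copy_data()
        guest.active_volume = new_volume
        # Step 3: update cinder and block_device_mapping, drop the old volume.
        finalize()
        guest.connected.discard(old_volume)
        return "swapped"
    except Exception:
        if guest.active_volume == new_volume:
            # The guest already writes to the new volume: rolling back here
            # would disconnect the disk in use, so keep it and finish later.
            return "swapped_cleanup_pending"
        # The guest still uses the old volume: rolling back is safe.
        guest.connected.discard(new_volume)
        return "rolled_back"
```

The point of the sketch is that the handler decides based on which volume the guest is actually using, rather than rolling back unconditionally.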
Steps to reproduce
==================
1. Create an instance and attach an available volume to it:
$ openstack server create my-vm --flavor m1.medium --image <vm-image> --network <vm-network>
$ openstack volume create my-vol --type <type-1> --size 100
$ openstack server add volume my-vm my-vol
2. Log in to my-vm, create a file system on /dev/vdc and mount it, then run read/write I/O on it:
$ mkfs.ext4 /dev/vdc
$ mount /dev/vdc /mnt
$ touch /mnt/test
$ fio -rw=randrw -ioengine=libaio -bs=4K -size=20G -filename=/mnt/test ...
3. Retype the volume:
$ openstack volume set my-vol --type <type-2> --retype-policy on-demand
4. Some fault causes nova to fail to disconnect the old volume in the third step, after the
second step has finished successfully, and the task ultimately fails.
5. fio can no longer read or write the file /mnt/test.
Expected result
===============
After the exception in step 4, the disk should still be readable and writable.
Actual result
=============
As in step 5, the user can't read from or write to the disk.
Environment
===========
1. nova version: 22.0.1
2. hypervisor: Libvirt + QEMU
3. storage: Ceph, FC-SAN, LVM
4. network: Neutron + OVS
Logs & Configs
==============
** Affects: nova
Importance: Undecided
Status: Confirmed
** Changed in: nova
Status: New => Confirmed
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1981814
Title:
swap_volume: Maybe happen IO error or lose user data if the task
failed
Status in OpenStack Compute (nova):
Confirmed
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1981814/+subscriptions