openstack team mailing list archive
Message #15802
[nova] Disk attachment consistency
Hey Everyone,
Overview
--------
One of the things we are striving for in nova is interface consistency; that is, we'd like someone to be able to use an openstack cloud without knowing or caring which hypervisor is running underneath. There is a nasty bit of inconsistency in the way disks are hot-attached to vms that shows through to the user. I've been debating ways to minimize this, and I have some issues I need feedback on.
Background
----------
There are three issues contributing to the bad user experience of attaching volumes.
1) The api we present for attaching a volume to an instance has a parameter called device. This is presented as where to attach the disk in the guest.
2) Xen picks minor device numbers on the host hypervisor side, and the guest driver follows its instructions.
3) KVM picks minor device numbers on the guest driver side and doesn't expose them to the host hypervisor side.
Resulting Issues
----------------
a) The device name only makes sense for linux. FreeBSD will select different device names, and windows doesn't even use device names. In addition, xen uses /dev/xvda while kvm uses /dev/vda.
b) The device sent in kvm will not match where it actually shows up. We can consistently guess where it will show up if the guest kernel is >= 3.2; otherwise we are likely to be wrong, and it may change on a reboot anyway.
Long term solutions
-------------------
Long term, we probably shouldn't expose a device path at all; a device number would make more sense. That is probably the right change eventually, but in the short term we need to make the device name make sense somehow. I want to delay the long-term discussion until after the summit and come up with something that works now with our existing parameters and usage.
The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen, and for kvm with a guest kernel >= 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement, and it is only a very minor change to an extension api (making a parameter optional and returning the generated value of the parameter).
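As a rough sketch of what "automatically generate a valid device name" could look like (the helper name and prefix handling here are mine, not the code in the review):

```python
# Hypothetical sketch: pick the first unused name in the
# /dev/vda, /dev/vdb, ... sequence for an instance's attached disks.
# Illustrative only, not nova's actual implementation.

import string

def next_device_name(existing, prefix="/dev/vd"):
    """Return the first device name not already attached to the instance.

    `existing` is the set of device names currently in use.
    """
    for letter in string.ascii_lowercase:
        candidate = prefix + letter
        if candidate not in existing:
            return candidate
    raise ValueError("no free device names left")

# With /dev/vda and /dev/vdb in use, the next attach lands at /dev/vdc.
print(next_device_name({"/dev/vda", "/dev/vdb"}))  # /dev/vdc
```

The same sketch works for a xen cloud by passing prefix="/dev/xvd".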
(review at https://review.openstack.org/#/c/10908/)
The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.
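For context, the libvirt disk element this relies on just needs a <serial> child matching the device name; a minimal sketch of building one (illustrative only, not the XML nova actually generates):

```python
# Sketch of a libvirt disk definition whose <serial> mirrors the requested
# device name, so the guest sees the volume at
# /dev/disk/by-id/virtio-<serial> no matter which /dev/vdX the kernel picks.
# The function and its arguments are hypothetical.

def disk_xml(source_path, target_dev):
    serial = target_dev  # e.g. "vdb" for a requested /dev/vdb
    return (
        "<disk type='block' device='disk'>"
        "<driver name='qemu' type='raw'/>"
        "<source dev='%s'/>"
        "<target dev='%s' bus='virtio'/>"
        "<serial>%s</serial>"
        "</disk>"
    ) % (source_path, target_dev, serial)

print(disk_xml("/dev/mapper/vol-0001", "vdb"))
```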
(review coming soon)
First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it, but it might be nice to have the api return it directly. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud?
I see two options:
a) automatically convert it to the right value and return it
b) fail with an error message
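Option a) amounts to a simple prefix swap; something like this (hypothetical helper, names are mine):

```python
# Sketch of option a): silently convert the device prefix to match the
# hypervisor, e.g. /dev/xvda -> /dev/vda on a kvm cloud. Illustrative only.

def normalize_device(device, hypervisor):
    prefixes = {"kvm": "/dev/vd", "xen": "/dev/xvd"}
    want = prefixes[hypervisor]
    for have in prefixes.values():
        if device.startswith(have):
            # keep the trailing letter(s), swap the prefix
            return want + device[len(have):]
    raise ValueError("unrecognized device name: %s" % device)

print(normalize_device("/dev/xvda", "kvm"))  # /dev/vda
print(normalize_device("/dev/vdb", "xen"))   # /dev/xvdb
```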
Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work? For example, the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
a) let the attach go through as is.
advantages: it will allow scripts to work without having to manually find the next device.
disadvantages: the device name will never be correct in the guest
b) automatically modify the request to attach at /dev/vdc and return it
advantages: the device name will be correct some of the time (kvm guests with newer kernels)
disadvantages: sometimes the name is wrong anyway. The user may not expect the device number to change
c) fail with an error saying the next disk must be attached at /dev/vdc:
advantages: explicit
disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)
The second proposal earlier will at least give us a consistent name to find the volume in all these cases, although option b) means we have to check the return value to find out what that consistent location is, just as we do when we don't pass in a device.
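Inside the guest, finding the volume through that consistent location is just a symlink resolution; a sketch (the helper name is hypothetical):

```python
# Sketch: resolve the stable /dev/disk/by-id/virtio-<serial> symlink to
# whatever /dev/vdX the guest kernel actually assigned. Illustrative only.

import os

def resolve_by_serial(serial, by_id_dir="/dev/disk/by-id"):
    link = os.path.join(by_id_dir, "virtio-" + serial)
    return os.path.realpath(link)  # e.g. /dev/vdc even if vdb was requested
```

So a guest script can always open resolve_by_serial("vdb") regardless of where the device really landed.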
I hope everything is clear, but if more explanation is needed please let me know. If anyone has alternative or better proposals, please tell me. I think the last question is the most important one.
Vish